POI-HMEF - Java API To Access Microsoft Transport Neutral Encoding Files (TNEF)

Overview

HMEF is the POI Project's pure Java implementation of Microsoft's TNEF (Transport Neutral Encapsulation Format), also known as winmail.dat, which is used by Outlook and Exchange in some situations.

Currently, HMEF provides a read-only API for accessing common message and attachment attributes, including the message body and attachment files. In addition, read-only access is available to all of the underlying TNEF and MAPI attributes of the message and attachments.

HMEF also provides a command line tool for extracting the message body and attachment files from a TNEF (winmail.dat) file.

Write support, both for saving changes and for creating new files, is currently unavailable. Anyone interested in working on these areas is advised to read the Contribution Guidelines and then join the dev list!

Note
This code currently lives in the scratchpad area of the POI SVN repository. To use this component, ensure you have the Scratchpad Jar on your classpath, or a dependency defined on the poi-scratchpad artifact - the main POI jar is not enough! See the POI Components Map for more details.

Using HMEF to access TNEF (winmail.dat) files

Easy extraction of message body and attachment files

The class org.apache.poi.hmef.extractor.HMEFContentsExtractor provides both command line and Java extraction. It saves the message body (an RTF file) and all of the attachment files to a single directory that you specify.

From the command line, simply call the class, specifying the TNEF file to extract and the directory to place the extracted files into, e.g.:

java -classpath poi-3.14.jar:poi-scratchpad-3.14.jar org.apache.poi.hmef.extractor.HMEFContentsExtractor winmail.dat /tmp/extracted/

From Java, there are two methods on the class: one extracts the message body RTF to a file, and the other extracts all the attachments to a directory. A typical use would be:

import java.io.File;
import java.io.FileNotFoundException;
import org.apache.poi.hmef.extractor.HMEFContentsExtractor;

public void extract(String winmailFilename, String directoryName) throws Exception {
    HMEFContentsExtractor ext = new HMEFContentsExtractor(new File(winmailFilename));

    File dir = new File(directoryName);
    File rtf = new File(dir, "message.rtf");
    // The output directory must already exist
    if (!dir.exists()) {
        throw new FileNotFoundException("Output directory " + dir.getName() + " not found");
    }

    System.out.println("Extracting...");
    ext.extractMessageBody(rtf);   // message body, saved as RTF
    ext.extractAttachments(dir);   // every attachment, saved into dir
    System.out.println("Extraction completed");
}

Attachment attributes and contents

To get at your attachments, simply call the getAttachments() method on an HMEFMessage instance, and you'll receive a list of all the attachments.

When you have an org.apache.poi.hmef.Attachment object, there are several helper methods available. Each returns the value of the corresponding underlying attachment attribute, or null if for some reason the attribute isn't present in your file.

  • getFilename() - returns the name of the attachment file, possibly in 8.3 format
  • getLongFilename() - returns the full name of the attachment file
  • getExtension() - returns the extension of the attachment file, including the "."
  • getModifiedDate() - returns the date on which the attachment file was last edited
  • getContents() - returns a byte array of the contents of the attached file
  • getRenderedMetaFile() - returns a byte array of a Windows Metafile representation of the attached file
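
Because any of these helpers can return null, and because the names come from a file you may not trust, code that saves attachments usually needs some defensive handling. The following is a minimal sketch that uses only the JDK and is not part of HMEF. The class AttachmentNames and its methods are hypothetical helpers; the strings passed in are assumed to be the values returned by getLongFilename(), getFilename() and getExtension().

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of POI: picks a safe output name for an attachment.
final class AttachmentNames {

    private AttachmentNames() {}

    /**
     * Prefers the long filename, falls back to the short (8.3) one, and
     * finally to "attachment" plus the extension, if there is one.
     * Any argument may be null, mirroring the Attachment helper methods.
     */
    static String choose(String longName, String shortName, String extension) {
        if (longName != null && !longName.trim().isEmpty()) {
            return longName.trim();
        }
        if (shortName != null && !shortName.trim().isEmpty()) {
            return shortName.trim();
        }
        return "attachment" + (extension == null ? "" : extension.trim());
    }

    /**
     * Resolves the name inside dir, discarding any directory components so
     * that a name such as "../../etc/passwd" cannot escape the output directory.
     */
    static Path resolveInside(Path dir, String name) {
        String normalized = name.replace('\\', '/');
        String leaf = normalized.substring(normalized.lastIndexOf('/') + 1);
        if (leaf.isEmpty() || leaf.equals(".") || leaf.equals("..")) {
            leaf = "attachment";
        }
        return dir.resolve(leaf);
    }

    public static void main(String[] args) {
        Path dir = Paths.get("extracted");
        // No long filename available, so the 8.3 name is used
        String name = choose(null, "REPORT~1.DOC", ".doc");
        System.out.println(resolveInside(dir, name));
    }
}
```

The resulting path could then be passed to java.nio.file.Files.write together with the byte array from getContents().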

Message attributes and message body

An org.apache.poi.hmef.HMEFMessage instance is created from an InputStream of the underlying TNEF (winmail.dat) file.
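
Files named winmail.dat are not always genuine TNEF, so it can be useful to check a stream before handing it to a parser. A TNEF stream begins with the 32-bit signature 0x223E9F78, stored in little-endian byte order. The sketch below uses only the JDK and is not part of HMEF; the class TnefSniffer and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of POI: checks for the TNEF signature.
final class TnefSniffer {

    // The signature 0x223E9F78 as it appears on disk (little-endian).
    private static final byte[] SIGNATURE = {0x78, (byte) 0x9F, 0x3E, 0x22};

    private TnefSniffer() {}

    /** Returns true if the array starts with the TNEF signature bytes. */
    static boolean hasSignature(byte[] header) {
        if (header == null || header.length < SIGNATURE.length) {
            return false;
        }
        for (int i = 0; i < SIGNATURE.length; i++) {
            if (header[i] != SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Peeks at the first four bytes of the stream and then rewinds it, so the
     * same stream can be parsed afterwards. Requires mark/reset support,
     * for example a BufferedInputStream wrapped around a FileInputStream.
     */
    static boolean looksLikeTnef(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] header = new byte[SIGNATURE.length];
        int total = 0;
        in.mark(header.length);
        try {
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    break; // stream ended before four bytes were read
                }
                total += n;
            }
        } finally {
            in.reset();
        }
        return total == header.length && hasSignature(header);
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = {0x78, (byte) 0x9F, 0x3E, 0x22, 0x01, 0x00};
        System.out.println(looksLikeTnef(new ByteArrayInputStream(sample)));
    }
}
```

If the check passes, the same stream, now rewound to its start, can be given to the HMEFMessage constructor.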

An HMEFMessage offers three main methods of interest:

  • getBody() - returns a String containing the RTF contents of the message body
  • getSubject() - returns the message subject
  • getAttachments() - returns the list of Attachment objects for the message

Low level attribute access

Both Messages and Attachments contain two kinds of attributes. These are TNEFAttribute and MAPIAttribute.

TNEFAttribute is specific to TNEF files in terms of the available types and properties. In general, Attachments have a few more useful ones of these than Messages.

MAPIAttributes hold standard MAPI properties and values, and work in a similar way to those in HSMF (Outlook). There are typically many of these on both Messages and Attachments. Note: see Limitations.

Both HMEFMessage and Attachment support two different ways of getting to attributes of interest. Firstly, they support list getters, which return all attributes (either TNEF or MAPI). Secondly, they support specific getters by TNEF or MAPI property.

HMEFMessage msg = new HMEFMessage(new FileInputStream(file));
for(TNEFAttribute attr : msg.getMessageAttributes()) {
    System.out.println("TNEF : " + attr);
}
for(MAPIAttribute attr : msg.getMessageMAPIAttributes()) {
    System.out.println("MAPI : " + attr);
}
System.out.println("Subject is " + msg.getMessageMAPIAttribute(MAPIProperty.CONVERSATION_TOPIC));
for(Attachment attach : msg.getAttachments()) {
    for(TNEFAttribute attr : attach.getAttributes()) {
        System.out.println("A.TNEF : " + attr);
    }
    for(MAPIAttribute attr : attach.getMAPIAttributes()) {
        System.out.println("A.MAPI : " + attr);
    }
    System.out.println("Filename is " + attach.getAttribute(TNEFProperty.ID_ATTACHTITLE));
    System.out.println("Extension is " + attach.getMAPIAttribute(MAPIProperty.ATTACH_EXTENSION));
}

Investigating a TNEF file

To get a feel for the contents of a file, and to track down where data of interest is stored, HMEF comes with HMEFDumper, which prints out the contents of the file.

Limitations

HMEF is currently a work in progress, and not everything works yet. The current limitations are:

  • Non-standard MAPI properties from the range 0x8000 to 0x8fff may not be converted into attributes entirely correctly. The values show up, but the name and type may not always be correct.
  • All testing so far has been performed on a small number of English documents. We think we're correctly turning bytes into Java Unicode strings, but we need a few non-English sample files in the test suite to verify this!
  • There is no support for saving changes, nor for creating new files

by Nick Burch

 