Apache POI - Component Overview

Apache POI Project Components

The Apache POI project is the master project for developing pure Java ports of file formats based on Microsoft's OLE 2 Compound Document Format. The OLE 2 Compound Document Format is used by Microsoft Office documents, as well as by programs that use MFC property sets to serialize their document objects.

Apache POI is also the master project for developing pure Java ports of file formats based on Office Open XML (OOXML). OOXML is part of an ECMA / ISO standardisation effort. The standard's documentation is quite large, but you can normally find the part you need without too much effort. The ECMA-376 standard is publicly available and is also covered by the Microsoft Open Specification Promise (OSP).
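
The two format families described above can usually be told apart by their first few bytes: an OLE 2 compound document starts with a fixed 8-byte signature, while an OOXML file is a ZIP package and starts with the ZIP local file header signature. The following is a minimal sketch of that check using only the JDK. The class name `ContainerSniffer` and its enum are hypothetical and are not part of the POI API, and a ZIP signature alone does not prove that a file is OOXML.

```java
import java.util.Arrays;

/** Illustrative helper, not part of Apache POI: guesses the container type from leading bytes. */
final class ContainerSniffer {

    enum Container { OLE2, ZIP_BASED, UNKNOWN }

    // Signature found at the start of an OLE 2 compound document.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature; OOXML packages are ZIP files.
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    /** Classifies the given leading bytes of a file. */
    static Container sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Container.OLE2;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Container.ZIP_BASED;
        }
        return Container.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) {
        byte[] sample = { 'P', 'K', 0x03, 0x04, 0x14, 0x00 };
        System.out.println(sniff(sample));
    }
}
```

A result of `OLE2` suggests the binary components (POIFS and the H* components), while `ZIP_BASED` suggests the OOXML components (the X* components), subject to the caveat above.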

POIFS for OLE 2 Documents

POIFS is the oldest and most stable part of POI. It is our port of the OLE 2 Compound Document Format to pure Java. It supports both read and write functionality. All of our components for the binary (non-XML) Microsoft Office formats ultimately rely on it by definition. Please see the POIFS project page for more information.

HSSF and XSSF for Excel Documents

HSSF is our port of the Microsoft Excel 97 (-2003) file format (BIFF8) to pure Java. XSSF is our port of the Microsoft Excel XML (2007+) file format (OOXML) to pure Java. SS is a package that provides common support for both formats through a shared API. Both HSSF and XSSF support reading and writing. Please see the HSSF+XSSF project page for more information.

HWPF and XWPF for Word Documents

HWPF is our port of the Microsoft Word 97 (-2003) file format to pure Java. It supports reading and has limited write capabilities. It also provides simple text extraction support for the older Word 6 and Word 95 formats. This component remains in the early stages of development, but it can already read and write simple files. Please see the HWPF project page for more information.

We are also working on XWPF for the WordprocessingML (2007+) format from the OOXML specification. It provides read and write support for simpler files, along with text extraction capabilities.

HSLF and XSLF for PowerPoint Documents

HSLF is our port of the Microsoft PowerPoint 97 (-2003) file format to pure Java. It supports read and write capabilities. Please see the HSLF project page for more information.

We are also working on XSLF for the PresentationML (2007+) format from the OOXML specification.

HPSF for OLE 2 Document Properties

HPSF is our port of the OLE 2 property set format to pure Java. Property sets are mostly used to store a document's properties (title, author, date of last modification, etc.), but they can be used for application-specific purposes as well.

HPSF supports both reading and writing of properties.

Please see the HPSF project page for more information.

HDGF and XDGF for Visio Documents

HDGF is our port of the Microsoft Visio 97 (-2003) file format to pure Java. It currently only supports reading at a very low level, and simple text extraction. Please see the HDGF / Diagram project page for more information.

XDGF is our port of the Microsoft Visio XML (.vsdx) file format to pure Java. It has slightly more support than HDGF. Please see the XDGF / Diagram project page for more information.

HPBF for Publisher Documents

HPBF is our port of the Microsoft Publisher 98 (-2007) file format to pure Java. It currently only supports reading at a low level for around half of the file parts, and simple text extraction. Please see the HPBF project page for more information.

HMEF for TNEF (winmail.dat) Outlook Attachments

HMEF is our port of the Microsoft TNEF (Transport Neutral Encoding Format) file format to pure Java. TNEF is sometimes used by Outlook to encode a message, and such a message typically arrives as winmail.dat. HMEF currently only supports reading at a low level, but we hope to add text and attachment extraction. Please see the HMEF project page for more information.

HSMF for Outlook Messages

HSMF is our port of the Microsoft Outlook message file format to pure Java. It currently only supports some of the textual content of MSG files, and some attachments. Further support and documentation are coming in slowly. For now, users are advised to consult the unit tests for example use. Please see the HSMF project page for more information.

Microsoft has recently added the Outlook file format to its OSP. More information is now available, which makes implementing this API an easier task.

Component Map

The Apache POI distribution supports many document file formats. This support is provided in several jar files, and not all of the jars are needed for every format. The following tables show the relationships between POI components, Maven artifactIds, and the project's jar files.

| Component | Application type | Maven artifactId | Notes |
|---|---|---|---|
| POIFS | OLE2 Filesystem | poi | Required to work with OLE2 / POIFS based files |
| HPSF | OLE2 Property Sets | poi | |
| HSSF | Excel XLS | poi | For HSSF only; if common SS is needed, see below |
| HSLF | PowerPoint PPT | poi-scratchpad | |
| HWPF | Word DOC | poi-scratchpad | |
| HDGF | Visio VSD | poi-scratchpad | |
| HPBF | Publisher PUB | poi-scratchpad | |
| HSMF | Outlook MSG | poi-scratchpad | |
| DDF | Escher common drawings | poi | |
| HWMF | WMF drawings | poi-scratchpad | |
| OpenXML4J | OOXML | poi-ooxml plus either poi-ooxml-lite or poi-ooxml-full | See notes below for differences between these options |
| XSSF | Excel XLSX | poi-ooxml | |
| XSLF | PowerPoint PPTX | poi-ooxml | |
| XWPF | Word DOCX | poi-ooxml | |
| XDGF | Visio VSDX | poi-ooxml | |
| Common SL | PowerPoint PPT and PPTX | poi-scratchpad and poi-ooxml | SL code is in the core POI jar, but implementations are in poi-scratchpad and poi-ooxml |
| Common SS | Excel XLS and XLSX | poi-ooxml | WorkbookFactory and friends all require poi-ooxml, not just core poi |
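
As a rough illustration of how the component table can be used, the sketch below looks up the Maven artifactId for a file by its extension. It uses only the JDK. The class `ArtifactLookup` is hypothetical, and treating the application types in the table (XLS, DOCX, and so on) as lower-case file extensions is an assumption made for this example.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Illustrative lookup, not part of Apache POI: file extension to Maven artifactId. */
final class ArtifactLookup {

    // Entries follow the component table; the extension spellings are assumed.
    private static final Map<String, String> ARTIFACT_BY_EXTENSION = Map.ofEntries(
        Map.entry("xls", "poi"),
        Map.entry("ppt", "poi-scratchpad"),
        Map.entry("doc", "poi-scratchpad"),
        Map.entry("vsd", "poi-scratchpad"),
        Map.entry("pub", "poi-scratchpad"),
        Map.entry("msg", "poi-scratchpad"),
        Map.entry("wmf", "poi-scratchpad"),
        Map.entry("xlsx", "poi-ooxml"),
        Map.entry("pptx", "poi-ooxml"),
        Map.entry("docx", "poi-ooxml"),
        Map.entry("vsdx", "poi-ooxml")
    );

    /** Returns the artifactId for the file's extension, or empty if it is not in the table. */
    static Optional<String> artifactFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty(); // no extension present
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(ARTIFACT_BY_EXTENSION.get(extension));
    }
}
```

For example, `ArtifactLookup.artifactFor("report.XLSX")` yields `poi-ooxml`. Note that code written against the common SS API needs poi-ooxml even for XLS files, as the last table row states, so a lookup like this is only a starting point.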

This table maps artifacts to jar file names. "version-yyyymmdd" is the POI version stamp. You can see the latest stamp on the downloads page.

| Maven artifactId | Prerequisites | JAR |
|---|---|---|
| poi | jcl-over-slf4j (commons-logging replacement), commons-codec, commons-collections, commons-math | poi-version-yyyymmdd.jar |
| poi-scratchpad | poi | poi-scratchpad-version-yyyymmdd.jar |
| poi-ooxml | poi, poi-ooxml-lite, commons-compress, SparseBitSet. For SVG support: batik-all, xml-apis-ext, xmlgraphics-commons. For PDF support: pdfbox, fontbox, rototor graphics2d | |
| poi-ooxml-lite | xmlbeans | poi-ooxml-lite-version-yyyymmdd.jar |
| poi-examples | poi, poi-scratchpad, poi-ooxml | poi-examples-version-yyyymmdd.jar |
| poi-ooxml-full (known as ooxml-schemas) | xmlbeans. For signing: bcpkix-jdk15on, bcprov-jdk15on, xmlsec, slf4j-api | |
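
The prerequisites column can be read as a small dependency graph. The sketch below, which uses only the JDK, walks that graph to list every POI artifact an artifact needs and builds a jar name from the "artifactId-version-yyyymmdd.jar" pattern. The class `PoiJarResolver` is hypothetical, third-party prerequisites such as xmlbeans are deliberately left out, and applying the naming pattern to every artifact is an assumption.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative resolver, not part of Apache POI: expands POI-internal prerequisites. */
final class PoiJarResolver {

    // Direct POI-internal prerequisites taken from the table; third-party jars omitted.
    private static final Map<String, List<String>> PREREQUISITES = Map.of(
        "poi", List.of(),
        "poi-scratchpad", List.of("poi"),
        "poi-ooxml-lite", List.of(),
        "poi-ooxml", List.of("poi", "poi-ooxml-lite"),
        "poi-examples", List.of("poi", "poi-scratchpad", "poi-ooxml")
    );

    /** Returns the artifact and all of its POI prerequisites, prerequisites first. */
    static Set<String> resolve(String artifactId) {
        Set<String> resolved = new LinkedHashSet<>();
        collect(artifactId, resolved);
        return resolved;
    }

    private static void collect(String artifactId, Set<String> resolved) {
        List<String> direct = PREREQUISITES.get(artifactId);
        if (direct == null) {
            throw new IllegalArgumentException("Unknown artifact: " + artifactId);
        }
        for (String prerequisite : direct) {
            collect(prerequisite, resolved); // depth-first, so dependencies come first
        }
        resolved.add(artifactId); // adding an already present entry keeps its position
    }

    /** Builds a jar file name; versionStamp stands for the "version-yyyymmdd" stamp. */
    static String jarName(String artifactId, String versionStamp) {
        return artifactId + "-" + versionStamp + ".jar";
    }
}
```

Calling `PoiJarResolver.resolve("poi-ooxml")` returns poi, poi-ooxml-lite, and poi-ooxml in that order, and each entry can then be passed to `jarName` together with the stamp shown on the downloads page.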


Note
Apache commons-math3 and commons-compress were added as dependencies in POI 4.0.0.
Zaxxer SparseBitSet was added as a dependency in POI 4.1.2.

poi-ooxml requires poi-ooxml-lite. This is a substantially smaller version of the poi-ooxml-full jar (ooxml-schemas-1.4.jar for POI 4.0.0, ooxml-schemas-1.3.jar for POI 3.14 up to POI 3.17, ooxml-schemas-1.1.jar for POI 3.7 up to POI 3.13, ooxml-schemas-1.0.jar for POI 3.5 and 3.6). The larger ooxml-schemas jar is normally only required for development. Similarly, the ooxml-security jar, which contains all of the classes relating to encryption and signing, is normally only required for development. A subset of its contents is in poi-ooxml-schemas. This jar is ooxml-security-1.1.jar for POI 3.14 onwards and ooxml-security-1.0.jar prior to that.

The OOXML jars require a StAX implementation, but now that Apache POI requires Java 8, that dependency is provided by the JRE and no additional StAX jars are required. The OOXML jars used to require DOM4J, but the code has since been changed to use JAXP, so no additional dom4j jars are required. If you have problems when using a non-Oracle JDK, see the POI FAQ.

The ooxml schemas jars are compiled with Apache XMLBeans 2.3, and so can be used at runtime with XMLBeans 3.0.0 or newer. Wherever possible, though, we recommend that you use XMLBeans 3.1.0 with Apache POI, which is the version now shipped in the binary release packages.


Small sample programs using the POI API are available in the src/examples directory of the source distribution.

All of the examples are included in POI distributions as the poi-examples artifact.

Running POI on other JVM languages

POI can be used from most languages that run on the JVM. For code examples, see the page "Running POI on other JVM languages".

Contributed Software

Besides the "official" components outlined above, some further software is distributed with POI. This is called "contributed" software. It is not explicitly recommended or even maintained by the POI team, but it might still be useful to you.

See the POI Ruby Bindings and other code in the poi-contrib module.

by Andrew C. Oliver, Rainer Klute, David Fisher
