Apache POI - HDGF and XDGF - Java API To Access Microsoft Visio Format Files

Overview

HDGF is the POI Project's pure Java implementation of the Visio binary (VSD) file format. XDGF is the POI Project's pure Java implementation of the Visio XML (VSDX) file format.
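
As a rough illustration of the split between the two formats, the sketch below guesses which container a file uses from its first bytes. It relies on two facts about the formats: binary VSD files are stored in OLE2 compound files, and VSDX files are ZIP-based packages. The class `VisioContainerSniffer` and its `Container` enum are hypothetical names invented for this example and are not part of POI. A match only identifies the container, since other document types share the same signatures. The sketch assumes Java 11 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper (not part of POI): guesses the container format from the leading bytes. */
final class VisioContainerSniffer {

    enum Container { OLE2, ZIP, UNKNOWN }

    // OLE2 compound file signature, used by binary .vsd files (and other legacy Office files).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature "PK\003\004", used by packages such as .vsdx.
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    /** Classifies a buffer holding at least the first few bytes of a file. */
    static Container sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Container.OLE2; // candidate for HDGF
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Container.ZIP;  // candidate for XDGF
        }
        return Container.UNKNOWN;
    }

    /** Reads only the leading bytes of the file and classifies them. */
    static Container sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return sniff(in.readNBytes(OLE2_MAGIC.length));
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(data, 0, prefix.length, prefix, 0, prefix.length);
    }
}
```

A caller could use `VisioContainerSniffer.sniff(path)` to choose between the HDGF and XDGF code paths before opening the file, treating `UNKNOWN` as an unsupported input.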

Currently, HDGF provides a low-level, read-only API for accessing Visio documents. It also provides a way to extract the textual content from a file.

At this time, there is no usermodel API or similar, only low-level access to the streams, chunks and chunk commands. Users are advised to check the unit tests to see how everything works, and to read the documentation supplied with vsdump to get a feel for how Visio files are structured.

To get a feel for the contents of a file, and to track down where data of interest is stored, HDGF comes with VSDDumper, which prints out the contents of the file. Users should also make use of vsdump to probe the structure of files.

Note
This code currently lives in the scratchpad area of the POI SVN repository. To use this component, ensure you have the Scratchpad JAR on your classpath, or a dependency defined on the poi-scratchpad artifact; the main POI JAR is not enough. See the POI Components Map for more details.

Steps required for write support

Currently, HDGF is only able to read Visio files; it is not able to write them back out again. We believe the following steps would need to be taken to implement write support.

  1. Re-write the decompression support in LZW4HDGF as HDGFLZW, which will be much better documented, and also under the ASL. Completed October 2007.
  2. Add compression support to HDGFLZW. In progress: works for small streams, but encoding goes wrong on larger ones.
  3. Have HDGF just write back the raw bytes it read in, and have a test to ensure the file is unchanged.
  4. Have HDGF generate the bytes to write out from the Stream stores, using the compressed data as appropriate, without re-compressing. Plus a test to ensure the file is unchanged.
  5. Have HDGF generate the bytes to write out from the Stream stores, re-compressing any streams that were decompressed. Plus a test to ensure the file is unchanged.
  6. Have HDGF re-generate the offsets in pointers for the locations of the streams. Plus a test to ensure the file is unchanged.
  7. Have HDGF re-generate the bytes for all the chunks, from the chunk commands. Tests to ensure the chunks are serialized properly, and then that the file is unchanged.
  8. Alter the data of one command, but keep it the same length, and check that Visio can open the file when written out.
  9. Alter the data of one command, to a new length, and check that Visio can open the file when written out.
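
The byte-level checks that recur in the steps above can be sketched with the standard library alone. The class `RoundTripChecks` and its methods are hypothetical names made up for this illustration and are not part of HDGF; the `roundTrip` argument stands in for whatever compress-then-decompress or read-then-write path is under test. `firstDifference` covers the "file is unchanged" tests, `patchSameLength` illustrates the same-length edit of step 8, and `smallestFailingSize` is one possible way to narrow down a size-dependent encoding bug like the one noted in step 2. The sketch assumes Java 9 or later.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.UnaryOperator;

/** Hypothetical test helpers (not part of POI) for the byte-level checks in the steps above. */
final class RoundTripChecks {

    /** Returns the index of the first differing byte, or -1 if both arrays are identical. */
    static int firstDifference(byte[] original, byte[] rewritten) {
        return Arrays.mismatch(original, rewritten);
    }

    /**
     * Returns a copy of {@code file} with {@code replacement} written at {@code offset}.
     * The overall length is unchanged, so no stream offsets or pointers need updating.
     */
    static byte[] patchSameLength(byte[] file, int offset, byte[] replacement) {
        if (offset < 0 || offset + replacement.length > file.length) {
            throw new IllegalArgumentException("patch does not fit inside the file");
        }
        byte[] patched = file.clone();
        System.arraycopy(replacement, 0, patched, offset, replacement.length);
        return patched;
    }

    /**
     * Feeds inputs of doubling size (1, 2, 4, ...) through {@code roundTrip} and returns
     * the first size whose output differs from its input, or -1 if every size up to
     * {@code maxSize} survives. Inputs are pseudo-random, so they exercise correctness
     * rather than realistic compression ratios.
     */
    static int smallestFailingSize(UnaryOperator<byte[]> roundTrip, int maxSize) {
        Random random = new Random(42); // fixed seed keeps failures reproducible
        for (int size = 1; size > 0 && size <= maxSize; size *= 2) {
            byte[] input = new byte[size];
            random.nextBytes(input);
            // Pass a copy so a codec that mutates its argument cannot hide a mismatch.
            if (firstDifference(input, roundTrip.apply(input.clone())) != -1) {
                return size;
            }
        }
        return -1;
    }
}
```

In a unit test, the bytes read from a sample file and the bytes produced by the writer would be passed to `firstDifference`, with any result other than -1 pointing at the first offset that changed.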

by POI Developers

 