public class OldExcelExtractor extends java.lang.Object implements POITextExtractor
A text extractor for old Excel files that are too old for HSSFWorkbook to handle. This includes Excel 95 and very old (pre-OLE2) Excel files, such as Excel 4 files.
Returns much (but not all) of the textual content of the file, suitable for indexing by something like Apache Lucene or for use by Apache Tika, but not really intended for display to the user.
Constructor and Description |
---|
OldExcelExtractor(DirectoryNode directory) |
OldExcelExtractor(java.io.File f) |
OldExcelExtractor(java.io.InputStream input) |
OldExcelExtractor(POIFSFileSystem fs) |
Modifier and Type | Method and Description |
---|---|
int | getBiffVersion() The BIFF version, largely corresponding to the Excel version. |
java.lang.Object | getDocument() |
java.io.Closeable | getFilesystem() |
int | getFileType() The kind of the file, one of BOFRecord.TYPE_WORKSHEET, BOFRecord.TYPE_CHART, BOFRecord.TYPE_EXCEL_4_MACRO or BOFRecord.TYPE_WORKSPACE_FILE. |
POITextExtractor | getMetadataTextExtractor() Returns another text extractor, which is able to output the textual content of the document metadata / properties, such as author and title. |
java.lang.String | getText() Retrieves the text contents of the file, as best we can for these old file formats. |
protected void | handleNumericCell(java.lang.StringBuilder text, double value) |
boolean | isCloseFilesystem() |
static void | main(java.lang.String[] args) |
void | setCloseFilesystem(boolean doCloseFilesystem) |
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface POITextExtractor: close
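As a rough illustration of how the constructors and accessors listed above fit together, the sketch below opens a file, reads the BIFF version and file type, and extracts the text. The class name `OldExcelTextDump` and the method `extractText` are hypothetical names chosen for this example, and it assumes Apache POI is on the classpath with the extractor in the `org.apache.poi.hssf.extractor` package.

```java
import java.io.File;
import java.io.IOException;

import org.apache.poi.hssf.extractor.OldExcelExtractor;

// Hypothetical helper class, not part of Apache POI.
public class OldExcelTextDump {

    /**
     * Extracts the text of an old-format Excel file, prefixed with a short
     * header giving the BIFF version and the raw file type value.
     */
    public static String extractText(File file) throws IOException {
        // try-with-resources calls close() on the extractor when the block exits
        try (OldExcelExtractor extractor = new OldExcelExtractor(file)) {
            String header = "BIFF version " + extractor.getBiffVersion()
                    + ", file type " + extractor.getFileType();
            return header + System.lineSeparator() + extractor.getText();
        }
    }
}
```

The same pattern should apply to the `InputStream`, `POIFSFileSystem` and `DirectoryNode` constructors, since only the way the extractor is created differs.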
public OldExcelExtractor(java.io.InputStream input) throws java.io.IOException
Throws: java.io.IOException
public OldExcelExtractor(java.io.File f) throws java.io.IOException
Throws: java.io.IOException
public OldExcelExtractor(POIFSFileSystem fs) throws java.io.IOException
Throws: java.io.IOException
public OldExcelExtractor(DirectoryNode directory) throws java.io.IOException
Throws: java.io.IOException
public static void main(java.lang.String[] args) throws java.io.IOException
Throws: java.io.IOException
public int getBiffVersion()
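Since the BIFF version only "largely" corresponds to the Excel version, a caller may want a readable label for the returned number. The helper below is a minimal sketch in plain Java. The class `BiffVersions` and its mapping are assumptions based on commonly documented BIFF and Excel version correspondences, not something this page specifies, and the set of values the extractor can actually return may be narrower.

```java
// Hypothetical helper, not part of Apache POI.
public class BiffVersions {

    /** Maps a BIFF version number to a human-readable label (assumed mapping). */
    public static String describe(int biffVersion) {
        switch (biffVersion) {
            case 2:
                return "BIFF2 (Excel 2.x)";
            case 3:
                return "BIFF3 (Excel 3.0)";
            case 4:
                return "BIFF4 (Excel 4.0)";
            case 5:
                return "BIFF5 (Excel 5.0 / Excel 95)";
            case 8:
                return "BIFF8 (Excel 97 and later)";
            default:
                // Anything else is reported as-is rather than guessed at
                return "unknown BIFF version " + biffVersion;
        }
    }
}
```

A caller could then log something like `BiffVersions.describe(extractor.getBiffVersion())` next to the extracted text.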
public int getFileType()
The kind of the file, one of BOFRecord.TYPE_WORKSHEET, BOFRecord.TYPE_CHART, BOFRecord.TYPE_EXCEL_4_MACRO or BOFRecord.TYPE_WORKSPACE_FILE
public java.lang.String getText()
Specified by: getText in interface POITextExtractor
protected void handleNumericCell(java.lang.StringBuilder text, double value)
public POITextExtractor getMetadataTextExtractor()
Specified by: getMetadataTextExtractor in interface POITextExtractor
public void setCloseFilesystem(boolean doCloseFilesystem)
Specified by: setCloseFilesystem in interface POITextExtractor
Parameters: doCloseFilesystem - true (default), if underlying resources/filesystem should be closed on POITextExtractor.close()
public boolean isCloseFilesystem()
Specified by: isCloseFilesystem in interface POITextExtractor
Returns: true, if resources/filesystem should be closed on POITextExtractor.close()
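The close-filesystem flag matters mostly when the caller owns the underlying filesystem and wants to keep using it after extraction. The sketch below assumes that, as described above, setting the flag to `false` makes `close()` leave the supplied `POIFSFileSystem` open. The class name `SharedFilesystemExample` and the method `extractText` are hypothetical.

```java
import java.io.IOException;

import org.apache.poi.hssf.extractor.OldExcelExtractor;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Hypothetical helper class, not part of Apache POI.
public class SharedFilesystemExample {

    /**
     * Extracts text from a filesystem that the caller opened and will close.
     * Assumption: with the flag set to false, closing the extractor does not
     * close the filesystem passed in.
     */
    public static String extractText(POIFSFileSystem fs) throws IOException {
        try (OldExcelExtractor extractor = new OldExcelExtractor(fs)) {
            // The caller owns fs, so ask the extractor not to close it
            extractor.setCloseFilesystem(false);
            return extractor.getText();
        }
    }
}
```

Leaving the flag at its default of `true` is the simpler choice when the extractor is the only user of the file.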
public java.io.Closeable getFilesystem()
Specified by: getFilesystem in interface POITextExtractor
public java.lang.Object getDocument()
Specified by: getDocument in interface POITextExtractor
Copyright 2021 The Apache Software Foundation or its licensors, as applicable.