Class POIDocument

    • Constructor Detail

      • POIDocument

        public POIDocument(java.lang.String filename,
                           java.io.InputStream docStream,
                           Tokeniser tokeniser)
        Constructs a new POIDocument object for the file represented by docStream.
      • POIDocument

        public POIDocument(java.io.InputStream docStream,
                           java.util.Map<java.lang.String,java.lang.String> docProperties,
                           Tokeniser tok)
        Constructs a new POIDocument object for the file represented by docStream.
        Parameters:
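        docStream - the input stream of the file to be read.
        docProperties - the properties associated with the document.
        tok - the Tokeniser used to obtain terms from the document text.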
    • Method Detail

      • getExtractor

        protected org.apache.poi.POITextExtractor getExtractor(java.lang.String filename,
                                                               java.io.InputStream docStream)
                                                        throws java.io.IOException
        Throws:
        java.io.IOException
      • getReader

        protected java.io.Reader getReader(java.io.InputStream docStream)
        Converts the docStream InputStream parameter into a Reader which contains plain text, and from which terms can be obtained. On failure, returns null and sets EOD to true, so no terms can be read from this object.
        Overrides:
        getReader in class FileDocument
        Parameters:
        docStream - the input stream to be accessed as a plain-text reader.
        Returns:
        a reader over the plain text extracted from the given input stream, or null on failure.
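
        The failure contract described for getReader (return null and set EOD to true) can be illustrated with a minimal sketch. It does not use the Terrier or Apache POI classes: ReaderContractSketch and TextSource are hypothetical stand-ins, and the public EOD field is an assumption modelled on the description above, not the actual implementation.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in, NOT the Terrier class: it only illustrates the
// "return null and set EOD on failure" contract described for getReader.
class ReaderContractSketch {

    /** Hypothetical stand-in for a text extractor (assumption, not the POI API). */
    interface TextSource {
        String extractText(InputStream in) throws IOException;
    }

    private final TextSource source;

    /** End-of-document flag; once true, no terms can be read. */
    boolean EOD = false;

    ReaderContractSketch(TextSource source) {
        this.source = source;
    }

    /** Returns a reader over the extracted plain text, or null on failure. */
    Reader getReader(InputStream docStream) {
        try {
            String text = source.extractText(docStream);
            return new BufferedReader(new StringReader(text));
        } catch (IOException e) {
            // Mirror the documented behaviour: flag end-of-document, return null.
            EOD = true;
            return null;
        }
    }

    /** Extractor that decodes the whole stream as UTF-8 text. */
    static ReaderContractSketch utf8() {
        return new ReaderContractSketch(
                in -> new String(in.readAllBytes(), StandardCharsets.UTF_8));
    }

    /** Extractor that always fails, to exercise the error path. */
    static ReaderContractSketch failing() {
        return new ReaderContractSketch(in -> {
            throw new IOException("simulated extraction failure");
        });
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "plain text".getBytes(StandardCharsets.UTF_8));

        ReaderContractSketch ok = utf8();
        try (BufferedReader r = new BufferedReader(ok.getReader(in))) {
            System.out.println(r.readLine() + " / EOD=" + ok.EOD);
        }

        ReaderContractSketch bad = failing();
        Reader none = bad.getReader(in);
        System.out.println((none == null) + " / EOD=" + bad.EOD);
    }
}
```

        Under this contract a caller checks for a null reader (or the EOD flag) before attempting to read terms.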