Class LookAheadStream

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable
    Direct Known Subclasses:
    LookAheadStreamCaseInsensitive

    public class LookAheadStream
    extends java.io.InputStream
    Implements an InputStream that encapsulates another stream, but only up to the point at which a pre-defined end marker is identified in the stream. The stream then reports end of file and refuses to return any more bytes. Suppose that we create an instance of a LookAheadStream with the end marker END. For the following input: a b c d END e f g... the LookAheadStream will stop after reading the string END. Note that the end marker is consumed from the parent stream, so it will be missing from anything subsequently read from the parent.

    LookAheadStream allows the encoding to be changed between markers. This is handy for collections of web pages, which may use different encodings. However, the end marker must be obtainable using the default encoding.
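
    The behaviour described above can be sketched with a small stand-in class. This is an illustration written for this page, not the LookAheadStream source: MarkerBoundedStream and LookAheadSketch are hypothetical names, and the windowed comparison is only one plausible way to detect the marker. It shows the two documented effects: bytes before the marker are returned, and the marker itself is no longer in the parent stream afterwards.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;

// Hypothetical stand-in for illustration only; not the LookAheadStream source.
class MarkerBoundedStream extends InputStream {
    private final InputStream parent;
    private final byte[] marker;
    // Look-ahead window holding at most marker.length unsigned byte values.
    private final ArrayDeque<Integer> window = new ArrayDeque<>();
    private boolean parentExhausted = false;
    private boolean eof = false;

    MarkerBoundedStream(InputStream parent, byte[] marker) {
        this.parent = parent;
        this.marker = marker.clone();
    }

    @Override
    public int read() throws IOException {
        if (eof) {
            return -1;
        }
        // Top up the window so it can be compared against the marker.
        while (!parentExhausted && window.size() < marker.length) {
            int b = parent.read();
            if (b == -1) {
                parentExhausted = true;
            } else {
                window.addLast(b);
            }
        }
        // Stop at the marker, or when the parent has nothing left.
        if (window.isEmpty() || windowMatchesMarker()) {
            eof = true;
            return -1;
        }
        return window.removeFirst();
    }

    private boolean windowMatchesMarker() {
        if (window.size() != marker.length) {
            return false;
        }
        int i = 0;
        for (int b : window) {
            if (b != (marker[i++] & 0xFF)) {
                return false;
            }
        }
        return true;
    }
}

class LookAheadSketch {
    // Returns { text before the marker, text left in the parent afterwards }.
    static String[] splitAtMarker(String input, String endMarker) {
        InputStream parent =
            new ByteArrayInputStream(input.getBytes(StandardCharsets.US_ASCII));
        InputStream bounded =
            new MarkerBoundedStream(parent, endMarker.getBytes(StandardCharsets.US_ASCII));
        try {
            String before = drain(bounded);
            String rest = drain(parent);
            return new String[] { before, rest };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static String drain(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            sb.append((char) b);
        }
        return sb.toString();
    }
}
```

    Calling LookAheadSketch.splitAtMarker("a b c d END e f g", "END") yields "a b c d " for the bounded stream and " e f g" for what remains in the parent. With the real class, a new LookAheadStream would typically be created over the same parent to read the next marker-delimited section.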

    Author:
    Craig Macdonald, Vassilis Plachouras
    See Also:
    LookAheadReader
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected int[] Buffer
      The read ahead buffer
      protected int BufIndex
      index of the first entry in the buffer
      protected int BufLen
      How many bytes are in the read ahead buffer
      protected byte[] EndMarker
      the end marker that it is pre-scanning the stream for
      protected boolean EOF
      have we reached the end of the file
      protected int MarkerLen
      How long is the end marker
      protected java.io.InputStream ParentStream
      the parent stream that this object is looking ahead in
    • Constructor Summary

      Constructors 
      Constructor Description
      LookAheadStream​(java.io.InputStream parent, byte[] endMarker)
      Creates an instance of a LookAheadStream that will read from the given stream until the end marker byte pattern is found.
      LookAheadStream​(java.io.InputStream parent, java.lang.String endMarker)
      Creates an instance of a LookAheadStream that will read from the given stream until the end marker is found.
      LookAheadStream​(java.io.InputStream parent, java.lang.String endMarker, java.lang.String charSet)
      Creates an instance of a LookAheadStream that will read from the given stream until the end marker is found.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void close()
      Closes the current stream by setting the end-of-file flag to true.
      void mark​(int x)
      This method is not implemented.
      boolean markSupported()
      Support for marking is not implemented.
      int read()
      Read a byte from the parent stream, first checking that it doesn't form part of the end marker.
      int read​(byte[] cbuf)
      Read bytes into an array.
      int read​(byte[] cbuf, int offset, int len)
      Read bytes into a portion of an array.
      boolean ready()
      Indicates whether there are more bytes available to read from the stream.
      void reset()
      Reset the stream.
      long skip​(long n)
      Skips n bytes from the stream.
      • Methods inherited from class java.io.InputStream

        available, nullInputStream, readAllBytes, readNBytes, readNBytes, transferTo
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • ParentStream

        protected final java.io.InputStream ParentStream
        the parent stream that this object is looking ahead in
      • EndMarker

        protected final byte[] EndMarker
        the end marker that it is pre-scanning the stream for
      • MarkerLen

        protected final int MarkerLen
        How long is the end marker
      • BufLen

        protected int BufLen
        How many bytes are in the read ahead buffer
      • BufIndex

        protected int BufIndex
        index of the first entry in the buffer
      • Buffer

        protected final int[] Buffer
        The read ahead buffer
      • EOF

        protected boolean EOF
        have we reached the end of the file
    • Constructor Detail

      • LookAheadStream

        public LookAheadStream​(java.io.InputStream parent,
                               java.lang.String endMarker)
        Creates an instance of a LookAheadStream that will read from the given stream until the end marker is found. NB: This constructor assumes the default charset. It is not deprecated, but using LookAheadStream(InputStream parent, String endMarker, String charSet) instead is recommended.
        Parameters:
        parent - InputStream the stream used for reading the input.
        endMarker - String the marker which signifies the end of the stream.
      • LookAheadStream

        public LookAheadStream​(java.io.InputStream parent,
                               java.lang.String endMarker,
                               java.lang.String charSet)
                        throws java.io.UnsupportedEncodingException
        Creates an instance of a LookAheadStream that will read from the given stream until the end marker is found. The end marker is converted to bytes using the named charSet.
        Parameters:
        parent - InputStream the stream used for reading the input.
        endMarker - String the marker which signifies the end of the stream.
        charSet - String the name of the character set to use.
        Throws:
        java.io.UnsupportedEncodingException
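
        The choice of charSet matters because the stream is scanned for the marker's bytes, not its characters. The sketch below uses only String.getBytes(String) from the JDK to show how the same marker produces different byte patterns, and how an unknown charset name surfaces as UnsupportedEncodingException. That the constructor converts the marker in this way is an assumption, though it is consistent with the declared exception; MarkerEncodingSketch and the charset name X-NO-SUCH-CHARSET are made up for the example.

```java
import java.io.UnsupportedEncodingException;

class MarkerEncodingSketch {
    // Assumed equivalent of what the constructor does with its arguments.
    static byte[] markerBytes(String endMarker, String charSet)
            throws UnsupportedEncodingException {
        return endMarker.getBytes(charSet);
    }

    // Length of the marker in bytes, or -1 if the charset name is unknown.
    static int markerLength(String endMarker, String charSet) {
        try {
            return markerBytes(endMarker, charSet).length;
        } catch (UnsupportedEncodingException e) {
            return -1;
        }
    }
}
```

        For the marker </DOC>, markerLength returns 6 with UTF-8 and 12 with UTF-16LE, so a marker encoded with the wrong charset would never match the bytes actually present in the stream.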
      • LookAheadStream

        public LookAheadStream​(java.io.InputStream parent,
                               byte[] endMarker)
        Creates an instance of a LookAheadStream that will read from the given stream until the end marker byte pattern is found.
        Parameters:
        parent - InputStream the stream used for reading the input.
        endMarker - byte[] the byte pattern which signifies the end of the stream.
    • Method Detail

      • read

        public int read()
                 throws java.io.IOException
        Read a byte from the parent stream, first checking that it doesn't form part of the end marker.
        Specified by:
        read in class java.io.InputStream
        Returns:
        int the code of the read byte, or -1 if the end of the stream has been reached.
        Throws:
        java.io.IOException - if there is any error while reading from the stream.
      • read

        public int read​(byte[] cbuf)
                 throws java.io.IOException
        Read bytes into an array. This method will read 100 bytes or the array length, stopping early if the end of the stream is reached. NB: Uses read() internally.
        Overrides:
        read in class java.io.InputStream
        Parameters:
        cbuf - Destination buffer
        Returns:
        The number of bytes read, or -1 if the end of the stream has been reached.
        Throws:
        java.io.IOException - If an I/O error occurs
      • read

        public int read​(byte[] cbuf,
                        int offset,
                        int len)
                 throws java.io.IOException
        Read bytes into a portion of an array. It will try to read the specified number of bytes into the buffer. NB: Implemented in terms of read().
        Overrides:
        read in class java.io.InputStream
        Parameters:
        cbuf - Destination buffer
        offset - Offset at which to start storing bytes
        len - Maximum number of bytes to read
        Returns:
        The number of bytes read, or -1 if the end of the stream has been reached
        Throws:
        java.io.IOException - If an I/O error occurs
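
        Because the return value may be smaller than len, and is -1 once the end marker (or the end of the parent) has been reached, callers usually read in a loop. The following is a minimal sketch of that loop using only JDK classes; ReadLoopSketch is a hypothetical helper, and a ByteArrayInputStream stands in for a LookAheadStream so that the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ReadLoopSketch {
    // Copies everything the stream returns, chunkSize bytes at a time.
    static byte[] readAll(InputStream in, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] cbuf = new byte[chunkSize];
        try {
            int n;
            // -1 signals end of stream; n may be less than cbuf.length.
            while ((n = in.read(cbuf, 0, cbuf.length)) != -1) {
                out.write(cbuf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static String demo() {
        InputStream in =
            new ByteArrayInputStream("hello world".getBytes(StandardCharsets.US_ASCII));
        return new String(readAll(in, 4), StandardCharsets.US_ASCII);
    }
}
```

        Only the first n bytes of cbuf are valid after each call, which is why the loop writes cbuf from offset 0 for n bytes rather than the whole array.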
      • reset

        public void reset()
                   throws java.io.IOException
        Reset the stream. Attempts to reset it in some way appropriate to the particular stream, for example by positioning it to its starting point. Not all input streams support the reset() operation. Use at your own risk.
        Overrides:
        reset in class java.io.InputStream
        Throws:
        java.io.IOException - if thrown by ParentStream.reset().
      • skip

        public long skip​(long n)
                  throws java.io.IOException
        Skips n bytes from the stream. If the end of the stream is reached before n bytes have been read, the method returns early. NB: This method uses read() internally.
        Overrides:
        skip in class java.io.InputStream
        Parameters:
        n - long the number of bytes to skip.
        Returns:
        long the number of bytes skipped.
        Throws:
        java.io.IOException - if there is any error while reading from the stream.
      • ready

        public boolean ready()
                      throws java.io.IOException
        Indicates whether there are more bytes available to read from the stream.
        Returns:
        boolean true if there are more bytes available for reading, otherwise it returns false.
        Throws:
        java.io.IOException - if there is any error while reading from the stream.
      • close

        public void close()
                   throws java.io.IOException
        Closes the current stream by setting the end-of-file flag to true. Does NOT close the wrapped stream.
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Overrides:
        close in class java.io.InputStream
        Throws:
        java.io.IOException
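
        Since close() leaves the wrapped stream open, the caller remains responsible for closing the parent. The sketch below illustrates that contract with hypothetical stand-in classes (NonClosingView, CloseTrackingStream and CloseSketch are not part of this API): the view is closed by try-with-resources, the parent stays open, and the parent is then closed explicitly.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Parent stream that records whether close() was called on it.
class CloseTrackingStream extends ByteArrayInputStream {
    boolean closed = false;

    CloseTrackingStream(byte[] data) {
        super(data);
    }

    @Override
    public void close() throws IOException {
        closed = true;
        super.close();
    }
}

// Hypothetical stand-in mimicking the documented close() behaviour.
class NonClosingView extends InputStream {
    private final InputStream parent;
    private boolean eof = false;

    NonClosingView(InputStream parent) {
        this.parent = parent;
    }

    @Override
    public int read() throws IOException {
        return eof ? -1 : parent.read();
    }

    @Override
    public void close() {
        eof = true; // only flags end of file; the parent is left untouched
    }
}

class CloseSketch {
    // Returns { parent closed after the view closes, parent closed after explicit close }.
    static boolean[] run() {
        CloseTrackingStream parent = new CloseTrackingStream(new byte[] {1, 2, 3});
        try {
            try (NonClosingView view = new NonClosingView(parent)) {
                view.read();
            }
            boolean afterView = parent.closed;
            parent.close();
            return new boolean[] { afterView, parent.closed };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

        CloseSketch.run() returns { false, true }: closing the view does not propagate to the parent, so the parent has to be closed separately once all sections have been read.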
      • markSupported

        public boolean markSupported()
        Support for marking is not implemented.
        Overrides:
        markSupported in class java.io.InputStream
        Returns:
        boolean false.
      • mark

        public void mark​(int x)
        This method is not implemented.
        Overrides:
        mark in class java.io.InputStream