Class CollectionResultSet

  • All Implemented Interfaces:
    java.io.Serializable, ResultSet
    Direct Known Subclasses:
    QueryResultSet

    public class CollectionResultSet
    extends java.lang.Object
    implements ResultSet, java.io.Serializable
    This class implements the ResultSet interface and models the set of all documents in the collection. It encapsulates two arrays, one for the docids and one for the scores. It also holds the occurrences array, which records which query terms appear in each of the retrieved documents. The metadata-related methods are empty.
    This class is only used internally by the Matching class and the classes that extend it, because the arrays for the docids and scores contain one entry for each document in the collection. Therefore, instantiating an object of this class can be expensive. Access to the retrieved documents is provided by the method getResultSet, which returns a cropped result set.
    Author:
    Vassilis Plachouras
    See Also:
    Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected boolean arraysInitialised
      A boolean flag indicating whether the arrays of docids and scores have been initialised (memory allocated for them) or not.
      protected int[] docids
      The array that stores the document ids.
      protected int exactResultSize
      The number of retrieved documents.
      protected java.util.concurrent.locks.Lock lock
      A lock for enabling access of the result set by different threads
      protected short[] occurrences
      The occurrences of the query terms in a document.
      protected int resultSize
      The number of documents that have been ranked and sorted according to their scores.
      protected double[] scores
      An array holding the scores of documents in the collection.
      protected int statusCode
      A status code for the result set.
    • Constructor Summary

      Constructors 
      Constructor Description
      CollectionResultSet​(int numberOfDocuments)
      A default constructor for the result set with a given number of documents.
      CollectionResultSet​(int[] _docids, double[] _scores, short[] _occurrences)
      Construct a resultset from the following components
      CollectionResultSet​(ResultSet resultSet)
      A default constructor for the result set with a given instance of the result set.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void addMetaItem​(java.lang.String name, int docid, java.lang.String value)
      Empty method.
      void addMetaItems​(java.lang.String name, java.lang.String[] values)
      Empty method.
      java.lang.String[][] allMetaItems()  
      int[] getDocids()
      Returns the documents ids after retrieval
      int getExactResultSize()
      Returns the exact size of the result set.
      java.util.concurrent.locks.Lock getLock()
      Returns the lock for enabling the modification of the result set by more than one thread.
      java.lang.String getMetaItem​(java.lang.String name, int rank)
      Empty method.
      java.lang.String[] getMetaItems​(java.lang.String name)
      Empty method.
      java.lang.String[] getMetaKeys()
      Returns the names of the meta keys which this resultset has
      short[] getOccurrences()
      Returns the occurrences array.
      ResultSet getResultSet​(int[] positions)
      Extracts a subset of the resultset given by the list parameter, which contains a list of positions in the resultset that should be saved.
      ResultSet getResultSet​(int start, int length)
      Crops the existing result set and extracts a subset from the given starting point to the ending point.
      int getResultSize()
      Returns the effective size of the result set.
      double[] getScores()
      Returns the documents scores after retrieval
      int getStatusCode()
      Returns the status code for the current result set.
      boolean hasMetaItems​(java.lang.String name)
      Returns true if the resultset already has a set of metaitems with the specified name.
      void initialise()
      Initialises the arrays prior to retrieval.
      void initialise​(double[] scs)
      Initialises the result set with the given scores.
      void setExactResultSize​(int newExactResultSize)
      Sets the exact size of the result set, that is the number of documents that would be retrieved, if the result set was not truncated.
      void setResultSize​(int newResultSize)
      Sets the effective size of the result set, that is the number of documents to be sorted after retrieval.
      void setStatusCode​(int _statusCode)
      Sets the status code for the current result set.
      void sort()
      Sorts all documents in this resultset by descending score
      void sort​(int topDocs)
      Sorts this resultset so that the top topDocs documents come first.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • docids

        protected int[] docids
        The array that stores the document ids.
      • scores

        protected double[] scores
        An array holding the scores of documents in the collection.
      • occurrences

        protected short[] occurrences
        The occurrences of the query terms in a document. This allows the use of Boolean operators and the filtering of documents. If the i-th query term appears in the j-th document, then the i-th bit of occurrences[j] is set, otherwise it is zero. Using a 2-byte integer (a short) allows checking for the occurrence of up to 16 query terms.
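        The bit layout described for this field can be illustrated with a minimal sketch. The class OccurrenceBits and its methods are hypothetical helpers written for this example; they are not part of the API documented here.

```java
/** Hypothetical helper illustrating the occurrences bit layout; not part of the API. */
final class OccurrenceBits {
    /** Returns a copy of mask with the bit for query term termIndex (0..15) set. */
    static short mark(short mask, int termIndex) {
        return (short) (mask | (1 << termIndex));
    }

    /** Returns true if the bit for query term termIndex (0..15) is set in mask. */
    static boolean contains(short mask, int termIndex) {
        return ((mask >>> termIndex) & 1) != 0;
    }

    /** Returns true if all of the first termCount query terms are marked (a Boolean AND). */
    static boolean containsAll(short mask, int termCount) {
        int required = (1 << termCount) - 1;
        return (mask & required) == required;
    }
}
```

        A filter that keeps only documents matching every query term could, under these assumptions, test containsAll(occurrences[j], numberOfQueryTerms) for each document j.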
      • arraysInitialised

        protected boolean arraysInitialised
        A boolean flag indicating whether the arrays of docids and scores have been initialised (memory allocated for them) or not.
      • resultSize

        protected int resultSize
        The number of documents that have been ranked and sorted according to their scores.
      • exactResultSize

        protected int exactResultSize
        The number of retrieved documents. This may be higher than resultSize, and corresponds to the number of documents that contain at least one query term.
      • lock

        protected java.util.concurrent.locks.Lock lock
        A lock for enabling access of the result set by different threads
      • statusCode

        protected int statusCode
        A status code for the result set.
    • Constructor Detail

      • CollectionResultSet

        public CollectionResultSet​(int[] _docids,
                                   double[] _scores,
                                   short[] _occurrences)
        Construct a resultset from the following components
      • CollectionResultSet

        public CollectionResultSet​(int numberOfDocuments)
        A default constructor for the result set with a given number of documents.
        Parameters:
        numberOfDocuments - the number of documents contained in the result set.
      • CollectionResultSet

        public CollectionResultSet​(ResultSet resultSet)
        A default constructor for the result set with a given instance of the result set. The constructor creates a clone of the given result set.
        Parameters:
        resultSet - The given result set.
    • Method Detail

      • getLock

        public java.util.concurrent.locks.Lock getLock()
        Returns the lock for enabling the modification of the result set by more than one thread.
        Specified by:
        getLock in interface ResultSet
        Returns:
        the lock.
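        A minimal sketch of how a caller might guard a modification with a java.util.concurrent.locks.Lock follows. LockedScores is a hypothetical stand-in class, and the use of ReentrantLock is an assumption; this page does not state which Lock implementation the result set uses.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a result set that exposes a lock; not the real class. */
final class LockedScores {
    // Assumption: ReentrantLock is used here only for illustration.
    private final Lock lock = new ReentrantLock();
    private final double[] scores;

    LockedScores(int numberOfDocuments) {
        this.scores = new double[numberOfDocuments];
    }

    Lock getLock() {
        return lock;
    }

    double[] getScores() {
        return scores;
    }

    /** Adds delta to the score of one document while holding the lock. */
    static void addScore(LockedScores rs, int docid, double delta) {
        Lock l = rs.getLock();
        l.lock();
        try {
            rs.getScores()[docid] += delta;
        } finally {
            l.unlock(); // always release, even if the update throws
        }
    }
}
```

        Releasing the lock in a finally block keeps other threads from blocking indefinitely if the update fails.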
      • getStatusCode

        public int getStatusCode()
        Returns the status code for the current result set.
        Specified by:
        getStatusCode in interface ResultSet
        Returns:
        an integer status code: 0 stands for success, 1 for an empty result set, 2 for a wrong setting of the start/end parameters, and 3 for a query timeout. The values assigned to the status codes increase with the severity of the status.
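        Because the codes are ordered by severity, the more severe of two statuses is simply the larger value. The sketch below illustrates this; the class StatusCodes and its constant names are hypothetical and are not constants defined by this API.

```java
/** Hypothetical names for the status codes listed above; not constants defined by the API. */
final class StatusCodes {
    static final int SUCCESS = 0;
    static final int EMPTY_RESULT_SET = 1;
    static final int WRONG_START_END = 2;
    static final int QUERY_TIMEOUT = 3;

    /** Since codes grow with severity, the more severe of two codes is the larger one. */
    static int mostSevere(int a, int b) {
        return Math.max(a, b);
    }

    /** Maps a code to a short human-readable message. */
    static String describe(int code) {
        switch (code) {
            case SUCCESS:
                return "success";
            case EMPTY_RESULT_SET:
                return "empty result set";
            case WRONG_START_END:
                return "wrong setting of start/end parameters";
            case QUERY_TIMEOUT:
                return "query timeout";
            default:
                return "unknown status code " + code;
        }
    }
}
```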
      • setStatusCode

        public void setStatusCode​(int _statusCode)
        Sets the status code for the current result set.
        Specified by:
        setStatusCode in interface ResultSet
        Parameters:
        _statusCode - the code to return to the user
      • getDocids

        public int[] getDocids()
        Returns the documents ids after retrieval
        Specified by:
        getDocids in interface ResultSet
        Returns:
        the docids
      • getResultSize

        public int getResultSize()
        Returns the effective size of the result set.
        Specified by:
        getResultSize in interface ResultSet
        Returns:
        int the effective size of the result set
      • getOccurrences

        public short[] getOccurrences()
        Returns the occurrences array.
        Specified by:
        getOccurrences in interface ResultSet
        Returns:
        short[] the occurrences array.
      • getExactResultSize

        public int getExactResultSize()
        Returns the exact size of the result set.
        Specified by:
        getExactResultSize in interface ResultSet
        Returns:
        int the exact size of the result set
      • getScores

        public double[] getScores()
        Returns the documents scores after retrieval
        Specified by:
        getScores in interface ResultSet
        Returns:
        score list in same order as docids array
      • initialise

        public void initialise()
        Initialises the arrays prior to retrieval.
        Specified by:
        initialise in interface ResultSet
      • initialise

        public void initialise​(double[] scs)
        Initialises the result set with the given scores. If the length of the given array differs from the length of the internal arrays, the internal arrays are re-allocated.
        Specified by:
        initialise in interface ResultSet
        Parameters:
        scs - double[] the scores to initialise the result set with.
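        The re-allocation rule described for this method could look roughly like the following sketch. ScoreArrays is a hypothetical class, and copying the given scores into the internal array is an assumption made for illustration; this page does not describe the actual implementation.

```java
/** Hypothetical sketch of re-allocating parallel arrays; not the real implementation. */
final class ScoreArrays {
    int[] docids;
    double[] scores;
    short[] occurrences;

    ScoreArrays(int numberOfDocuments) {
        allocate(numberOfDocuments);
    }

    /** Allocates the three parallel arrays with n entries each. */
    private void allocate(int n) {
        docids = new int[n];
        scores = new double[n];
        occurrences = new short[n];
    }

    /** Re-allocates only when the length differs, then copies the scores (an assumption). */
    void initialise(double[] scs) {
        if (scs.length != scores.length) {
            allocate(scs.length);
        }
        System.arraycopy(scs, 0, scores, 0, scs.length);
    }
}
```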
      • setResultSize

        public void setResultSize​(int newResultSize)
        Sets the effective size of the result set, that is the number of documents to be sorted after retrieval.
        Specified by:
        setResultSize in interface ResultSet
        Parameters:
        newResultSize - int the effective size of the result set.
      • setExactResultSize

        public void setExactResultSize​(int newExactResultSize)
        Sets the exact size of the result set, that is the number of documents that would be retrieved, if the result set was not truncated.
        Specified by:
        setExactResultSize in interface ResultSet
        Parameters:
        newExactResultSize - int the exact size of the result set.
      • addMetaItem

        public void addMetaItem​(java.lang.String name,
                                int docid,
                                java.lang.String value)
        Empty method.
        Specified by:
        addMetaItem in interface ResultSet
        Parameters:
        name - the name of the metadata type. For example, it can be the url for adding the URLs of documents.
        docid - the document identifier of the document.
        value - the metadata value.
      • addMetaItems

        public void addMetaItems​(java.lang.String name,
                                 java.lang.String[] values)
        Empty method.
        Specified by:
        addMetaItems in interface ResultSet
        Parameters:
        name - the name of the metadata type. For example, it can be the url for adding the URLs of documents.
        values - the metadata values.
      • getMetaItem

        public java.lang.String getMetaItem​(java.lang.String name,
                                            int rank)
        Empty method.
        Specified by:
        getMetaItem in interface ResultSet
        Parameters:
        name - the name of the metadata type.
        rank - the rank of the document.
        Returns:
        a string with the metadata information, or null if the metadata is not available.
      • getMetaItems

        public java.lang.String[] getMetaItems​(java.lang.String name)
        Empty method.
        Specified by:
        getMetaItems in interface ResultSet
        Parameters:
        name - the name of the metadata type.
        Returns:
        an array of strings with the metadata information, or null if the metadata is not available.
      • allMetaItems

        public java.lang.String[][] allMetaItems()
        Specified by:
        allMetaItems in interface ResultSet
      • getResultSet

        public ResultSet getResultSet​(int start,
                                      int length)
        Crops the existing result set and extracts a subset from the given starting point to the ending point.
        Specified by:
        getResultSet in interface ResultSet
        Parameters:
        start - the beginning of the subset.
        length - the end of the subset.
        Returns:
        ResultSet a subset of the current result set.
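        Cropping parallel arrays can be sketched as follows. CropSketch is a hypothetical helper, and the sketch assumes, as the parameter name suggests, that length is the number of entries to keep; the clamping behaviour is likewise an assumption, not documented behaviour.

```java
import java.util.Arrays;

/** Hypothetical helper showing how parallel arrays might be cropped; not part of the API. */
final class CropSketch {
    /** Returns up to length docids starting at start, clamped to the array bounds. */
    static int[] crop(int[] docids, int start, int length) {
        int from = Math.max(0, Math.min(start, docids.length));
        int count = Math.max(0, Math.min(length, docids.length - from));
        return Arrays.copyOfRange(docids, from, from + count);
    }

    /** Same cropping rule applied to the scores array, so both stay aligned. */
    static double[] crop(double[] scores, int start, int length) {
        int from = Math.max(0, Math.min(start, scores.length));
        int count = Math.max(0, Math.min(length, scores.length - from));
        return Arrays.copyOfRange(scores, from, from + count);
    }
}
```

        Applying the same start and length to both arrays keeps each docid paired with its score.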
      • getResultSet

        public ResultSet getResultSet​(int[] positions)
        Extracts a subset of the resultset given by the list parameter, which contains a list of positions in the resultset that should be saved.
        NB:The metadata hashtable is NOT reduced.
        Specified by:
        getResultSet in interface ResultSet
        Parameters:
        positions - int[] the list of elements in the current list that should be kept.
        Returns:
        a subset of the current result set specified by the list.
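        Selecting entries by position can be sketched in the same spirit. SelectSketch is a hypothetical helper that assumes every position is a valid index into the arrays; it is not the implementation used by this class.

```java
/** Hypothetical helper showing selection by position; not part of the API. */
final class SelectSketch {
    /** Returns the docids found at the given positions, in the order the positions are listed. */
    static int[] select(int[] docids, int[] positions) {
        int[] out = new int[positions.length];
        for (int i = 0; i < positions.length; i++) {
            out[i] = docids[positions[i]];
        }
        return out;
    }

    /** Same selection applied to the scores array, so both stay aligned. */
    static double[] select(double[] scores, int[] positions) {
        double[] out = new double[positions.length];
        for (int i = 0; i < positions.length; i++) {
            out[i] = scores[positions[i]];
        }
        return out;
    }
}
```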
      • hasMetaItems

        public boolean hasMetaItems​(java.lang.String name)
        Returns true if the resultset already has a set of metaitems with the specified name.
        Specified by:
        hasMetaItems in interface ResultSet
        Parameters:
        name - of the desired metaitem set
        Returns:
        true if the set exists.
      • getMetaKeys

        public java.lang.String[] getMetaKeys()
        Returns the names of the meta keys which this resultset has
        Specified by:
        getMetaKeys in interface ResultSet
        Returns:
        the list of key names
      • sort

        public void sort()
        Description copied from interface: ResultSet
        Sorts all documents in this resultset by descending score
        Specified by:
        sort in interface ResultSet
      • sort

        public void sort​(int topDocs)
        Description copied from interface: ResultSet
        Sorts this resultset so that the top topDocs documents come first. The order of the remaining documents is undefined.
        Specified by:
        sort in interface ResultSet
        Parameters:
        topDocs - number of documents to top-rank
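        The idea of ranking only the top entries can be illustrated with a partial selection sort over the parallel arrays. TopSortSketch is a hypothetical helper, and selection sort is chosen here only for clarity; this page does not state which sorting algorithm the class uses.

```java
/** Hypothetical partial sort over parallel arrays; not the algorithm used by the API. */
final class TopSortSketch {
    /** Moves the topDocs highest-scoring entries to the front, in descending score order. */
    static void sortTop(int[] docids, double[] scores, int topDocs) {
        int n = scores.length;
        int k = Math.min(topDocs, n);
        for (int i = 0; i < k; i++) {
            // Find the highest remaining score in positions i..n-1.
            int best = i;
            for (int j = i + 1; j < n; j++) {
                if (scores[j] > scores[best]) {
                    best = j;
                }
            }
            // Swap both arrays together so docids and scores stay aligned.
            double s = scores[i];
            scores[i] = scores[best];
            scores[best] = s;
            int d = docids[i];
            docids[i] = docids[best];
            docids[best] = d;
        }
    }
}
```

        Only the first topDocs positions are guaranteed to be ordered afterwards, which matches the statement that the order of the remaining documents is undefined.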