Class TRECResultsMatching

  • All Implemented Interfaces:
    Matching

    public class TRECResultsMatching
    extends java.lang.Object
    implements Matching
    A matching implementation that retrieves results from a TREC result file rather than from the current index. The result file must be compatible with trec_eval, i.e., each line should have the following format:
    queryID Q0 docno rank score label
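
    As a minimal sketch of how one such line could be split into fields, the following assumes the trec_eval column order (query id, iteration, docno, rank, score, run label) and a whitespace pattern in the spirit of SPLIT_SPACE_PLUS. The class TrecLineSketch, the Entry holder and the exact regular expression are hypothetical names chosen for illustration; this is not Terrier's own parsing code.

```java
import java.util.regex.Pattern;

final class TrecLineSketch {
    // One or more whitespace characters (assumed stand-in for SPLIT_SPACE_PLUS).
    static final Pattern SPLIT_SPACE_PLUS = Pattern.compile("\\s+");

    /** Hypothetical holder for the fields of one results line. */
    static final class Entry {
        final String qid;
        final String docno;
        final int rank;
        final double score;
        final String label;

        Entry(String qid, String docno, int rank, double score, String label) {
            this.qid = qid;
            this.docno = docno;
            this.rank = rank;
            this.score = score;
            this.label = label;
        }
    }

    /** Parses one line, or returns null if it has fewer than six fields. */
    static Entry parse(String line) {
        String[] p = SPLIT_SPACE_PLUS.split(line.trim(), 6);
        if (p.length < 6) {
            return null;
        }
        // p[1] is the iteration column (conventionally "Q0") and is ignored.
        return new Entry(p[0], p[2], Integer.parseInt(p[3]),
                Double.parseDouble(p[4]), p[5]);
    }

    public static void main(String[] args) {
        Entry e = parse("401 Q0 DOC-0001 0 12.5 myrun");
        System.out.println(e.qid + " " + e.docno + " " + e.rank
                + " " + e.score + " " + e.label);
    }
}
```

    Returning null for short lines keeps the sketch simple; a caller could equally skip or log such lines.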

    Properties:

    • matching.trecresults.file - the path to the TREC results file.
    • matching.trecresults.format - the input format used to parse document identifiers. Defaults to DOCNO, which assumes that each docno is a reverse lookup key in the MetaIndex. If DOCID is specified, the docnos are assumed to be Terrier's docids, as generated by TRECDocidOutputFormat.
    • matching.trecresults.scores - whether scores should be parsed. Defaults to true.
    • matching.trecresults.length - the maximum number of documents per query. Defaults to 1000. Note that setting this property to 0 may slow down the retrieval process for large collections, as a result set of the size of the collection will be allocated in memory.
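
    The defaults listed above can be illustrated with a small stand-alone sketch. It uses java.util.Properties in place of Terrier's own configuration mechanism and a plain Map in place of the MetaIndex reverse lookup; the class TrecResultsConfigSketch and its methods are hypothetical, and the handling of a length of 0 is an assumption drawn from the note on matching.trecresults.length.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

final class TrecResultsConfigSketch {
    /** Mirrors the two documented values of matching.trecresults.format. */
    enum InputFormat { DOCNO, DOCID }

    final String file;
    final InputFormat format;
    final boolean parseScores;
    final int maxResults;

    TrecResultsConfigSketch(Properties props) {
        // No default is documented for the file property, so it may be null here.
        file = props.getProperty("matching.trecresults.file");
        format = InputFormat.valueOf(
                props.getProperty("matching.trecresults.format", "DOCNO")
                        .toUpperCase(Locale.ROOT));
        parseScores = Boolean.parseBoolean(
                props.getProperty("matching.trecresults.scores", "true"));
        maxResults = Integer.parseInt(
                props.getProperty("matching.trecresults.length", "1000"));
    }

    /** Resolves the docno column to an internal docid, or -1 if it is unknown. */
    int resolve(String docno, Map<String, Integer> reverseLookup) {
        if (format == InputFormat.DOCID) {
            return Integer.parseInt(docno); // the column already holds a docid
        }
        return reverseLookup.getOrDefault(docno, -1);
    }

    /** Result set size to allocate; 0 is assumed to mean the whole collection. */
    int resultSetSize(int numberOfDocuments) {
        return maxResults == 0 ? numberOfDocuments : maxResults;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("matching.trecresults.file", "/path/to/run.res"); // hypothetical path
        props.setProperty("matching.trecresults.format", "DOCID");
        TrecResultsConfigSketch cfg = new TrecResultsConfigSketch(props);
        System.out.println(cfg.format + " " + cfg.parseScores + " " + cfg.maxResults);
    }
}
```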
    Author:
    Craig Macdonald, Rodrygo Santos
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected CollectionStatistics collStats
      The underlying collection statistics.
      protected int docid
      The most recently read document identifier.
      protected static java.lang.String DSMNS
      The default namespace for document score modifiers.
      protected java.util.List<DocumentScoreModifier> dsms
      The list of document score modifiers to be applied.
      protected java.lang.String filename
      The TREC results filename.
      protected TRECResultsMatching.InputFormat format
      The input format to use when parsing document identifiers.
      protected boolean found
      Whether the current query was found in the results file.
      protected Index index
      The underlying index.
      protected org.slf4j.Logger logger
      This object's logger.
      protected int maxResults
      The maximum number of results to read per query.
      protected boolean parseScores
      Whether document scores should be parsed from the results file.
      protected java.lang.String qid
      The current query id.
      protected java.io.BufferedReader reader
      The TREC results file reader.
      protected boolean reset
      Whether the current file has already been reset.
      protected ResultSet rs
      The result set for a query.
      protected double score
      The most recently read score.
      protected static java.util.regex.Pattern SPLIT_SPACE_PLUS  
    • Constructor Summary

      Constructors 
      Constructor Description
      TRECResultsMatching​(Index _index)
      Constructs an instance of TRECResultsMatching given an index.
      TRECResultsMatching​(Index _index, java.lang.String _filename)
      Constructs an instance of TRECResultsMatching.
      TRECResultsMatching​(Index _index, java.lang.String _filename, java.lang.String defDSMs)
      Constructs an instance of TRECResultsMatching.
    • Field Detail

      • SPLIT_SPACE_PLUS

        protected static final java.util.regex.Pattern SPLIT_SPACE_PLUS
      • index

        protected Index index
        The underlying index.
      • DSMNS

        protected static final java.lang.String DSMNS
        The default namespace for document score modifiers.
        See Also:
        Constant Field Values
      • dsms

        protected java.util.List<DocumentScoreModifier> dsms
        The list of document score modifiers to be applied.
      • filename

        protected java.lang.String filename
        The TREC results filename.
      • reader

        protected java.io.BufferedReader reader
        The TREC results file reader.
      • parseScores

        protected final boolean parseScores
        Whether document scores should be parsed from the results file.
      • maxResults

        protected final int maxResults
        The maximum number of results to read per query.
      • qid

        protected java.lang.String qid
        The current query id.
      • rs

        protected ResultSet rs
        The result set for a query.
      • docid

        protected int docid
        The most recently read document identifier.
      • score

        protected double score
        The most recently read score.
      • found

        protected boolean found
        Whether the current query was found in the results file.
      • reset

        protected boolean reset
        Whether the current file has already been reset.
      • logger

        protected org.slf4j.Logger logger
        This object's logger.
    • Constructor Detail

      • TRECResultsMatching

        public TRECResultsMatching​(Index _index)
                            throws java.io.IOException
        Constructs an instance of TRECResultsMatching given an index.
        Parameters:
        _index - the underlying index
        Throws:
        java.io.IOException
      • TRECResultsMatching

        public TRECResultsMatching​(Index _index,
                                   java.lang.String _filename)
                            throws java.io.IOException
        Constructs an instance of TRECResultsMatching.
        Parameters:
        _index - the underlying index
        _filename - the TREC results filename
        Throws:
        java.io.IOException
      • TRECResultsMatching

        public TRECResultsMatching​(Index _index,
                                   java.lang.String _filename,
                                   java.lang.String defDSMs)
                            throws java.io.IOException
        Constructs an instance of TRECResultsMatching.
        Parameters:
        _index - the underlying index
        _filename - the TREC results filename
        defDSMs -
        Throws:
        java.io.IOException
    • Method Detail

      • reopen

        protected void reopen()
                       throws java.io.IOException
        Throws:
        java.io.IOException
      • initDSMs

        protected void initDSMs​(java.lang.String defDSMs)
      • getInfo

        public java.lang.String getInfo()
        Description copied from interface: Matching
        Returns a human-readable description of this Matching class.
        Specified by:
        getInfo in interface Matching
      • getDocid

        protected int getDocid​(java.lang.String docno)
                        throws java.io.IOException
        Throws:
        java.io.IOException
      • read

        protected boolean read​(java.lang.String _qid)
                        throws java.io.IOException
        Throws:
        java.io.IOException
      • checkValid

        protected boolean checkValid()
      • match

        public ResultSet match​(java.lang.String _qid,
                               MatchingQueryTerms mqt)
                        throws java.io.IOException
        Description copied from interface: Matching
        Get a ResultSet for the given query terms.
        Specified by:
        match in interface Matching
        Parameters:
        _qid - the ID of the query
        mqt - the query terms to match
        Returns:
        ResultSet - the matched results
        Throws:
        java.io.IOException - if a problem occurs during matching
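
        As a rough illustration of retrieving results from a run file rather than an index, the sketch below collects the docnos listed for one query id from lines in trec_eval column order, capped at a maximum count. It is not Terrier's implementation: the class RunFileScanSketch, the method docnosFor, the assumption that each query's lines are contiguous, and the treatment of a cap of 0 as "no cap" are all choices made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RunFileScanSketch {
    /**
     * Returns the docnos listed for qid, in file order.
     * At most maxResults are kept; a value of 0 is treated as no cap.
     */
    static List<String> docnosFor(String qid, Iterable<String> lines, int maxResults) {
        List<String> docnos = new ArrayList<>();
        boolean found = false;
        for (String line : lines) {
            String[] p = line.trim().split("\\s+", 6);
            if (p.length < 6) {
                continue; // skip blank or malformed lines
            }
            if (p[0].equals(qid)) {
                found = true;
                if (maxResults == 0 || docnos.size() < maxResults) {
                    docnos.add(p[2]); // third column is the docno
                }
            } else if (found) {
                break; // assumes each query's lines are contiguous
            }
        }
        return docnos;
    }

    public static void main(String[] args) {
        List<String> run = Arrays.asList(
                "1 Q0 A 0 3.0 myrun",
                "1 Q0 B 1 2.0 myrun",
                "2 Q0 C 0 9.0 myrun");
        System.out.println(docnosFor("1", run, 1000));
    }
}
```

        An empty list for an unknown query id plays the role that the found field suggests in the real class: the caller can tell that the query was absent from the file.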
      • setCollectionStatistics

        public void setCollectionStatistics​(CollectionStatistics _collStats)
        Description copied from interface: Matching
        Updates the collection statistics used by this matching instance.
        Specified by:
        setCollectionStatistics in interface Matching
        Parameters:
        _collStats - CollectionStatistics to use during matching
      • getCollectionStatistics

        public CollectionStatistics getCollectionStatistics()
        Returns collection statistics.
        Returns:
        collection statistics
      • initialise

        protected void initialise​(int max)
        Initialises the current result set by allocating memory for max results.
        Parameters:
        max - The maximum number of results to be stored.
      • finalize

        protected void finalize()
                         throws java.lang.Throwable
        Overrides:
        finalize in class java.lang.Object
        Throws:
        java.lang.Throwable