Class BaseMatching

  • All Implemented Interfaces:
    Matching
    Direct Known Subclasses:
    Full, Full

    public abstract class BaseMatching
    extends java.lang.Object
    implements Matching
    Matches documents against a query. Scores are first assigned to documents for each query term and adjusted by the appropriate modifiers; then, if necessary, a series of document score modifiers is applied.
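    The flow described above can be pictured with a small, self-contained sketch. It does not use Terrier's classes: `MatchingFlowSketch`, `ScoreModifier` and the map-based postings are hypothetical stand-ins, and summing per-term scores is an assumption made only for illustration. A limit of 0 is treated as "keep everything", mirroring the documented behaviour of matching.retrieved_set_size.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: stand-in types, not Terrier's Matching, ResultSet or DocumentScoreModifier.
final class MatchingFlowSketch {

    /** Hypothetical stand-in for a document score modifier. */
    interface ScoreModifier {
        void modify(Map<Integer, Double> scores);
    }

    /**
     * @param queryTerms       the terms of the query
     * @param termScores       term -> (docid -> score of that term in that document)
     * @param modifiers        modifiers applied to the accumulated document scores
     * @param retrievedSetSize maximum number of results; 0 keeps all matched documents
     */
    static List<Map.Entry<Integer, Double>> match(
            List<String> queryTerms,
            Map<String, Map<Integer, Double>> termScores,
            List<ScoreModifier> modifiers,
            int retrievedSetSize) {
        // Step 1: accumulate a score per document, one query term at a time.
        // (Summation is an assumption of this sketch.)
        Map<Integer, Double> scores = new HashMap<>();
        for (String term : queryTerms) {
            Map<Integer, Double> perDoc = termScores.get(term);
            if (perDoc == null) {
                continue; // term not present in the (toy) lexicon
            }
            perDoc.forEach((doc, s) -> scores.merge(doc, s, Double::sum));
        }

        // Step 2: apply the document score modifiers in order.
        for (ScoreModifier m : modifiers) {
            m.modify(scores);
        }

        // Step 3: rank by descending score and truncate to the retrieved set size.
        List<Map.Entry<Integer, Double>> ranked = new ArrayList<>(scores.entrySet());
        ranked.sort(Map.Entry.<Integer, Double>comparingByValue().reversed());
        int limit = retrievedSetSize == 0
                ? ranked.size()
                : Math.min(retrievedSetSize, ranked.size());
        return new ArrayList<>(ranked.subList(0, limit));
    }

    public static void main(String[] args) {
        Map<String, Map<Integer, Double>> termScores = Map.of(
                "a", Map.of(1, 1.0, 2, 0.5),
                "b", Map.of(2, 1.0, 3, 0.2));
        // A toy modifier that doubles the score of document 3, if it was matched.
        ScoreModifier boost = s -> s.computeIfPresent(3, (doc, v) -> v * 2);
        System.out.println(match(List.of("a", "b"), termScores, List.of(boost), 2));
    }
}
```

    In the real class, the equivalent of step 2 is driven by the documentModifiers list, and a concrete subclass supplies the actual scoring in match.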

    Properties

    • matching.retrieved_set_size - The maximum number of documents in the final retrieved set. Defaults to 1000; setting the property to 0 returns all matched documents.
    • ignore.low.idf.terms - Whether to ignore query terms with a low IDF. Defaults to true.
    • match.empty.query - Whether an empty query should return all documents. Defaults to false.
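    To make the defaults and the special meaning of 0 concrete, here is a minimal sketch that reads the three properties with plain `java.util.Properties`. The class `MatchingPropertiesSketch` and its methods are hypothetical helpers written for this illustration; they are not part of Terrier and do not show how Terrier itself loads its configuration.

```java
import java.util.Properties;

// Hypothetical helper, not part of Terrier: mirrors the documented defaults only.
final class MatchingPropertiesSketch {

    /** matching.retrieved_set_size, documented default 1000. */
    static int retrievedSetSize(Properties p) {
        return Integer.parseInt(p.getProperty("matching.retrieved_set_size", "1000").trim());
    }

    /** ignore.low.idf.terms, documented default true. */
    static boolean ignoreLowIdfTerms(Properties p) {
        return Boolean.parseBoolean(p.getProperty("ignore.low.idf.terms", "true").trim());
    }

    /** match.empty.query, documented default false. */
    static boolean matchEmptyQuery(Properties p) {
        return Boolean.parseBoolean(p.getProperty("match.empty.query", "false").trim());
    }

    /** Number of results to keep: a configured size of 0 means "all matched documents". */
    static int effectiveSize(int configured, int matched) {
        return configured == 0 ? matched : Math.min(configured, matched);
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("matching.retrieved_set_size", "0");
        // With the size set to 0, all 2500 matched documents are kept.
        System.out.println(effectiveSize(retrievedSetSize(p), 2500)); // prints 2500
    }
}
```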
    Since:
    3.0
    Author:
    Vassilis Plachouras, Craig Macdonald, Nicola Tonellotto
    • Field Detail

      • BASE_MATCHING_TAG

        public static final java.lang.String BASE_MATCHING_TAG
        See Also:
        Constant Field Values
      • logger

        protected static final org.slf4j.Logger logger
        The logger for this class.
      • dsmNamespace

        protected static java.lang.String dsmNamespace
        The default namespace for the document score modifiers that are specified in the properties file.
      • IGNORE_LOW_IDF_TERMS

        protected static boolean IGNORE_LOW_IDF_TERMS
        Whether to ignore query terms with a low IDF. Corresponds to the property ignore.low.idf.terms. Defaults to true. In small corpora, this can cause some query terms to be omitted.
      • MATCH_EMPTY_QUERY

        protected static boolean MATCH_EMPTY_QUERY
        When true, an empty query matches all documents. Corresponds to the property match.empty.query. In this case the documents are not ranked by score; they are returned in the order in which they appear in the document index.
      • index

        protected Index index
        The index used for retrieval.
      • lexicon

        protected Lexicon<java.lang.String> lexicon
        The lexicon used.
      • collectionStatistics

        protected CollectionStatistics collectionStatistics
        The collection statistics used during matching.
      • documentModifiers

        protected java.util.List<DocumentScoreModifier> documentModifiers
        Contains the document score modifiers to be applied for a query.
    • Constructor Detail

      • BaseMatching

        protected BaseMatching()
      • BaseMatching

        public BaseMatching​(Index _index)
        Constructs an instance of BaseMatching.
        Parameters:
        _index - the index used for retrieval
    • Method Detail

      • setCollectionStatistics

        public void setCollectionStatistics​(CollectionStatistics cs)
        Updates the collection statistics used by this matching instance.
        Specified by:
        setCollectionStatistics in interface Matching
        Parameters:
        cs - CollectionStatistics to use during matching
      • getInfo

        public abstract java.lang.String getInfo()
        Returns a human-readable description of this Matching class.
        Specified by:
        getInfo in interface Matching
      • match

        public abstract ResultSet match​(java.lang.String queryNumber,
                                        MatchingQueryTerms queryTerms)
                                 throws java.io.IOException
        Get a ResultSet for the given query terms.
        Specified by:
        match in interface Matching
        Parameters:
        queryNumber - an identifier for the query
        queryTerms - the query terms to match
        Returns:
        ResultSet - the matched results
        Throws:
        java.io.IOException - if a problem occurs during matching