Class BA
- java.lang.Object
  - org.terrier.matching.models.queryexpansion.QueryExpansionModel
    - org.terrier.matching.models.queryexpansion.BA
public class BA extends QueryExpansionModel
This class implements an approximation of the binomial distribution through the Kullback-Leibler divergence to weight query terms for query expansion. The class is named BA, which stands for Binomial Approximation. The weight is

F * D(f, p) + 0.5 * log_2(2 * PI * tf * (1 - f))

where D is the Kullback-Leibler divergence, f is the MLE estimate of the term frequency in the retrieved set (the sample), F is the sample size, and p is the prior of the term.

See Equation (8) on page 365 of the paper: Gianni Amati and Cornelis Joost Van Rijsbergen. 2002. Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM Trans. Inf. Syst. 20, 4 (October 2002), 357-389. DOI=10.1145/582415.582416 http://doi.acm.org/10.1145/582415.582416

The query expansion technique and models are described in: Amati, Giambattista (2003), Probability Models for Information Retrieval based on Divergence from Randomness. PhD thesis, University of Glasgow.

- Author:
- Gianni Amati
-
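The weighting formula in the class description can be sketched as a small standalone program. This is an illustrative sketch, not the Terrier implementation: the class name BinomialApproximationSketch and its method names are hypothetical, D(f, p) is assumed to be the two-outcome Kullback-Leibler divergence in bits, tf is assumed to be the frequency of the term in the sample, and the sample size F is assumed to be the total number of tokens in the top-ranked documents. Edge cases (f or p equal to 0 or 1) are not handled.

```java
// Hypothetical standalone sketch; it does not depend on Terrier.
class BinomialApproximationSketch {
    private static final double LOG_2 = Math.log(2.0);

    /** Base-2 logarithm. */
    static double log2(double x) {
        return Math.log(x) / LOG_2;
    }

    /**
     * Assumed form of D(f, p): the Kullback-Leibler divergence between two
     * Bernoulli distributions, in bits. Requires 0 < f < 1 and 0 < p < 1.
     */
    static double klDivergence(double f, double p) {
        return f * log2(f / p) + (1.0 - f) * log2((1.0 - f) / (1.0 - p));
    }

    /**
     * F * D(f, p) + 0.5 * log_2(2 * PI * tf * (1 - f)).
     *
     * @param tf               frequency of the term in the sample (assumption)
     * @param termFrequency    frequency of the term in the collection
     * @param sampleLength     F, number of tokens in the sample (assumption)
     * @param collectionLength number of tokens in the collection
     */
    static double score(double tf, double termFrequency,
                        double sampleLength, double collectionLength) {
        double f = tf / sampleLength;               // MLE estimate in the sample
        double p = termFrequency / collectionLength; // prior of the term
        return sampleLength * klDivergence(f, p)
                + 0.5 * log2(2.0 * Math.PI * tf * (1.0 - f));
    }

    public static void main(String[] args) {
        // A term seen 20 times in a 5,000-token sample and 1,000 times
        // in a 10,000,000-token collection.
        System.out.println(score(20, 1000, 5000, 10_000_000));
    }
}
```

Under these assumptions, a term that is relatively more frequent in the sample than in the collection has a larger divergence and therefore a larger expansion weight.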
Field Summary
-
Fields inherited from class org.terrier.matching.models.queryexpansion.QueryExpansionModel
averageDocumentLength, collectionLength, documentFrequency, EXPANSION_DOCUMENTS, EXPANSION_TERMS, idf, maxTermFrequency, numberOfDocuments, PARAMETER_FREE, ROCCHIO_BETA, totalDocumentLength
-
-
Constructor Summary
Constructor - Description
BA() - A default constructor.
-
Method Summary
Modifier and Type, Method - Description
- java.lang.String getInfo() - Returns the name of the model.
- double parameterFreeNormaliser() - This method provides the contract for computing the normaliser of parameter-free query expansion.
- double parameterFreeNormaliser(double maxTermFrequency, double collectionLength, double totalDocumentLength) - This method provides the contract for computing the normaliser of parameter-free query expansion.
- double score(double withinDocumentFrequency, double termFrequency) - This method implements the query expansion model.
- double score(double withinDocumentFrequency, double termFrequency, double totalDocumentLength, double collectionLength, double averageDocumentLength) - This method implements the query expansion model.
-
Methods inherited from class org.terrier.matching.models.queryexpansion.QueryExpansionModel
initialise, setAverageDocumentLength, setCollectionLength, setDocumentFrequency, setMaxTermFrequency, setNumberOfDocuments, setTotalDocumentLength
-
-
-
-
Method Detail
-
getInfo
public final java.lang.String getInfo()
Returns the name of the model.
- Specified by:
- getInfo in class QueryExpansionModel
- Returns:
- the name of the model
-
parameterFreeNormaliser
public final double parameterFreeNormaliser()
This method provides the contract for computing the normaliser of parameter-free query expansion.
- Specified by:
- parameterFreeNormaliser in class QueryExpansionModel
- Returns:
- The normaliser.
-
parameterFreeNormaliser
public final double parameterFreeNormaliser(double maxTermFrequency, double collectionLength, double totalDocumentLength)
This method provides the contract for computing the normaliser of parameter-free query expansion.
- Specified by:
- parameterFreeNormaliser in class QueryExpansionModel
- Parameters:
- maxTermFrequency - The maximum of the in-collection term frequency of the terms in the pseudo relevance set.
- collectionLength - The number of tokens in the collection.
- totalDocumentLength - The sum of the lengths of the top-ranked documents.
- Returns:
- The normaliser.
-
score
public final double score(double withinDocumentFrequency, double termFrequency)
This method implements the query expansion model.
- Specified by:
- score in class QueryExpansionModel
- Parameters:
- withinDocumentFrequency - double The term frequency in the X top-retrieved documents.
- termFrequency - double The term frequency in the collection.
- Returns:
- double The query expansion weight using the complete Kullback-Leibler divergence.
-
score
public final double score(double withinDocumentFrequency, double termFrequency, double totalDocumentLength, double collectionLength, double averageDocumentLength)
This method implements the query expansion model.
- Specified by:
- score in class QueryExpansionModel
- Parameters:
- withinDocumentFrequency - double The term frequency in the X top-retrieved documents.
- termFrequency - double The term frequency in the collection.
- totalDocumentLength - double The sum of the lengths of the X top-retrieved documents.
- collectionLength - double The number of tokens in the whole collection.
- averageDocumentLength - double The average document length in the collection.
- Returns:
- double The score returned by the implemented model.
-
-