Package org.terrier.matching.models
Class WeightingModelLibrary
- java.lang.Object
  - org.terrier.matching.models.WeightingModelLibrary
public class WeightingModelLibrary extends java.lang.Object
A library of tf normalizations for weighting models, such as the pivoted length normalization described in Singhal et al., 1996.
- Since:
- 4.0
- Author:
- Francois Rousseau
-
Field Summary
Fields (modifier and type, field, description):
- static double LOG_2_OF_E: The logarithm in base 2 of e, used to change the base of logarithms.
- static double LOG_E_OF_2: The natural logarithm of 2, used to change the base of logarithms.
-
Constructor Summary
Constructors:
- WeightingModelLibrary()
-
Method Summary
Methods (modifier and type, method, description):
- static void checkForFields(CollectionStatistics _cs)
- static double log(double d): Returns the base 2 log of the given double precision number.
- static double log(double d1, double d2): Returns the base 2 log of d1 over d2.
- static double relativeFrequency(double tf, double docLength): Computes relative term frequency.
- static double stirlingPower(double n, double m): This method provides the contract for implementing the Stirling formula for the power series.
- static double tf_concave_k(double tf, double k): Returns a concave tf as described in Robertson and Walker, 1994.
- static double tf_concave_log(double tf): Returns a concave tf as described in Singhal et al., 1999.
- static double tf_cornell(double tf, double s, double dl, double avdl): Returns a concave pivot length normalized tf as described in Singhal et al., 1999.
- static double tf_pivoted(double tf, double slope, double dl, double avdl): Returns a modified tf with pivot length normalization as described in Singhal et al., 1996.
- static double tf_robertson(double tf, double b, double dl, double avdl, double k1): Returns a concave pivot length normalized tf as described in Robertson et al., 1999.
-
Method Detail
-
checkForFields
public static void checkForFields(CollectionStatistics _cs)
-
log
public static double log(double d)
Returns the base 2 log of the given double precision number.
- Parameters:
- d - the number whose base 2 log is computed
- Returns:
- the base 2 log of the given number
-
log
public static double log(double d1, double d2)
Returns the base 2 log of d1 over d2.
- Parameters:
- d1 - the numerator
- d2 - the denominator
- Returns:
- the base 2 log of d1/d2
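Both log overloads amount to a change of base from the natural logarithm. The following is a minimal sketch of that idea, not Terrier's source: the class Log2Sketch and its member names are hypothetical, and multiplying by a precomputed reciprocal of ln 2 is an assumption drawn from the descriptions of LOG_2_OF_E and LOG_E_OF_2.

```java
// Illustrative sketch only; Log2Sketch and its members are hypothetical names,
// not part of Terrier.
final class Log2Sketch {
    // ln(2) and its reciprocal log2(e), playing the roles described for
    // LOG_E_OF_2 and LOG_2_OF_E in the field summary.
    static final double LN_2 = Math.log(2.0);
    static final double LOG2_E = 1.0 / LN_2;

    // Base 2 logarithm via change of base: log2(d) = ln(d) * log2(e).
    static double log2(double d) {
        return Math.log(d) * LOG2_E;
    }

    // Base 2 logarithm of the ratio d1 / d2.
    static double log2(double d1, double d2) {
        return Math.log(d1 / d2) * LOG2_E;
    }

    public static void main(String[] args) {
        System.out.println(log2(8.0));       // close to 3, up to rounding
        System.out.println(log2(8.0, 2.0));  // close to 2, up to rounding
    }
}
```

Results may differ from the exact integer by a rounding error in the last digit, so comparisons should use a tolerance.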
-
tf_pivoted
public static double tf_pivoted(double tf, double slope, double dl, double avdl)
Returns a modified tf with pivot length normalization as described in Singhal et al., 1996. Pivoted document length normalization (SIGIR '96), pages 21-29.
- Parameters:
- tf - the term frequency to modify
- slope - the slope
- dl - the document length
- avdl - the average document length in the collection
- Returns:
- a pivot length normalized tf
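Pivoted length normalization is commonly written as tf / (1 - slope + slope * dl / avdl). The sketch below assumes that form; it is not taken from Terrier's source, and the class and method names are hypothetical.

```java
// Illustrative sketch only; PivotedTfSketch is a hypothetical name.
final class PivotedTfSketch {
    // Assumed form: tf / (1 - slope + slope * dl / avdl).
    // A document of average length (dl == avdl) keeps its tf unchanged;
    // longer documents are penalised, shorter ones are boosted.
    static double tfPivoted(double tf, double slope, double dl, double avdl) {
        return tf / (1.0 - slope + slope * dl / avdl);
    }

    public static void main(String[] args) {
        System.out.println(tfPivoted(3.0, 0.2, 100.0, 100.0)); // average length
        System.out.println(tfPivoted(3.0, 0.2, 200.0, 100.0)); // twice the average
    }
}
```

With slope set to 0 the normalization is switched off, and with slope set to 1 the tf is scaled by avdl / dl.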
-
tf_concave_k
public static double tf_concave_k(double tf, double k)
Returns a concave tf as described in Robertson and Walker, 1994. Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval (SIGIR '94), pages 232-241.
- Parameters:
- tf - the term frequency to modify
- k - the concavity coefficient
- Returns:
- a concave tf
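One common saturating form of this transformation is (k + 1) * tf / (k + tf). The sketch below assumes that form, including the (k + 1) scaling factor; whether the library applies the same scaling is an assumption, and the names are hypothetical.

```java
// Illustrative sketch only; ConcaveKSketch is a hypothetical name.
final class ConcaveKSketch {
    // Assumed form: (k + 1) * tf / (k + tf).
    // Equals 1 at tf == 1 and approaches k + 1 as tf grows, so repeated
    // occurrences of a term contribute less and less.
    static double tfConcaveK(double tf, double k) {
        return (k + 1.0) * tf / (k + tf);
    }

    public static void main(String[] args) {
        for (double tf : new double[] {1.0, 3.0, 10.0, 100.0}) {
            System.out.println(tf + " -> " + tfConcaveK(tf, 1.2));
        }
    }
}
```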
-
tf_concave_log
public static double tf_concave_log(double tf)
Returns a concave tf as described in Singhal et al., 1999. AT&T at TREC-7. In Proceedings of the Seventh Text REtrieval Conference (TREC-7), pages 239-252.
- Parameters:
- tf - the term frequency to modify
- Returns:
- a concave tf
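The double-logarithm damping associated with Singhal et al. is usually written as 1 + ln(1 + ln(tf)). The sketch below assumes that form with natural logarithms; the choice of base and the names are assumptions, not confirmed by this page.

```java
// Illustrative sketch only; ConcaveLogSketch is a hypothetical name.
final class ConcaveLogSketch {
    // Assumed form: 1 + ln(1 + ln(tf)), meaningful for tf >= 1.
    // Equals 1 at tf == 1 and grows very slowly afterwards.
    static double tfConcaveLog(double tf) {
        return 1.0 + Math.log(1.0 + Math.log(tf));
    }

    public static void main(String[] args) {
        for (double tf : new double[] {1.0, 2.0, 10.0, 100.0}) {
            System.out.println(tf + " -> " + tfConcaveLog(tf));
        }
    }
}
```

Under this form, small enough term frequencies below 1 make the inner logarithm's argument non-positive, so callers would need to guard against them.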
-
relativeFrequency
public static final double relativeFrequency(double tf, double docLength)
Computes relative term frequency. When tf == docLength, the method returns 0.99999, because a relative frequency of 1 produces Not a Number (NaN) or negative infinity as scores in the hypergeometric models (DPH, DLH and DLH13).
- Parameters:
- tf - raw term frequency
- docLength - length of the document
- Returns:
- relative term frequency
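The capping behaviour described above can be sketched as follows. This is an illustration rather than the library's code: the names are hypothetical, treating tf > docLength the same way as tf == docLength is an assumption, and ln(1 - f) is used only as an example of an expression that diverges at f = 1.

```java
// Illustrative sketch only; RelativeFrequencySketch is a hypothetical name.
final class RelativeFrequencySketch {
    // Cap value taken from the method description.
    static final double CAP = 0.99999;

    // tf / docLength, capped just below 1 so that downstream logarithms
    // of (1 - f) stay finite. Inputs with tf > docLength are also capped
    // here, which is an assumption of this sketch.
    static double relativeFrequency(double tf, double docLength) {
        return tf < docLength ? tf / docLength : CAP;
    }

    public static void main(String[] args) {
        System.out.println(relativeFrequency(2.0, 10.0));
        System.out.println(relativeFrequency(10.0, 10.0));
        // Without the cap, f == 1 leads to ln(0), which is negative infinity.
        System.out.println(Math.log(1.0 - 1.0));
        // With the cap, the same expression stays finite.
        System.out.println(Math.log(1.0 - relativeFrequency(10.0, 10.0)));
    }
}
```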
-
tf_robertson
public static double tf_robertson(double tf, double b, double dl, double avdl, double k1)
Returns a concave pivot length normalized tf as described in Robertson et al., 1999. Okapi at TREC-7: automatic ad hoc, filtering, VLC and interactive track. In Proceedings of the Seventh Text REtrieval Conference (TREC-7), pages 253-264.
- Parameters:
- tf - the term frequency to modify
- b - the slope
- dl - the document length
- avdl - the average document length in the collection
- k1 - the concavity coefficient
- Returns:
- a concave pivot length normalized tf
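A "concave pivot length normalized tf" can be read as a composition: pivot-normalize the tf first, then apply the concave k transformation. The sketch below assumes that order and the formulas shown in its comments, which yields the familiar BM25 term-frequency component. The names are hypothetical and the composition is an assumption, not a statement about Terrier's implementation.

```java
// Illustrative sketch only; RobertsonTfSketch is a hypothetical name.
final class RobertsonTfSketch {
    // Assumed pivoted normalization: tf / (1 - slope + slope * dl / avdl).
    static double tfPivoted(double tf, double slope, double dl, double avdl) {
        return tf / (1.0 - slope + slope * dl / avdl);
    }

    // Assumed concave transformation: (k + 1) * tf / (k + tf).
    static double tfConcaveK(double tf, double k) {
        return (k + 1.0) * tf / (k + tf);
    }

    // Composition: length-normalize first, then saturate.
    static double tfRobertson(double tf, double b, double dl, double avdl, double k1) {
        return tfConcaveK(tfPivoted(tf, b, dl, avdl), k1);
    }

    // Closed form of the same quantity, as usually written for BM25:
    // (k1 + 1) * tf / (k1 * (1 - b + b * dl / avdl) + tf).
    static double bm25Tf(double tf, double b, double dl, double avdl, double k1) {
        return (k1 + 1.0) * tf / (k1 * (1.0 - b + b * dl / avdl) + tf);
    }

    public static void main(String[] args) {
        System.out.println(tfRobertson(2.0, 0.75, 250.0, 100.0, 1.2));
        System.out.println(bm25Tf(2.0, 0.75, 250.0, 100.0, 1.2));
    }
}
```

The two printed values agree up to rounding, because substituting the normalized tf into the concave form and multiplying through by the length factor gives the closed form.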
-
tf_cornell
public static double tf_cornell(double tf, double s, double dl, double avdl)
Returns a concave pivot length normalized tf as described in Singhal et al., 1999. AT&T at TREC-7. In Proceedings of the Seventh Text REtrieval Conference (TREC-7), pages 239-252.
- Parameters:
- tf - the term frequency to modify
- s - the slope
- dl - the document length
- avdl - the average document length in the collection
- Returns:
- a concave pivot length normalized tf
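One plausible reading of this method is the reverse composition of tf_robertson: damp the tf with the double logarithm first, then divide by the pivoted length factor. The sketch below assumes that order and those formulas; the names are hypothetical and none of this is confirmed by the page itself.

```java
// Illustrative sketch only; CornellTfSketch is a hypothetical name.
final class CornellTfSketch {
    // Assumed concave transformation: 1 + ln(1 + ln(tf)), for tf >= 1.
    static double tfConcaveLog(double tf) {
        return 1.0 + Math.log(1.0 + Math.log(tf));
    }

    // Assumed pivoted normalization: tf / (1 - slope + slope * dl / avdl).
    static double tfPivoted(double tf, double slope, double dl, double avdl) {
        return tf / (1.0 - slope + slope * dl / avdl);
    }

    // Composition: damp first, then length-normalize.
    static double tfCornell(double tf, double s, double dl, double avdl) {
        return tfPivoted(tfConcaveLog(tf), s, dl, avdl);
    }

    public static void main(String[] args) {
        System.out.println(tfCornell(1.0, 0.2, 100.0, 100.0)); // average length
        System.out.println(tfCornell(1.0, 0.2, 200.0, 100.0)); // twice the average
        System.out.println(tfCornell(5.0, 0.2, 200.0, 100.0));
    }
}
```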
-
stirlingPower
public static double stirlingPower(double n, double m)
This method provides the contract for implementing the Stirling formula for the power series.
- Parameters:
- n - The parameter of the Stirling formula.
- m - The parameter of the Stirling formula.
- Returns:
- the approximation of the power series