Class BasicTermStatsLexiconEntry

  • All Implemented Interfaces:
    java.io.Serializable, org.apache.hadoop.io.Writable, EntryStatistics, Pointer

    public class BasicTermStatsLexiconEntry
    extends LexiconEntry
    A LexiconEntry that contains only EntryStatistics.
    Author:
    Craig Macdonald
    See Also:
    Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected int maxtf  
      protected int n_t  
      protected int termId  
      protected int TF  
    • Constructor Summary

      Constructors 
      Constructor Description
      BasicTermStatsLexiconEntry()
      Constructs an instance of the BasicTermStatsLexiconEntry.
      BasicTermStatsLexiconEntry(int _TF, int _n_t, int _termId)
      Constructs an instance of the BasicTermStatsLexiconEntry with the given term frequency, document frequency and term id.
    • Field Detail

      • n_t

        protected int n_t
      • TF

        protected int TF
      • termId

        protected int termId
      • maxtf

        protected int maxtf
    • Constructor Detail

      • BasicTermStatsLexiconEntry

        public BasicTermStatsLexiconEntry()
        Constructs an instance of the BasicTermStatsLexiconEntry.
      • BasicTermStatsLexiconEntry

        public BasicTermStatsLexiconEntry(int _TF,
                                          int _n_t,
                                          int _termId)
        Constructs an instance of the BasicTermStatsLexiconEntry with the given term frequency, document frequency and term id.
        Parameters:
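        _TF - the frequency (total number of occurrences) of the term.
        _n_t - the number of documents that the term occurs in.
        _termId - the id of the term.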
    • Method Detail

      • getDocumentFrequency

        public int getDocumentFrequency()
        Return the number of documents that the term occurs in.
        Returns:
        the number of documents that the term occurs in.
      • getMaxFrequencyInDocuments

        public int getMaxFrequencyInDocuments()
        Description copied from interface: EntryStatistics
        Return the maximum in-document term frequency of the term among all documents the term appears in.
        Returns:
        the maximum in-document term frequency of the term among all documents the term appears in.
      • setMaxFrequencyInDocuments

        public void setMaxFrequencyInDocuments(int max)
        Description copied from interface: EntryStatistics
        Set the maximum in-document term frequency of the term among all documents the term appears in.
        Parameters:
        max - the maximum in-document term frequency of the term among all documents the term appears in.
      • getFrequency

        public int getFrequency()
        Return the frequency (total number of occurrences) of the term.
        Returns:
        the frequency (total number of occurrences) of the entry (term).
      • getTermId

        public int getTermId()
        Return the id of the term.
        Returns:
        the id of the term.
      • setTermId

        public void setTermId(int _termId)
        Set the term ID, the integer representation of the term in the index, e.g. as used in direct index posting structures.
        Specified by:
        setTermId in class LexiconEntry
      • setAll

        public void setAll(int _TF,
                           int _n_t,
                           int _termId)
        Sets the term frequency, document frequency and term id for this term.
      • getNumberOfEntries

        public int getNumberOfEntries()
        Pointer implementation: returns the number of entries in the inverted index. Usually the same as getDocumentFrequency().
        Specified by:
        getNumberOfEntries in interface Pointer
        Overrides:
        getNumberOfEntries in class LexiconEntry
        Returns:
        the number of "things" that this pointer refers to.
      • getOffsetBits

        public byte getOffsetBits()
        Get the number of bits for the offset
      • getOffset

        public long getOffset()
        Get the offset (bytes)
      • setOffset

        public void setOffset(long bytes,
                              byte bits)
        Set the offset in terms of bits and bytes
      • setBitIndexPointer

        public void setBitIndexPointer(BitIndexPointer pointer)
        Sets the bit index pointer to this LexiconEntry
      • setOffset

        public void setOffset(BitFilePosition pos)
        Sets the offset using a BitFilePosition
      • readFields

        public void readFields(java.io.DataInput in)
                        throws java.io.IOException
        Throws:
        java.io.IOException
      • write

        public void write(java.io.DataOutput out)
                   throws java.io.IOException
        Throws:
        java.io.IOException
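        The readFields and write pair above follows the Hadoop Writable pattern of streaming fields through DataInput and DataOutput. A minimal sketch of that pattern is given below, using only java.io. TermStatsRecord is a hypothetical stand-in, not the Terrier class, and the field order it writes is an assumption: this page does not document the actual serialized layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in (not the Terrier class): it only mirrors the four
// protected int fields listed in the Field Summary.
class TermStatsRecord {
    int TF, n_t, termId, maxtf;

    // Assumed field order; the real serialized order is not documented here.
    void write(DataOutput out) throws IOException {
        out.writeInt(TF);
        out.writeInt(n_t);
        out.writeInt(termId);
        out.writeInt(maxtf);
    }

    // Must read the fields back in exactly the order write() emitted them.
    void readFields(DataInput in) throws IOException {
        TF = in.readInt();
        n_t = in.readInt();
        termId = in.readInt();
        maxtf = in.readInt();
    }

    // Round-trips a record through an in-memory byte array.
    static TermStatsRecord roundTrip(TermStatsRecord src) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            src.write(new DataOutputStream(buf));
            TermStatsRecord copy = new TermStatsRecord();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(buf.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

        The point of the sketch is the symmetry: whatever order write uses, readFields has to mirror it, since the stream carries no field names.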
      • add

        public void add(EntryStatistics le)
        Increment the statistics of this object by that of another.
        Parameters:
        le - the other object whose statistics are used to increment the statistics of this object.
      • subtract

        public void subtract(EntryStatistics le)
        Decrement the statistics of this object by that of another.
        Parameters:
        le - the other object whose statistics are used to decrement the statistics of this object.
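        One plausible reading of "increment" and "decrement" for add and subtract is sketched below. TermStats is a hypothetical stand-in, not the Terrier class. The handling of maxtf is an assumption, since this page does not say how the maximum is combined. Summing n_t also assumes the two entries describe disjoint sets of documents.

```java
// Hypothetical stand-in (not the Terrier class) holding the three statistics.
class TermStats {
    int TF;     // total number of occurrences of the term
    int n_t;    // number of documents the term occurs in
    int maxtf;  // maximum in-document term frequency

    TermStats(int tf, int nt, int maxtf) {
        this.TF = tf;
        this.n_t = nt;
        this.maxtf = maxtf;
    }

    // Increment this entry's counts by those of another entry.
    void add(TermStats other) {
        TF += other.TF;
        n_t += other.n_t;
        // Assumption: the combined maximum is the larger of the two maxima.
        maxtf = Math.max(maxtf, other.maxtf);
    }

    // Decrement this entry's counts by those of another entry.
    void subtract(TermStats other) {
        TF -= other.TF;
        n_t -= other.n_t;
        // Assumption: maxtf is left unchanged, because a maximum cannot be
        // recovered by subtraction alone.
    }
}
```

        Under these assumptions, add followed by subtract restores TF and n_t but may leave maxtf higher than it started.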
      • setNumberOfEntries

        public void setNumberOfEntries(int n)
        Update the number of entries in the pointer
        Specified by:
        setNumberOfEntries in interface Pointer
        Overrides:
        setNumberOfEntries in class LexiconEntry
        Parameters:
        n - the number of "things" that the pointer refers to.
      • setPointer

        public void setPointer(Pointer p)
        Update the pointer
        Specified by:
        setPointer in interface Pointer
        Overrides:
        setPointer in class LexiconEntry
        Parameters:
        p - other pointer to update the pointer in this object.
      • setStatistics

        public void setStatistics(int _n_t,
                                  int _TF)
        Description copied from class: LexiconEntry
        Update the document frequency and term frequency
        Specified by:
        setStatistics in class LexiconEntry
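        The signatures on this page take their arguments in different orders: the three-argument constructor and setAll take (_TF, _n_t, _termId), while setStatistics takes (_n_t, _TF). Both parameters are int, so a swapped call compiles without warning. The sketch below illustrates the difference with StatsHolder, a hypothetical stand-in that copies only the parameter orders from the signatures above, not the Terrier implementation.

```java
// Hypothetical stand-in (not the Terrier class); only the parameter orders
// are taken from the signatures documented on this page.
class StatsHolder {
    int TF;
    int n_t;
    int termId;

    // Same order as the three-argument constructor: TF, then n_t, then termId.
    void setAll(int _TF, int _n_t, int _termId) {
        this.TF = _TF;
        this.n_t = _n_t;
        this.termId = _termId;
    }

    // Note the reversed order: document frequency first, then term frequency.
    void setStatistics(int _n_t, int _TF) {
        this.n_t = _n_t;
        this.TF = _TF;
    }
}
```

        A quick sanity check when calling either method is that TF should never be smaller than n_t, because a term occurs at least once in every document it is counted in.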
      • setFrequency

        public void setFrequency(int F)
        Description copied from interface: EntryStatistics
        Set the frequency (total number of occurrences) of the term.
        Parameters:
        F - the frequency (total number of occurrences) of the entry (term).
      • setDocumentFrequency

        public void setDocumentFrequency(int nt)
        Sets the document frequency
        Parameters:
        nt - the number of documents that the term occurs in.