Class NoDuplicatesSinglePassIndexing


  • public class NoDuplicatesSinglePassIndexing
    extends BasicSinglePassIndexer
    Single-pass indexer that performs document deduplication based upon the docno.
    Since:
    4.0
    Author:
    Dyaa Albakour
    • Field Detail

      • seenDocnos

        protected java.util.TreeSet<java.lang.String> seenDocnos
    • Constructor Detail

      • NoDuplicatesSinglePassIndexing

        protected NoDuplicatesSinglePassIndexing​(long a,
                                                 long b,
                                                 long c)
      • NoDuplicatesSinglePassIndexing

        public NoDuplicatesSinglePassIndexing​(java.lang.String pathname,
                                              java.lang.String prefix)
    • Method Detail

      • indexDocument

        protected void indexDocument​(java.util.Map<java.lang.String,​java.lang.String> docProperties,
                                     DocumentPostingList termsInDocument)
                              throws java.lang.Exception
        Adds a document to the direct and document indexes, and its terms to the lexicon. Handled internally by the methods indexFieldDocument and indexNoFieldDocument. This implementation only places content in the in-memory runs, which will eventually be flushed to disk.
        Overrides:
        indexDocument in class BasicSinglePassIndexer
        Parameters:
        docProperties - Map<String,String> properties of the document
        termsInDocument - DocumentPostingList the terms in the document.
        Throws:
        java.lang.Exception
      • indexEmpty

        protected void indexEmpty​(java.util.Map<java.lang.String,​java.lang.String> docProperties)
                           throws java.io.IOException
        Adds an entry to the document index for an empty document, only if IndexEmptyDocuments is set to true.
        Overrides:
        indexEmpty in class Indexer
        Throws:
        java.io.IOException
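
The deduplication described above can be illustrated with a minimal, standalone sketch. It is not the Terrier implementation: the class DocnoDeduplicator and its methods markSeen, filterDuplicates and distinctCount are hypothetical names introduced here. The only element taken from this page is the idea of a TreeSet<String> of docnos already seen, as in the seenDocnos field. The sketch assumes that a document is indexed the first time its docno appears and skipped on every later appearance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical helper, not part of Terrier: remembers which docnos have been seen. */
class DocnoDeduplicator {
    // Mirrors the type of the seenDocnos field documented above.
    private final TreeSet<String> seenDocnos = new TreeSet<>();

    /**
     * Records the docno. Returns true if this is its first occurrence,
     * false if it is a duplicate (Set.add reports whether the set changed).
     */
    boolean markSeen(String docno) {
        return seenDocnos.add(docno);
    }

    /** Keeps only the first occurrence of each docno, preserving input order. */
    List<String> filterDuplicates(List<String> docnos) {
        List<String> kept = new ArrayList<>();
        for (String docno : docnos) {
            if (markSeen(docno)) {
                kept.add(docno);
            }
        }
        return kept;
    }

    /** Number of distinct docnos recorded so far. */
    int distinctCount() {
        return seenDocnos.size();
    }
}
```

In a real indexer the check would presumably sit at the start of the per-document indexing step, so that a duplicate docno never reaches the in-memory runs. A TreeSet keeps the docnos sorted and gives logarithmic-time lookups; a HashSet would serve equally well if ordering is not needed.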