Terrier Core

Lexicon not properly renamed on Windows, multipass indexing

Details

  • Type: Bug
  • Status: Resolved
  • Priority: Major
  • Resolution: Fixed
  • Affects Version/s: 3.0
  • Fix Version/s: 3.5
  • Component/s: .indexing, .structures
  • Description:
    see http://terrier.org/forum//read.php?3,1493

    The problem is that data_1.lexicon.fsomapfile and data_1.tmplexicon.fsomapfile are not properly renamed, probably because a file handle is left open somewhere. An inspection of InvertedIndexBuilder suggests the problem is not there.

Activity

Craig Macdonald added a comment - 29/Apr/10 9:59 AM

Lines 713 and 714 of LexiconBuilder should close lis1, not lis2. However, this method merges lexicons in pairs, whereas normally N lexicons are merged at once (an obscure property controls this). I'm not sure yet that this is the cause of the issue.

Richard McCreadie added a comment - 14/May/10 10:42 AM

Tried fixing the wrong close in the lexicon builder, no joy. The failed move only occurs with inverted indexing (i.e. -i or -i -v). It still fails when only a single lexicon is around to merge.

Richard McCreadie added a comment - 14/May/10 2:49 PM

data.tmplexicon.fsomapfile is closed correctly. However, two read-write handles remain open on data.lexicon.fsomapfile around line 381 of InvertedIndexBuilder. As a result, FSOMapFileLexicon.deleteMapFileLexicon fails to delete data.lexicon.fsomapfile, and FSOMapFileLexicon.renameMapFileLexicon fails as well.

Test case: Windows 7 64bit, trec_terrier.bat -i -v (building from existing direct file)
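
The failure mode described in this comment can be illustrated in isolation. The following is a minimal sketch, not Terrier code: the class and method names (`LexiconRenameSketch`, `readFirstByte`, `replaceLexicon`) are hypothetical, and only the two file names come from this issue. It closes every handle on the target file before moving the temporary file over it. On Windows, a handle left open on the target would typically make such a move fail, whereas most Unix-like systems allow it, which may be why the problem was only reported on Windows.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration; none of these names exist in Terrier.
public class LexiconRenameSketch {

    /** Reads one byte and closes the handle before returning (try-with-resources). */
    static int readFirstByte(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            return raf.read();
        }
    }

    /** Moves tmp over target; callers must have closed all handles on both files. */
    static void replaceLexicon(Path tmp, Path target) throws IOException {
        Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lexicon-sketch");
        Path target = dir.resolve("data.lexicon.fsomapfile");
        Path tmp = dir.resolve("data.tmplexicon.fsomapfile");
        Files.write(target, new byte[] {1});
        Files.write(tmp, new byte[] {2});

        readFirstByte(target);        // the handle is released when this returns
        replaceLexicon(tmp, target);  // nothing holds the target open any more
        System.out.println("first byte after move: " + readFirstByte(target));
    }
}
```

If `readFirstByte` kept its `RandomAccessFile` open instead of closing it, the subsequent move is the step that would be expected to fail on Windows.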

Craig Macdonald added a comment - 14/May/10 4:58 PM

Found the problem. On line 188 of InvertedIndexBuilder.java, replace

int numberOfUniqueTerms = index.getLexicon().numberOfEntries();

with

int numberOfUniqueTerms = index.getCollectionStatistics().getNumberOfUniqueTerms();
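
The comment does not say why this change helps. One plausible reading, stated here as an assumption, is that asking the index for its lexicon opens data.lexicon.fsomapfile and leaves the handle open, while the collection statistics already hold the term count. The sketch below shows that pattern with a hypothetical class (`HandleLeakSketch` and all of its methods are invented for illustration and are not Terrier's API).

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for an index; not Terrier's Index class.
public class HandleLeakSketch {
    private final Path lexiconFile;
    private final int uniqueTerms;         // count recorded when the index was built
    private RandomAccessFile openLexicon;  // stays null until the lexicon is requested

    HandleLeakSketch(Path lexiconFile, int uniqueTerms) {
        this.lexiconFile = lexiconFile;
        this.uniqueTerms = uniqueTerms;
    }

    /** Opens the lexicon file on first use and keeps the handle open afterwards. */
    RandomAccessFile getLexicon() throws IOException {
        if (openLexicon == null) {
            openLexicon = new RandomAccessFile(lexiconFile.toFile(), "r");
        }
        return openLexicon;
    }

    /** Answers from the stored statistic without touching any file. */
    int getNumberOfUniqueTerms() {
        return uniqueTerms;
    }

    boolean lexiconIsOpen() {
        return openLexicon != null;
    }

    void close() throws IOException {
        if (openLexicon != null) {
            openLexicon.close();
            openLexicon = null;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("data.lexicon", ".fsomapfile");
        HandleLeakSketch index = new HandleLeakSketch(file, 3);
        System.out.println(index.getNumberOfUniqueTerms() + " terms, lexicon open: " + index.lexiconIsOpen());
        index.getLexicon();
        System.out.println("after getLexicon(), lexicon open: " + index.lexiconIsOpen());
        index.close();
    }
}
```

Under this assumption, reading the count from the statistics means no handle on the lexicon file exists when it is later deleted or renamed.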
Craig Macdonald added a comment - 14/May/10 4:58 PM

Fix committed to trunk.


Dates

  • Created: 29/Apr/10 9:41 AM
  • Updated: 14/May/10 4:58 PM
  • Resolved: 14/May/10 4:58 PM