Terrier Core

2Way StructureMerger - produces too large termids

Details

  • Type: Bug
  • Status: Resolved
  • Priority: Major
  • Resolution: Fixed
  • Affects Version/s: 3.0
  • Fix Version/s: 3.0
  • Component/s: .structures
  • Description:
    junit.framework.AssertionFailedError: Got too big a termid (3867) from direct index input stream, numTerms=2361
    at junit.framework.Assert.fail(Assert.java:47)
    at junit.framework.Assert.assertTrue(Assert.java:20)
    at uk.ac.gla.terrier.tests.ShakespeareEndToEndTest.checkDirectIndex(ShakespeareEndToEndTest.java:219)
    at uk.ac.gla.terrier.tests.ShakespeareEndToEndTest.checkIndex(ShakespeareEndToEndTest.java:285)
    at uk.ac.gla.terrier.tests.BatchEndToEndTest.doTrecTerrierIndexingRunAndEvaluate(BatchEndToEndTest.java:157)
    at uk.ac.gla.terrier.tests.BasicShakespeareEndToEndTest.testBasicClassical(BasicShakespeareEndToEndTest.java:19)

Activity

Craig Macdonald added a comment - 09/Sep/09 9:43 AM

The problem is that the inverted merging phase produces old-termid -> new-termid mappings, which are then used when the direct index is merged. However, the new termids may not have the same ordering as the old termids, so postings from the second direct file need to be reordered by new termid when being written to the first direct file.
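
The reordering described above can be illustrated with a minimal sketch. This is not the code committed to trunk: the class `TermIdRemapSketch`, the method `remapAndSort`, and the representation of postings as `{termid, frequency}` pairs are all hypothetical, chosen only to show why remapped postings must be re-sorted before being written.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical illustration only; not Terrier's actual StructureMerger code. */
public class TermIdRemapSketch {

    /**
     * Remaps one document's direct-index postings from old to new termids,
     * then re-sorts them so the termids are ascending again.
     *
     * @param postings {oldTermId, frequency} pairs, ascending by old termid (assumed layout)
     * @param oldToNew oldToNew[oldTermId] is the termid in the merged lexicon (assumed layout)
     * @return {newTermId, frequency} pairs, ascending by new termid
     */
    public static int[][] remapAndSort(int[][] postings, int[] oldToNew) {
        int[][] remapped = new int[postings.length][];
        for (int i = 0; i < postings.length; i++) {
            // Translate the termid; the frequency is carried over unchanged.
            remapped[i] = new int[] { oldToNew[postings[i][0]], postings[i][1] };
        }
        // The mapping need not preserve order, so restore ascending termid order.
        Arrays.sort(remapped, Comparator.comparingInt(p -> p[0]));
        return remapped;
    }

    public static void main(String[] args) {
        // Old termids 0, 1, 2 map to 5, 2, 9: the relative order changes.
        int[] oldToNew = { 5, 2, 9 };
        int[][] postings = { { 0, 3 }, { 1, 1 }, { 2, 4 } };
        System.out.println(Arrays.deepToString(remapAndSort(postings, oldToNew)));
        // prints [[2, 1], [5, 3], [9, 4]]
    }
}
```

Without the sort step, the postings in this example would be written in the order 5, 2, 9, which is not ascending. A reader that expects ascending termids could then reconstruct wrong values, which would be consistent with the assertion failure in the description.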

Craig Macdonald added a comment - 09/Sep/09 9:44 AM - edited

Fix committed to trunk. Very easy once the penny has dropped!

Craig Macdonald added a comment - 09/Sep/09 9:46 AM

Meant to add that this issue is checked by the end-to-end tests.


Dates

  • Created: 07/Sep/09 7:40 PM
  • Updated: 05/Mar/10 5:03 PM
  • Resolved: 09/Sep/09 9:44 AM