Terrier Core

MapReduce InputFormat for BitPostingIndexInputStream

Details

  • Type: Improvement
  • Status: Resolved
  • Priority: Major
  • Resolution: Fixed
  • Affects Version/s: 3.0
  • Fix Version/s: 3.0
  • Component/s: .structures
  • Description:
    Recent core changes introduced generic PostingIndex and PostingIndexInputStream objects, which give access to IterablePostings. It would be useful to have an InputFormat that splits the reading of a PostingIndex across multiple map tasks.
  1. BitPostingIndexInputFormat.v1.patch
    (13 kB)
    Craig Macdonald
    07/May/09 8:16 PM
  2. BitPostingIndexInputFormat.v2.patch
    (15 kB)
    Craig Macdonald
    08/May/09 10:00 AM

Activity

Craig Macdonald added a comment - 07/May/09 8:15 PM

This would have benefits in the following scenarios:

  • If we use PostingIndex as a LinkServer, then this would allow link analysis index processing to be easily split
  • Inversion of indices (i.e. Inverted->Direct; Direct->Inverted) could be split to run in parallel
  • DirectIndex analysis.

As a PostingIndex can be large, locality should be supported.
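
As a rough illustration of the locality point, one way to split a posting file is to group consecutive entries by the file-system block in which their postings start, so that each map task can be scheduled near the block it reads. The sketch below is not the code in the attached patches and uses no Terrier or Hadoop API. The class `PostingSplitSketch`, the `Split` type, and the `entryOffsets` array (start byte offset of each entry, ascending) are hypothetical names introduced here for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class PostingSplitSketch {

    /** Hypothetical split: entries firstEntry..lastEntry (inclusive) whose postings start in one block. */
    public static final class Split {
        public final int firstEntry;
        public final int lastEntry;
        public final long block;

        Split(int firstEntry, int lastEntry, long block) {
            this.firstEntry = firstEntry;
            this.lastEntry = lastEntry;
            this.block = block;
        }
    }

    /**
     * Groups consecutive entries into one split per block.
     * entryOffsets holds the start byte offset of each entry, in ascending order (assumption).
     */
    public static List<Split> computeSplits(long[] entryOffsets, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        if (entryOffsets.length == 0) {
            return splits;
        }
        int first = 0;
        long currentBlock = entryOffsets[0] / blockSize;
        for (int i = 1; i < entryOffsets.length; i++) {
            long block = entryOffsets[i] / blockSize;
            if (block != currentBlock) {
                // The previous run of entries ends here; start a new split.
                splits.add(new Split(first, i - 1, currentBlock));
                first = i;
                currentBlock = block;
            }
        }
        // Close the final run of entries.
        splits.add(new Split(first, entryOffsets.length - 1, currentBlock));
        return splits;
    }
}
```

In a real InputFormat, the block number of each split would be translated into host hints obtained from the file system, and a record reader would iterate over the entries in the split's range. Both steps are left out of this sketch.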

Craig Macdonald added a comment - 07/May/09 8:16 PM

Initial version, untested.

Craig Macdonald added a comment - 08/May/09 10:00 AM

Updated version - some files were missing from the patch.

Craig Macdonald added a comment - 16/Jul/09 7:13 PM

Committed a final version to trunk.

Dates

  • Created: 07/May/09 8:14 PM
  • Updated: 05/Mar/10 4:47 PM
  • Resolved: 16/Jul/09 7:13 PM