Terrier Core

BitPostingIndexInputFormat tries to use negative offsets

Details

  • Type: Bug
  • Status: Resolved
  • Priority: Blocker
  • Resolution: Fixed
  • Affects Version/s: 3.0
  • Fix Version/s: 3.0
  • Component/s: .structures
  • Description:
    INFO - Calculating splits of structure inverted
    INFO - File 0 approx splits=37.04791417717934
    INFO - File 1 approx splits=11.056029200553894
    INFO - File 2 approx splits=19.52261757850647
    INFO - File 3 approx splits=11.2651377171278
    INFO - File 4 approx splits=8.67573507130146
    INFO - File 5 approx splits=9.554178163409233
    INFO - File 6 approx splits=6.9783875644207
    INFO - File 7 approx splits=8.02258075773716
    INFO - File 8 approx splits=7.250798925757408
    INFO - File 9 approx splits=3.25227153301239
    INFO - File 10 approx splits=3.524603247642517
    INFO - File 11 approx splits=8.75064267218113
    INFO - File 12 approx splits=12.29665707051754
    INFO - File 13 approx splits=6.377697631716728
    INFO - File 14 approx splits=4.812018930912018
    INFO - File 15 approx splits=15.15673378109932
    INFO - File 16 approx splits=1.0457783639431
    INFO - File 17 approx splits=10.512042790651321
    INFO - File 18 approx splits=21.401717394590378
    INFO - File 19 approx splits=10.833241820335388
    INFO - File 20 approx splits=3.486497238278389
    INFO - File 21 approx splits=4.725020796060562
    INFO - File 22 approx splits=7.931180149316788
    INFO - File 23 approx splits=0.6753774285316467
    INFO - File 24 approx splits=1.1868641674518585
    INFO - File 25 approx splits=0.801939070224762
    Exception in thread "main" org.apache.hadoop.ipc.RemoteException: java.io.IOException: Negative offset is not supported. File: /Indices/ClueWeb09/TREC-B/classical/data.inverted.bf25
            at org.apache.hadoop.dfs.FSNamesystem.getBlockLocations(FSNamesystem.java:722)
            at org.apache.hadoop.dfs.FSNamesystem.getBlockLocations(FSNamesystem.java:703)
            at org.apache.hadoop.dfs.NameNode.getBlockLocations(NameNode.java:257)
            at sun.reflect.GeneratedMethodAccessor108.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            at java.lang.reflect.Method.invoke(Method.java:597)
            at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
            at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)

            at org.apache.hadoop.ipc.Client.call(Client.java:715)
            at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
            at org.apache.hadoop.dfs.$Proxy0.getBlockLocations(Unknown Source)
            at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            at java.lang.reflect.Method.invoke(Method.java:597)
            at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
            at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
            at org.apache.hadoop.dfs.$Proxy0.getBlockLocations(Unknown Source)
            at org.apache.hadoop.dfs.DFSClient.callGetBlockLocations(DFSClient.java:297)
            at org.apache.hadoop.dfs.DFSClient.getBlockLocations(DFSClient.java:318)
            at org.apache.hadoop.dfs.DistributedFileSystem.getFileBlockLocations(DistributedFileSystem.java:137)
            at org.terrier.structures.indexing.singlepass.hadoop.BitPostingIndexInputFormat.getSplits(BitPostingIndexInputFormat.java:233)
            at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
            at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1026)
            at org.terrier.structures.indexing.singlepass.hadoop.Inv2DirectMultiReduce$Inv2DirectMultiReduceJob.runJob(Inv2DirectMultiReduce.java:166)
            at org.terrier.structures.indexing.singlepass.hadoop.Inv2DirectMultiReduce.invertStructure(Inv2DirectMultiReduce.java:338)
            at org.terrier.structures.indexing.singlepass.hadoop.Inv2DirectMultiReduce.main(Inv2DirectMultiReduce.java:282)

Activity

Craig Macdonald added a comment - 02/Mar/10 1:10 PM - edited

This issue was introduced by an earlier private issue. I have completely reworked the algorithm that creates the splits. Empirical evidence suggests it now works as expected.
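The reworked split algorithm is not shown in this issue, so the following is only an illustrative sketch, not Terrier's code. It assumes that "approx splits" in the log is roughly the file length divided by a target split size, and it shows one way to walk a file forward from offset 0 so that no split can start at a negative offset, even when the file is smaller than one split (as with files 23 and 25 above). The class `SplitSketch`, the `Range` holder and the method `computeSplits` are hypothetical names. The sketch also ignores posting-entry boundaries, which a real bit posting index split would have to respect.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {

    /** Hypothetical holder for one split: a byte offset and a length within a file. */
    static final class Range {
        final long offset;
        final long length;

        Range(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Divides a file of fileLength bytes into consecutive ranges of at most
     * targetSplitSize bytes. Offsets start at 0 and only ever increase, so a
     * negative offset can never be produced.
     */
    static List<Range> computeSplits(long fileLength, long targetSplitSize) {
        if (fileLength < 0 || targetSplitSize <= 0) {
            throw new IllegalArgumentException(
                "fileLength must be >= 0 and targetSplitSize must be > 0");
        }
        List<Range> splits = new ArrayList<>();
        long offset = 0;
        while (offset < fileLength) {
            // The final range is shortened so it never runs past the end of the file.
            long length = Math.min(targetSplitSize, fileLength - offset);
            splits.add(new Range(offset, length));
            offset += length;
        }
        return splits;
    }

    public static void main(String[] args) {
        // A file smaller than one split still yields a single range starting at 0.
        for (Range r : computeSplits(80L, 100L)) {
            System.out.println("offset=" + r.offset + " length=" + r.length);
        }
    }
}
```

Validating the inputs up front means a bad length or split size fails on the client with a clear message instead of surfacing later as a remote exception from the NameNode.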


Dates

  • Created: 25/Feb/10 7:05 PM
  • Updated: 05/Mar/10 5:37 PM
  • Resolved: 02/Mar/10 1:10 PM