Terrier Core

Remove PonteCroft language modelling

Details

  • Description:
    The PonteCroft language modelling approach is supported in Terrier, but its use requires the creation of additional index structures. The model is seldom used, either by ourselves or by the language modelling community. Terrier already supports Hiemstra's LM, and Dirichlet LM is available in the common package.

    It is believed that the framework is operational at present. However, it does not have any unit tests.

    The purpose of this issue is to discuss whether this package is strategic enough to remain in Terrier long term, or whether it should be removed.

    There are three options relating to the framework:
     a. Remove it completely.
     b. Move it to the common package (where it may stagnate).
     c. Keep it.

    A prerequisite for b and c is that we add some method of testing that it is functional.

    Please discuss.

Activity

Iadh Ounis added a comment - 16/Sep/09 1:36 PM

I agree that the Ponte-Croft model is hardly used. We never really used it, but more importantly it is hardly used in recent language modelling papers. In fact, the Hiemstra model is much more effective, and is more suitable as a QL baseline. Therefore, I agree that the presence of the Ponte-Croft model in the Terrier core is not really needed.

I'm however more inclined to move it from the core to a common package (where it can peacefully die --hummm, I meant stagnate), i.e. I vote for option (b) above. We never know: we might need it for something one day.

I agree that we need unit testing for it though.

Craig Macdonald added a comment - 29/Jan/10 3:58 PM

Resolved. (Though the common version doesn't actually work.)


Dates

  • Created:
    15/Sep/09 11:12 PM
    Updated:
    05/Mar/10 5:05 PM
    Resolved:
    29/Jan/10 3:58 PM