Package org.terrier.querying
Class Decorate
- java.lang.Object
-
- org.terrier.querying.Decorate
-
- All Implemented Interfaces:
PostFilter,Process
public class Decorate extends java.lang.Object implements Process, PostFilter
This class decorates a result set with metadata. The metadata can have query terms highlighted, can have a query-biased summary created from it, and can be escaped for display in another format.
Controls:
- summaries - comma or semicolon delimited list of the key names for which a query-biased summary should be created, e.g. summaries:snippet
- emphasis - comma or semicolon delimited list of the key names in which occurrences of the query terms should be emboldened, e.g. emphasis:title;snippet
- earlyDecorate - comma or semicolon delimited list of the key names that should be decorated early, e.g. to support another PostProcess using them.
- escape - comma or semicolon delimited list of the key names that should be escaped e.g. escape:title;snippet;url. Currently, per-key type escaping is not supported. The default escape type is defined using the property decorate.escape.
- decorate.escape - default escape type for metadata. Default is HTML. Possible escape types include XML, JAVASCRIPT, and URL. See utility.StringTools.ESCAPE
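The control values above are lists of key names separated by commas or semicolons. The following is a minimal sketch of how such a value could be split into key names. The class and method names (ControlValueSketch, parseKeys) are hypothetical and are not part of Terrier; the actual parsing in Decorate may differ.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Terrier: splits a control value such as
// "title;snippet,url" into its key names.
final class ControlValueSketch {

    // Assumption: commas and semicolons are the only delimiters, and
    // whitespace around a key name is not significant.
    static Set<String> parseKeys(String controlValue) {
        Set<String> keys = new LinkedHashSet<>(); // keeps the order given in the control
        if (controlValue == null) {
            return keys;
        }
        for (String part : controlValue.trim().split("\\s*[,;]\\s*")) {
            if (!part.isEmpty()) {
                keys.add(part);
            }
        }
        return keys;
    }
}
```

Under these assumptions, a control such as escape:title;snippet;url would yield the three key names title, snippet and url.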
- Since:
- 3.0
- Author:
- Craig Macdonald, Vassilis Plachouras, Ben He
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- protected static java.util.regex.Pattern cleanQuery
- protected static java.lang.String[] CONTROL_VALUE_DELIMS: delimiters for breaking down the values of controls further
- protected static java.util.regex.Pattern controlNonVisibleCharacters
- protected java.util.regex.Matcher controlNonVisibleCharactersMatcher
- protected static StringTools.ESCAPE defaultEscape: what is the default escape sequence
- protected java.util.Set<java.lang.String> earlyKeys
- protected java.util.Set<java.lang.String> emphasisKeys
- protected java.util.Map<java.lang.String,StringTools.ESCAPE> escapeKeys
- protected java.util.regex.Pattern highlight: highlighting pattern for the current query
- protected gnu.trove.TObjectIntHashMap<java.lang.String> keys
- protected LRUMap<java.lang.Integer,java.lang.String[]> metaCache: The cache used for the meta data.
- protected MetaIndex metaIndex: The meta index server.
- protected java.lang.String[] metaKeys
- protected java.lang.String[] qTerms: query terms of the current query
- protected Summariser summariser
- protected java.util.Set<java.lang.String> summaryKeys
-
Fields inherited from interface org.terrier.querying.PostFilter
FILTER_ADJUSTED, FILTER_OK, FILTER_REMOVE
-
-
Constructor Summary
Constructors (Constructor, Description):
- Decorate()
-
Method Summary
All Methods, Instance Methods, Concrete Methods (Modifier and Type, Method, Description):
- protected boolean checkControl(java.lang.String control_name, SearchRequest srq)
- byte filter(Manager m, SearchRequest q, ResultSet rs, int rank, int docid): Called for each result in the resultset, used to filter out unwanted results.
- protected java.util.regex.Pattern generateEmphasisPattern(java.lang.String[] _qTerms): Creates a regular expression pattern to highlight query terms metadata.
- java.lang.String getInfo(): Returns the name of the post processor.
- protected java.lang.String[] getMetadata(java.lang.String[] metaKeys, int docid)
- protected java.lang.String[][] getMetadata(java.lang.String[] metaKeys, int[] docids)
- void new_query(Manager m, SearchRequest q, ResultSet rs): Called before the processing of a resultset using this PostFilter is applied.
- void process(Manager manager, Request q): decoration at the postprocess stage.
-
-
-
Field Detail
-
CONTROL_VALUE_DELIMS
protected static final java.lang.String[] CONTROL_VALUE_DELIMS
delimiters for breaking down the values of controls further
-
metaCache
protected LRUMap<java.lang.Integer,java.lang.String[]> metaCache
The cache used for the meta data. Implements a Least-Recently-Used policy for retaining the most recently accessed metadata.
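As an illustration of the Least-Recently-Used policy described above, the following sketch builds a small LRU map on top of java.util.LinkedHashMap. It is a stand-in only: the class name LruSketch is hypothetical, and Terrier's own LRUMap may be implemented differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an LRU cache; not Terrier's LRUMap.
final class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    LruSketch(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently accessed
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; evict the least recently used entry once over capacity.
        return size() > maxEntries;
    }
}
```

With a capacity of two, putting docids 1 and 2, reading docid 1, and then putting docid 3 evicts docid 2, because docid 1 was accessed more recently.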
-
metaIndex
protected MetaIndex metaIndex
The meta index server. It is provided by the manager.
-
controlNonVisibleCharacters
protected static final java.util.regex.Pattern controlNonVisibleCharacters
-
defaultEscape
protected static final StringTools.ESCAPE defaultEscape
what is the default escape sequence
-
controlNonVisibleCharactersMatcher
protected java.util.regex.Matcher controlNonVisibleCharactersMatcher
-
cleanQuery
protected static final java.util.regex.Pattern cleanQuery
-
highlight
protected java.util.regex.Pattern highlight
highlighting pattern for the current query
-
qTerms
protected java.lang.String[] qTerms
query terms of the current query
-
keys
protected gnu.trove.TObjectIntHashMap<java.lang.String> keys
-
summaryKeys
protected java.util.Set<java.lang.String> summaryKeys
-
emphasisKeys
protected java.util.Set<java.lang.String> emphasisKeys
-
escapeKeys
protected java.util.Map<java.lang.String,StringTools.ESCAPE> escapeKeys
-
earlyKeys
protected java.util.Set<java.lang.String> earlyKeys
-
summariser
protected Summariser summariser
-
metaKeys
protected java.lang.String[] metaKeys
-
-
Method Detail
-
new_query
public void new_query(Manager m, SearchRequest q, ResultSet rs)
Called before the processing of a resultset using this PostFilter is applied. Can be used to save information for the duration of the query.
- Specified by:
new_query in interface PostFilter
- Parameters:
m - The manager controlling this query
q - The search request being processed
rs - the resultset that is being iterated through
-
filter
public byte filter(Manager m, SearchRequest q, ResultSet rs, int rank, int docid)
Called for each result in the resultset, used to filter out unwanted results.
- Specified by:
filter in interface PostFilter
- Parameters:
m - The manager controlling this query
q - The search request being processed
rank - the array index (rank) in the resultset that has been reached
docid - The docid of the result currently being processed.
-
process
public void process(Manager manager, Request q)
Decoration at the postprocess stage. Results are only decorated here if this is required by a later PostFilter or PostProcess.
-
getMetadata
protected java.lang.String[] getMetadata(java.lang.String[] metaKeys, int docid)
-
getMetadata
protected java.lang.String[][] getMetadata(java.lang.String[] metaKeys, int[] docids)
-
generateEmphasisPattern
protected java.util.regex.Pattern generateEmphasisPattern(java.lang.String[] _qTerms)
Creates a regular expression pattern to highlight query terms in metadata.
- Parameters:
_qTerms - query terms
- Returns:
- Pattern to apply
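The following sketch shows one way a highlighting pattern could be built from query terms and applied to a metadata value. It is an assumption-laden illustration: the class EmphasisSketch, the whole-word, case-insensitive matching, and the <b> markup are all hypothetical, and the pattern that Decorate actually generates may differ.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; not the pattern generated by Decorate.
final class EmphasisSketch {

    // Builds a case-insensitive pattern matching any query term as a whole word.
    static Pattern generate(String[] terms) {
        if (terms == null || terms.length == 0) {
            throw new IllegalArgumentException("at least one query term is required");
        }
        StringBuilder alternatives = new StringBuilder();
        for (String term : terms) {
            if (alternatives.length() > 0) {
                alternatives.append('|');
            }
            // Quote each term so regex metacharacters in it are matched literally.
            alternatives.append(Pattern.quote(term));
        }
        return Pattern.compile("\\b(" + alternatives + ")\\b", Pattern.CASE_INSENSITIVE);
    }

    // Wraps each match in <b>...</b>; the markup is an assumption.
    static String emphasise(String text, String[] terms) {
        if (terms == null || terms.length == 0) {
            return text;
        }
        return generate(terms).matcher(text).replaceAll("<b>$1</b>");
    }
}
```

Quoting each term keeps characters such as "+" or "." in a query from being interpreted as regular expression syntax, and the word boundaries prevent a short term from matching inside a longer word.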
-
checkControl
protected boolean checkControl(java.lang.String control_name, SearchRequest srq)
-
-