Hello,
I was reading the Relevant Search book, in particular the sentinel token technique, which is also described here.
Following the book recipe, the easiest way I could achieve an exact match is by creating a new field in my document so that:
{
"name": "big blue rental car",
"nameWithSentinels": "sentinel_begin big blue rental car sentinel_end"
}
and use a match_phrase query against the nameWithSentinels field, with sentinel_begin <user_search> sentinel_end as the value.
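To make the query concrete, here is a minimal sketch of the request body, written as a small Python helper purely for illustration. The helper name sentinel_phrase_query is hypothetical; only the field name and the sentinel strings come from the document above, and the dict is meant to mirror the JSON body of a search request.

```python
import json


def sentinel_phrase_query(user_search: str) -> dict:
    """Build a match_phrase query body that wraps the user's text in sentinels.

    Hypothetical helper: it only assembles the request body and does not
    talk to Elasticsearch.
    """
    return {
        "query": {
            "match_phrase": {
                "nameWithSentinels": f"sentinel_begin {user_search} sentinel_end"
            }
        }
    }


if __name__ == "__main__":
    # Show the JSON that would be sent as the search request body.
    print(json.dumps(sentinel_phrase_query("big blue rental car"), indent=2))
```

Because the sentinels are part of the phrase, a search for "blue rental" would be sent as sentinel_begin blue rental sentinel_end and should not match the document above.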
However, I wonder whether there is a way I can achieve this without creating the nameWithSentinels field, by leveraging token filters instead.
I'm thinking of creating a filter that, given the tokens ["big", "blue", "rental", "car"], would return ["sentinel_begin", "big", "blue", "rental", "car", "sentinel_end"] or ["sentinel_begin big", "blue", "rental", "car sentinel_end"] (I believe the same match_phrase query presented above would still work).
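The behaviour I am after can be sketched outside Elasticsearch as a toy model in plain Python. This is not an Elasticsearch API: the function names are hypothetical, and contains_phrase is only a simplified stand-in for phrase matching (a contiguous run of identical tokens).

```python
SENTINEL_BEGIN = "sentinel_begin"
SENTINEL_END = "sentinel_end"


def add_sentinel_tokens(tokens):
    """Variant 1: emit the sentinels as separate tokens around the stream."""
    return [SENTINEL_BEGIN, *tokens, SENTINEL_END]


def merge_sentinels_into_borders(tokens):
    """Variant 2: glue the sentinels onto the first and last tokens."""
    if not tokens:
        return []
    out = list(tokens)
    out[0] = f"{SENTINEL_BEGIN} {out[0]}"
    out[-1] = f"{out[-1]} {SENTINEL_END}"
    return out


def contains_phrase(indexed, query):
    """Simplified phrase match: `query` occurs as a contiguous run in `indexed`."""
    n = len(query)
    return n > 0 and any(
        indexed[i:i + n] == query for i in range(len(indexed) - n + 1)
    )


if __name__ == "__main__":
    indexed = add_sentinel_tokens(["big", "blue", "rental", "car"])
    print(indexed)
    # The full name matches, a fragment of it does not.
    print(contains_phrase(indexed, add_sentinel_tokens(["big", "blue", "rental", "car"])))
    print(contains_phrase(indexed, add_sentinel_tokens(["blue", "rental"])))
```

In this toy model, the second variant only matches if the query tokens are transformed the same way as the indexed tokens, so the query side would presumably need to go through the same filter.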
So far my settings look like this:
{
"idx_0001": {
"settings": {
// stuff removed for brevity
"analysis": {
"filter": {
"sentinel_border_condition_end": {
"filter": [
"sentinel_border_end"
],
"type": "condition",
"script": {
"source": "token.getPosition() === [NO IDEA HOW I CAN GET THE NUMBER OF TOKENS]"
}
},
"sentinel_border_begin": {
"pattern": "^",
"type": "pattern_replace",
"replacement": "SENTINEL_BEGIN"
},
"sentinel_border_end": {
"pattern": "$",
"type": "pattern_replace",
"replacement": "SENTINEL_END"
},
"sentinel_border_condition_begin": {
"filter": [
"sentinel_border_begin"
],
"type": "condition",
"script": {
"source": "token.getPosition() === 0"
}
}
},
"analyzer": {
"my-analyzer": {
"filter": [
"sentinel_border_condition_begin",
"sentinel_border_condition_end",
"lowercase",
"stop"
],
"type": "custom",
"tokenizer": "standard"
}
}
}
}
}
}
But as you can see, I'm not sure what the Painless script for the sentinel_border_condition_end token filter should be.
My last resort would be to write a plugin, but the documentation on token filter plugins is rather sparse from what I can see.
Thank you in advance,
A.