Hello!
I have an analyzer that uses an extended_whitespace tokenizer and a word_delimiter filter.
Something like:
"analysis": {
"tokenizer": {
"extended_whitespace": {
"type": "pattern",
"pattern": "\\s+"
}
},
"filter": {
"subword": {
"type": "word_delimiter",
"preserve_original": true,
"catenate_numbers": true
}
},
"analyzer": {
"search_fulltext_analyzer": {
"tokenizer": "extended_whitespace",
"filter": [
"subword",
"lowercase",
"filter_ascii_folding"
],
"char_filter": [
"url_encode_pattern"
]
}
}
}
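To make the intended behaviour of this analyzer concrete, here is a rough Python approximation of it. This is only a sketch, not how Elasticsearch implements it: the function name `approximate_tokens` is hypothetical, the sub-word split only breaks on non-alphanumeric characters (the real word_delimiter filter also splits on case changes and letter/digit transitions), and it assumes that url_encode_pattern and filter_ascii_folding leave this particular input unchanged.

```python
import re


def approximate_tokens(text):
    """Rough stand-in for the analyzer: returns (token, start, end) tuples."""
    tokens = []
    # Tokenizer step: the pattern "\s+" separates tokens, so every run of
    # non-whitespace characters becomes one token.
    for chunk in re.finditer(r"\S+", text):
        whole = (chunk.group().lower(), chunk.start(), chunk.end())
        # Sub-word step (simplified): every alphanumeric run is one part,
        # with offsets relative to the full text. Lowercasing is applied here.
        parts = [
            (m.group().lower(), chunk.start() + m.start(), chunk.start() + m.end())
            for m in re.finditer(r"[A-Za-z0-9]+", chunk.group())
        ]
        # preserve_original: keep the unsplit token as well, unless the token
        # had nothing to split and is identical to its only part.
        if parts != [whole]:
            tokens.append(whole)
        tokens.extend(parts)
    return tokens


if __name__ == "__main__":
    sample = "NODE_PATH=/usr/lib/node_modules ./upgrade-indices.js upgrades"
    for token in approximate_tokens(sample):
        print(token)
```

For the sample text in this post, the tuples it prints line up with the token, start_offset and end_offset values in the analyze output further down.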
My problem is that if I have text like: "NODE_PATH=/usr/lib/node_modules ./upgrade-indices.js upgrades"
The highlight will be wrong:
"{highlight}NODE_PATH=/usr/lib/node{/highlight}_modules ./upgrade-indices.js upgrades"
Instead of
"{highlight}NODE{/highlight}_PATH=/usr/lib/{highlight}node{/highlight}_modules ./upgrade-indices.js upgrades"
I know it has something to do with the offsets, but I can't find a solution.
This is the analyze api result:
{
"tokens": [
{
"token": "node_path=/usr/lib/node_modules",
"start_offset": 0,
"end_offset": 31,
"type": "word",
"position": 0
},
{
"token": "node",
"start_offset": 0,
"end_offset": 4,
"type": "word",
"position": 0
},
{
"token": "path",
"start_offset": 5,
"end_offset": 9,
"type": "word",
"position": 1
},
{
"token": "usr",
"start_offset": 11,
"end_offset": 14,
"type": "word",
"position": 2
},
{
"token": "lib",
"start_offset": 15,
"end_offset": 18,
"type": "word",
"position": 3
},
{
"token": "node",
"start_offset": 19,
"end_offset": 23,
"type": "word",
"position": 4
},
{
"token": "modules",
"start_offset": 24,
"end_offset": 31,
"type": "word",
"position": 5
},
{
"token": "./upgrade-indices.js",
"start_offset": 32,
"end_offset": 52,
"type": "word",
"position": 6
},
{
"token": "upgrade",
"start_offset": 34,
"end_offset": 41,
"type": "word",
"position": 6
},
{
"token": "indices",
"start_offset": 42,
"end_offset": 49,
"type": "word",
"position": 7
},
{
"token": "js",
"start_offset": 50,
"end_offset": 52,
"type": "word",
"position": 8
},
{
"token": "upgrades",
"start_offset": 53,
"end_offset": 61,
"type": "word",
"position": 9
}
]
}
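To show what I expect from these offsets, here is a small Python sketch that applies them to the text by hand. The `highlight` helper is hypothetical and the query term "node" is an assumption for illustration. It builds the highlight I expect from the two "node" sub-tokens, and also shows that the wrong result is the same string you get from a single span running from the first start offset (0) to the last end offset (23). Whether the highlighter really merges the spans like that is only my guess.

```python
TEXT = "NODE_PATH=/usr/lib/node_modules ./upgrade-indices.js upgrades"

# (token, start_offset, end_offset) copied from the analyze output above.
TOKENS = [
    ("node_path=/usr/lib/node_modules", 0, 31),
    ("node", 0, 4),
    ("path", 5, 9),
    ("usr", 11, 14),
    ("lib", 15, 18),
    ("node", 19, 23),
    ("modules", 24, 31),
    ("./upgrade-indices.js", 32, 52),
    ("upgrade", 34, 41),
    ("indices", 42, 49),
    ("js", 50, 52),
    ("upgrades", 53, 61),
]


def highlight(text, spans, pre="{highlight}", post="{/highlight}"):
    """Wrap each (start, end) span of text in tags; spans must not overlap."""
    out, cursor = [], 0
    for start, end in sorted(spans):
        out.append(text[cursor:start])
        out.append(pre + text[start:end] + post)
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


# Offsets of every token equal to the assumed query term "node".
node_spans = [(start, end) for token, start, end in TOKENS if token == "node"]

# What I expect: one highlight per matching sub-token.
expected = highlight(TEXT, node_spans)

# One span from the first match's start to the last match's end.
merged = highlight(
    TEXT,
    [(min(s for s, _ in node_spans), max(e for _, e in node_spans))],
)

if __name__ == "__main__":
    print(expected)
    print(merged)
```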