Hi All,
I am trying to upgrade from Elasticsearch 5 to Elasticsearch 6.
I have a very simple index whose default analyzer uses shingles, word_delimiter and edge n-grams.
My example is below: I am trying to index the text "Bat man is cool".
I want customers to be able to search for "batman", so I am using word_delimiter with catenate_all. If they enter only "batm" I want that to match as well, so I also use edge n-grams.
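To make the intent concrete, here is a rough Python sketch of the terms I am hoping to end up with. It is not Elasticsearch code: the function names are made up for illustration, and `catenate` is only a simplified stand-in for what word_delimiter with catenate_all does to a shingle (it ignores offsets, positions and every other word_delimiter rule).

```python
def shingles(words, min_size=2, max_size=5):
    """Each single word (output_unigrams) plus every run of 2..5 adjacent words."""
    out = []
    for i in range(len(words)):
        out.append(words[i])
        for n in range(min_size, max_size + 1):
            if i + n <= len(words):
                out.append(" ".join(words[i:i + n]))
    return out


def catenate(token):
    """Simplified stand-in for catenate_all: the parts plus their joined form."""
    parts = token.split()
    if len(parts) > 1:
        return parts + ["".join(parts)]
    return parts


def edge_ngrams(token, min_gram=1, max_gram=10):
    """Prefixes of the token from min_gram up to max_gram characters."""
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


def analyze(text):
    """lowercase -> shingles -> catenate -> edge n-grams, collected as a set of terms."""
    terms = set()
    for shingle in shingles(text.lower().split()):
        for token in catenate(shingle):
            terms.update(edge_ngrams(token))
    return terms


terms = analyze("Bat man is cool")
print("batman" in terms, "batm" in terms)  # prints: True True
```

With this chain, both the full "batman" and the partial "batm" are produced from the shingle "bat man", which is the behaviour I am relying on.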
The settings and bulk request below work in ES5, but in ES6 indexing the document fails with the error shown after them.
PUT /test
{
  "settings": {
    "analysis": {
      "filter": {
        "word_joiner": {
          "type": "word_delimiter",
          "catenate_all": true
        },
        "edge": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 10
        },
        "remove_plurals": {
          "type": "stemmer",
          "name": "minimal_english"
        },
        "shingles": {
          "type": "shingle",
          "max_shingle_size": 5,
          "min_shingle_size": 2,
          "output_unigrams": "true"
        }
      },
      "analyzer": {
        "default": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "shingles",
            "word_joiner",
            "edge"
          ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "field1": {
          "type": "text",
          "analyzer": "default"
        }
      }
    }
  }
}
POST /_bulk
{ "update" : { "_index" : "test", "_type" : "test", "_id" : "1" } }
{ "doc": { "field1" : "Bat man is cool"}, "doc_as_upsert": true}
Error:
startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards startOffset=0,endOffset=3,lastStartOffset=4 for field 'field1'
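Going only by the wording of the message, the rule being enforced seems to be something like the Python sketch below. This is my reading of the error text, not the actual Lucene code, and the token offsets in `stream` are made up purely to reproduce the same numbers; I have not confirmed which tokens my filter chain really emits.

```python
def check_offsets(tokens):
    """Reject a token stream whose offsets are invalid or go backwards.

    tokens is a list of (start_offset, end_offset) pairs in emission order.
    """
    last_start = 0
    for start, end in tokens:
        if start < 0 or end < start or start < last_start:
            raise ValueError(
                "offsets must not go backwards "
                f"startOffset={start},endOffset={end},lastStartOffset={last_start}"
            )
        last_start = start


# Hypothetical stream: a token starting at offset 4 is followed by one
# that starts back at offset 0, which is the situation the error describes.
stream = [(0, 7), (4, 7), (0, 3)]

try:
    check_offsets(stream)
    message = None
except ValueError as err:
    message = str(err)

print(message)
# prints: offsets must not go backwards startOffset=0,endOffset=3,lastStartOffset=4
```

So it looks as if some token covering "Bat" (offsets 0 to 3) is being emitted after a token that starts at offset 4.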
Can anyone suggest a reason for this, or a way to fix it?
You can run the PUT and POST requests above on ES6 to reproduce the issue very quickly.
Thank you for your help in advance,
Dev