Please consider the following scenario.
Existing System
- I have an index named contacts_index with 100 documents.
- Each document has a property named city with some text value in it.
- The index has the following settings:
{
  "analyzer": {
    "city_analyzer": {
      "filter": [
        "lowercase"
      ],
      "tokenizer": "city_tokenizer"
    },
    "search_analyzer": {
      "filter": [
        "lowercase"
      ],
      "tokenizer": "keyword"
    }
  },
  "tokenizer": {
    "city_tokenizer": {
      "token_chars": [
        "letter"
      ],
      "min_gram": "2",
      "type": "ngram",
      "max_gram": "30"
    }
  }
}
- The index has the following mapping for the city field to support sub-text (substring) matching:
{
  "city" : {
    "type" : "text",
    "analyzer" : "city_analyzer",
    "search_analyzer" : "search_analyzer"
  }
}
Proposed System
Now we want to support autocomplete on the city field. For example, for a city with the value Seattle, we want to get the document when the user types s, se, sea, seat, seatt, seattl, or seattle, but only when the query is one of these prefixes. The document should not be returned when they type eattle, for instance.
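To make the intended difference concrete, here is a rough Python simulation of the two tokenizers. It is only a sketch, not Elasticsearch itself: the function names are hypothetical, and it assumes that token_chars "letter" splits the text on non-letter characters and that the query is lowercased and kept as a single token, as the keyword-based search_analyzer would do.

```python
import re


def letter_runs(text):
    # Approximates token_chars: ["letter"] by splitting on non-letters.
    # Lowercasing here stands in for the "lowercase" token filter.
    return re.findall(r"[^\W\d_]+", text.lower())


def ngram_tokens(text, min_gram=2, max_gram=30):
    # Every substring of each word, like the existing city_tokenizer.
    tokens = set()
    for word in letter_runs(text):
        for size in range(min_gram, min(max_gram, len(word)) + 1):
            for start in range(len(word) - size + 1):
                tokens.add(word[start:start + size])
    return tokens


def edge_ngram_tokens(text, min_gram=1, max_gram=100):
    # Only substrings anchored at the start of each word,
    # like the proposed autocomplete_tokenizer.
    tokens = set()
    for word in letter_runs(text):
        for size in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.add(word[:size])
    return tokens


def matches(query, index_tokens):
    # The search side (keyword tokenizer + lowercase) yields one token.
    return query.lower() in index_tokens


if __name__ == "__main__":
    print(sorted(edge_ngram_tokens("Seattle"), key=len))
    print(matches("eattle", edge_ngram_tokens("Seattle")))
    print(matches("eattle", ngram_tokens("Seattle")))
```

Under these assumptions, eattle is found among the ngram tokens but not among the edge_ngram tokens, which is the prefix-only behaviour we are after.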
We plan to achieve this by adding a multi-field to the city property, also of type text but with a different analyzer.
To do so, we have done the following.
- Updated the settings to support autocomplete:
PUT /staging-contacts-index-v4.0/_settings?preserve_existing=true
{
  "analysis": {
    "analyzer": {
      "autocomplete_analyzer": {
        "filter": [
          "lowercase"
        ],
        "tokenizer": "autocomplete_tokenizer"
      }
    },
    "tokenizer": {
      "autocomplete_tokenizer": {
        "token_chars": [
          "letter"
        ],
        "min_gram": "1",
        "type": "edge_ngram",
        "max_gram": "100"
      }
    }
  }
}
- Updated the mapping of the city field with a multi-field named autocomplete to support autocomplete:
{
  "city" : {
    "type" : "text",
    "fields" : {
      "autocomplete" : {
        "type" : "text",
        "analyzer" : "autocomplete_analyzer",
        "search_analyzer" : "search_analyzer"
      }
    },
    "analyzer" : "city_analyzer",
    "search_analyzer" : "search_analyzer"
  }
}
Findings
- For any new document created after adding the autocomplete multi-field, autocomplete search works as expected.
- For existing documents, if the value of the city field changes (for example, from seattle to chicago), the document is returned by autocomplete search.
- We are planning to use the Update API to fetch and update the existing 100 documents so that autocomplete works for them as well. However, when we call the Update API, we get
{"result" : "noop"}
and autocomplete search still does not work.
I infer that because the values did not change, Elasticsearch is not creating tokens for the autocomplete field.
Question
From the research we have done, there are two options to make autocomplete search work for the existing 100 documents.
- Use the Reindex API for the existing 100 documents.
- Fetch all 100 documents and use the document Index API to index each of them again, which will create all the tokens in the process.
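To illustrate the second option, here is a minimal Python sketch that builds a request body for the _bulk endpoint, which takes newline-delimited JSON: an action line followed by the document source. It assumes the documents have already been fetched as (id, source) pairs; build_bulk_body is a hypothetical helper name, and actually sending the body (a POST to /_bulk with Content-Type application/x-ndjson) is left out.

```python
import json


def build_bulk_body(index_name, docs):
    # docs: iterable of (doc_id, source_dict) pairs fetched beforehand.
    # Each document becomes two NDJSON lines: the "index" action, then the source.
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data, standing in for the 100 fetched documents.
    sample = [("1", {"city": "Seattle"}), ("2", {"city": "Chicago"})]
    print(build_bulk_body("contacts_index", sample), end="")
```

My understanding is that an index action with an existing _id replaces the whole document, so every field, including city.autocomplete, would be analyzed again, unlike the partial update that returned noop.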
Which option is preferable and why?
Thanks for taking the time to read through.