Accented and non-accented characters in Elasticsearch

Hi Team,

We need non-accented searches in Elasticsearch to match both accented and non-accented text.
Example:
A search for Amelie should return Amelie as well as Amélie, Amèlie, Amélié, Ámelié and Àmelie.

To achieve this, we added the custom analyzer below:

{
  "analysis": {
    "analyzer": {
      "accents_analyzer": {
        "tokenizer": "standard",
        "filter": ["lowercase", "asciifolding"]
      }
    }
  }
}
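
One way to check what this analyzer produces, without touching any index, is to pass the same tokenizer and filter chain to the _analyze API. This is only a sketch using the example names from above as sample text; every token in the response would be expected to come back as amelie.

```
POST /_analyze
{
  "tokenizer": "standard",
  "filter": ["lowercase", "asciifolding"],
  "text": "Amelie Amélie Amèlie Amélié Ámelié Àmelie"
}
```

Because the same analyzer runs on the query text by default, a search for Amelie is folded to the same token and can match all of these variants.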

This analyzer works for us, but to apply it we need to either

  1. close and reopen the index, or
  2. create a new index and re-index into it.

Both options involve downtime.
Is there any way to do this without downtime?

You can use an alias, reindex to the new index and switch the alias.

We are using this index in production. Re-indexing will cause downtime, right?

Not if you have an alias on top of your production index and are using this alias to query the data.

Let's say you have an alias named prod that points to prod-v1.

So your app is calling:

POST /prod/_doc
{
  "foo": "bar"
}
GET /prod/_search

You can create a new index named prod-v2 and reindex the existing data into it:

PUT /prod-v2
{
  // Settings and mappings go here
}
POST _reindex
{
  "source": {
    "index": "prod-v1"
  },
  "dest": {
    "index": "prod-v2"
  }
}
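
For the analyzer from the question, the request that creates prod-v2 might look roughly like the sketch below. It assumes Elasticsearch 7.x or later (typeless mappings), and the field name "name" is hypothetical; replace it with whichever text fields should ignore accents.

```
PUT /prod-v2
{
  "settings": {
    "analysis": {
      "analyzer": {
        "accents_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "accents_analyzer"
      }
    }
  }
}
```

Keep in mind that _reindex only copies the documents it finds in prod-v1 while it runs. Anything written to prod-v1 after the copy has passed it is not carried over automatically, so you may need to pause writes briefly or replay them before switching the alias.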

Once done, switch the alias:

POST _aliases
{
  "actions": [
    {
      "remove": {
        "index": "prod-v1",
        "alias": "prod"
      }
    },
    {
      "add": {
        "index": "prod-v2",
        "alias": "prod"
      }
    }
  ]
}
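
Both actions in a single _aliases request are applied atomically, so there is no moment where prod points to nothing. If you want to confirm where the alias points afterwards, a quick check could be:

```
GET /_alias/prod
```

The response should list prod-v2 as the only index behind the alias.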

After the switch, your app can still call the same endpoints:

POST /prod/_doc
{
  "foo": "bar"
}
GET /prod/_search
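
To confirm the accent-insensitive behaviour through the alias, a non-accented match query is enough. This is a sketch that assumes the new index maps a hypothetical text field called "name" with accents_analyzer; adjust the field name to your own mapping.

```
GET /prod/_search
{
  "query": {
    "match": {
      "name": "Amelie"
    }
  }
}
```

With that mapping in place, documents containing Amelie, Amélie, Amèlie and the other variants should all be returned.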

Remove the old index:

DELETE prod-v1