Elasticsearch became case-sensitive after adding a synonym analyzer

After I added a synonym analyzer to my_index, the index became case-sensitive.

I have one property called nationality that uses the synonym analyzer, and it seems that this property has become case-sensitive because of it.

Here is my /my_index/_mappings

{
  "my_index": {
    "mappings": {
      "items": {
        "properties": {
          .
          .
          .
          "nationality": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            },
            "analyzer": "synonym"
          },
          .
          .
          .
        }
      }
    }
  }
}

Inside the index, I have the value India COUNTRY. When I search for India nation using the command below, I get the result.

POST /my_index/_search
{
  "query": {
    "match": {
      "nationality": "India nation"
    }
  }
}

But when I search for india (notice the lowercase i), I get nothing.
My assumption is that this happens because I put the uppercase filter before the synonym filter. I did this because the synonyms are uppercase, so the query India becomes INDIA after passing through this filter.
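
One way to check this assumption is the _analyze API, which returns the tokens an analyzer produces for a given text. A minimal sketch, assuming the index name my_index and the analyzer name synonym from the settings in this post:

```
GET /my_index/_analyze
{
  "analyzer": "synonym",
  "text": "india nation"
}
```

If the uppercase filter is in effect, the tokens in the response should all be uppercase. Note that this only shows how text is analyzed with the current settings; it does not show which tokens were stored for documents that are already in the index.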

Here is my /my_index/_settings

{
  "my_index": {
    "settings": {
      "index": {
        "number_of_shards": "1",
        "provided_name": "my_index",
        "similarity": {
          "default": {
            "type": "BM25",
            "b": "0.9",
            "k1": "1.8"
          }
        },
        "creation_date": "1647924292297",
        "analysis": {
          "filter": {
            "synonym": {
              "type": "synonym",
              "lenient": "true",
              "synonyms": [
                "NATION, COUNTRY, FLAG"
              ]
            }
          },
          "analyzer": {
            "synonym": {
              "filter": [
                "uppercase",
                "synonym"
              ],
              "tokenizer": "whitespace"
            }
          }
        },
        "number_of_replicas": "1",
        "version": {
          "created": "6080099"
        }
      }
    }
  }
}

Is there a way to keep this property case-insensitive? All the solutions I've found say that I should store all the text in nationality as either lowercase or uppercase. But what if the index contains both uppercase and lowercase letters?

Do you mean you updated the mapping of an existing index? If so, did you reindex your data?

I created the same index mapping with the same analyzer but was not able to reproduce the issue. I think reindexing your data will resolve it.

Ah, I'm sorry, I forgot to explain that. my_index was not an existing index; I created it using this command. Thank you for your response.

PUT /my_index
{
  "settings": {
    "number_of_shards": 1,
    "analysis": {
      "filter": {
        "synonym": {
          "type": "synonym",
          "lenient": "true",
          "synonyms": [
            "NATION, COUNTRY, FLAG"
          ]
        }
      },
      "analyzer": {
        "synonym": {
          "filter": [
            "uppercase",
            "synonym"
          ],
          "tokenizer": "whitespace"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      .
      .
      "nationality": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        },
        "analyzer": "synonym"
      },
      .
      .
    }
  }
}


I have found the solution!

I didn't realize that the filters defined in the analyzer settings are applied both when indexing and when searching the data. At first, I followed these steps:

  1. Create the index with the synonym filter
  2. Insert data
  3. Add the uppercase filter before the synonym filter

Because of that order, the uppercase filter was never applied to the data I had already indexed. What I should have done is:

  1. Create the index with the uppercase and synonym filters (pay attention to the order)
  2. Insert data

Then the filters are applied to my data.
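
For data that was indexed before the uppercase filter was added, recreating the index may not be the only option. As an untested sketch, assuming the analyzer in the index settings is already the corrected one, an _update_by_query request with no query or script re-indexes every document in place, so the documents are analyzed again with the current filters:

```
POST /my_index/_update_by_query?conflicts=proceed
```

The conflicts=proceed parameter tells the request to continue past version conflicts instead of aborting. After it completes, a search for india should behave the same as a search for India, since both the stored tokens and the query pass through the uppercase filter.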
