Elasticsearch can't handle space after adding analyzer

I created an index called my_index with this request:

{
    "settings": {
        "number_of_shards": 1,
        "analysis": {
            "filter": {
                "synonym": {
                    "type": "synonym",
                    "lenient": "true",
                    "synonyms": [
                        ...
                        ...
                        ...
                    ]
                }
            },
            "analyzer": {
                "synonym": {
                    "filter": [
                        "uppercase",
                        "synonym"
                    ],
                    "tokenizer": "whitespace"
                }
            }
        }
    },
    "mappings": {
        "items": {
            "properties": {
                "country": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                },
                "information": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    },
                    "analyzer": "synonym"
                },
                "person": {
                    "type": "text",
                    "fields": {
                        "keyword": {
                            "type": "keyword",
                            "ignore_above": 256
                        }
                    }
                }
            }
        }
    }
}

Inside the information field, I have data that looks like 100 /INDIA/2022 (note the space after 100). If I search for 100/INDIA/2022 (no space after 100), Elasticsearch returns nothing. If I create a new index with no analyzer, searching for 100/INDIA/2022 returns the expected result. Can someone help me with this problem?
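For example, with a simple match query on the information field (simplified for illustration, not my exact request):

GET /my_index/_search
{
    "query": {
        "match": {
            "information": "100/INDIA/2022"
        }
    }
}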

That's indeed because of the analyzer used.
In the first example, your analyzer uses a whitespace tokenizer at both index time and search time, so 100 /INDIA/2022 is indexed as the two tokens 100 and /INDIA/2022, while the query 100/INDIA/2022 produces the single token 100/INDIA/2022, and nothing matches.

The second example uses the standard tokenizer, which splits on the slashes as well, so both forms produce the same tokens.

Use the _analyze API to understand how your text is transformed at index time and at search time.
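For example, something along these lines, using the index and custom analyzer defined above:

POST /my_index/_analyze
{
    "analyzer": "synonym",
    "text": "100 /INDIA/2022"
}

Run it once with the space and once without (and once with "analyzer": "standard" for comparison), and compare the tokens that come back.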

If you can't find the same tokens, that explains why the search does not give any result.

I see. It seems that changing the tokenizer to standard solved the problem. Thank you.
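For reference, the change amounts to swapping the tokenizer on the custom analyzer in the settings above, something like:

"analyzer": {
    "synonym": {
        "filter": [
            "uppercase",
            "synonym"
        ],
        "tokenizer": "standard"
    }
}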

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.