Hello,
I have enabled a "lowercase" analyzer across all my indices, but searches that use it do not behave the way I expected.
My query:
"query": {
  "query_string": {
    "query": "username.keyword:\"Test\"",
    "analyzer": "case_insensitive_analyzer"
  }
}
My expected result: This should return all documents matching "Test" in any combination of upper and lower case, e.g. "test" and "TEST".
Actual result: It returns only the documents whose value is exactly "test", in lowercase.
Some troubleshooting suggests that the analyzer works correctly behind the scenes: the search is always run in lowercase, regardless of the case I type the query in.
However, since I indexed my data before adding this analyzer, I assume Elasticsearch does not treat existing documents with the value "TEST" as equal to the lowercased "test", and that they perhaps need to be re-indexed?
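To make my assumption concrete, here is a toy model in plain Python of what I think is happening. It is not Elasticsearch code: the names (indexed_terms, analyze_query, search) and the three sample documents are made up for illustration, and it assumes that the values in username.keyword were stored exactly as they were sent, without any lowercasing.

```python
# Toy model only: not Elasticsearch code, all names are hypothetical.

# Assumption: username.keyword values were stored verbatim at index time.
indexed_terms = {
    "doc1": "test",
    "doc2": "Test",
    "doc3": "TEST",
}


def analyze_query(text: str) -> str:
    """Mimic a lowercase filter applied to the query text at search time."""
    return text.lower()


def search(query_text: str) -> list[str]:
    """Return ids of documents whose stored term equals the analyzed query exactly."""
    term = analyze_query(query_text)
    return sorted(doc_id for doc_id, stored in indexed_terms.items() if stored == term)


if __name__ == "__main__":
    for q in ("Test", "TEST", "test"):
        print(q, "->", search(q))
```

In this model every spelling of the query finds only the document stored as "test", which is the same behaviour I see on my cluster.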
The analysis section returned by GET _all/_settings is the same for all indices:
"analysis": {
  "analyzer": {
    "case_insensitive_analyzer": {
      "filter": [
        "lowercase"
      ],
      "type": "custom",
      "tokenizer": "standard"
    },
    "default": {
      "filter": [
        "lowercase"
      ],
      "type": "custom",
      "tokenizer": "keyword"
    }
  }
},
Please let me know if I've done something incorrectly here - I am new to analyzers.
Otherwise, if correct, is there a way to "fix" my existing indices/documents to match this case-insensitive mapping?
I am importing large log files programmatically using an import script, so ideally I don't want to have to manually re-index everything.
I am running Elasticsearch 8.11.1.
Edit: This analyzer also doesn't work when the query contains a space (it returns no results) or punctuation (the punctuation is ignored entirely). I haven't the foggiest idea why that would be the case.
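The closest I can get to an explanation is a guess: the "standard" tokenizer in case_insensitive_analyzer may be splitting the query text into separate words and dropping punctuation, while username.keyword holds each value as one whole term. Below is a toy Python sketch of that guess. It is not Elasticsearch code; the regular expression is only a crude stand-in for the real tokenizer (which follows Unicode word-boundary rules and differs in details), and the function names and sample documents are hypothetical.

```python
import re

# Toy model only: not Elasticsearch code, all names are hypothetical.

# Assumption: username.keyword stores each value as one verbatim term.
stored_terms = {"doc1": "john smith", "doc2": "test", "doc3": "test!"}


def rough_standard_analyze(text: str) -> list[str]:
    """Crude stand-in for a standard tokenizer plus lowercase filter:
    keep runs of word characters, drop whitespace and punctuation."""
    return re.findall(r"\w+", text.lower())


def search(query_text: str) -> list[str]:
    """Return ids of documents whose whole stored term equals one query token."""
    tokens = rough_standard_analyze(query_text)
    return sorted(doc_id for doc_id, term in stored_terms.items() if term in tokens)


if __name__ == "__main__":
    for q in ("john smith", "test!"):
        print(repr(q), "->", rough_standard_analyze(q), "->", search(q))
```

If this guess is right, a query with a space is broken into tokens that never equal the whole stored value, and a query with punctuation is reduced to the bare word, which would line up with both symptoms.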
Thank you
Matthias