That Stack Overflow question is quite old. Instead, you can use a normalizer on a keyword field that removes everything but the first character. That way you won't have to enable fielddata.
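As a rough sketch of that idea, the create-index request body below defines a `pattern_replace` character filter that keeps only the first character, and wires it into a custom normalizer on a keyword sub-field. The names `first_char_only`, `first_char_normalizer`, `name`, and `first_char` are hypothetical, and the typeless mapping assumes Elasticsearch 7 or later (JSON has no comments, so the assumptions are listed here rather than inline).

```json
{
  "settings": {
    "analysis": {
      "char_filter": {
        "first_char_only": {
          "type": "pattern_replace",
          "pattern": "^(.).*$",
          "replacement": "$1"
        }
      },
      "normalizer": {
        "first_char_normalizer": {
          "type": "custom",
          "char_filter": ["first_char_only"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "first_char": {
            "type": "keyword",
            "normalizer": "first_char_normalizer"
          }
        }
      }
    }
  }
}
```

Because `name.first_char` is a keyword field, a `terms` aggregation on it should work with doc values, without fielddata.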
Thanks @abdon, the solution you provided works perfectly.
I am new to Elasticsearch, so it gets quite confusing once I start combining different criteria into one.
Hi @abdon, could you please give me some advice? When I deal with Unicode characters, the first character is not recognized. I also want to group local characters under their default Latin-based equivalents:
To replace all numbers 0-9 with a # you could use a second character filter.
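A second character filter for this might look like the fragment below, which belongs under `settings.analysis` of the index. It is only a sketch: `first_char_only`, `digit_to_hash`, and `first_char_normalizer` are hypothetical names. Character filters run in the order they are listed, so the digit replacement here sees only the single character left by the first filter.

```json
{
  "char_filter": {
    "first_char_only": {
      "type": "pattern_replace",
      "pattern": "^(.).*$",
      "replacement": "$1"
    },
    "digit_to_hash": {
      "type": "pattern_replace",
      "pattern": "[0-9]",
      "replacement": "#"
    }
  },
  "normalizer": {
    "first_char_normalizer": {
      "type": "custom",
      "char_filter": ["first_char_only", "digit_to_hash"]
    }
  }
}
```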
The process of converting characters like É and Đ to their ASCII equivalents E and D is called folding, which you can achieve with the ASCII Folding token filter.
Putting all of that together would result in something like this:
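A minimal sketch of the combined create-index request body, assuming Elasticsearch 7 or later. All filter, normalizer, and field names (`first_char_only`, `digit_to_hash`, `first_char_normalizer`, `name`, `first_char`) are hypothetical. The built-in `asciifolding` token filter is one of the filters allowed in normalizers, and it runs after the character filters.

```json
{
  "settings": {
    "analysis": {
      "char_filter": {
        "first_char_only": {
          "type": "pattern_replace",
          "pattern": "^(.).*$",
          "replacement": "$1"
        },
        "digit_to_hash": {
          "type": "pattern_replace",
          "pattern": "[0-9]",
          "replacement": "#"
        }
      },
      "normalizer": {
        "first_char_normalizer": {
          "type": "custom",
          "char_filter": ["first_char_only", "digit_to_hash"],
          "filter": ["asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "first_char": {
            "type": "keyword",
            "normalizer": "first_char_normalizer"
          }
        }
      }
    }
  }
}
```

With settings along these lines, a value such as `Élan` would be expected to normalize to `E`, and `42nd Street` to `#`, so a `terms` aggregation on `name.first_char` groups them under those keys.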