I am using Elasticsearch to index a lot of data in Romanian,
containing specific characters in UTF-8 encoding: ș, ț, î, ă, â, Ă, Â,
Ț, Ș, Î.
Indexing and searching both work fine - I'm using PHP with
the Elastica client.
Now I'm trying to run some searches, and I want Elasticsearch to match the
text both with and without diacritics.
Let me give you an example: when I search for "Bucuresti" (which is
Romanian for Bucharest), Elasticsearch correctly returns the documents
containing that exact spelling. However, I would also like to get the
documents containing "București" (the correct form, with the diacritics)
when I search for "Bucuresti".
For comparison, searching Google for "Bucuresti" returns results for
both "Bucuresti" and "București".
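
To make the behaviour I'm after concrete, here is a minimal sketch in plain Python. It is only an illustration of diacritic-insensitive matching, not Elasticsearch or Elastica code, and the function names `fold_diacritics` and `matches` are made up for this example. It assumes that removing Unicode combining marks after NFD decomposition is an acceptable way to fold the Romanian characters listed above.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with combining marks removed (hypothetical helper)."""
    # NFD splits e.g. "ș" into "s" + COMBINING COMMA BELOW.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, document: str) -> bool:
    """Case-insensitive substring match that ignores diacritics."""
    return fold_diacritics(query).lower() in fold_diacritics(document).lower()


if __name__ == "__main__":
    print(fold_diacritics("București"))
    print(matches("Bucuresti", "Primăria București"))
    print(matches("București", "Primaria Bucuresti"))
```

The point is that both the query and the indexed text get folded the same way, so either spelling finds the other.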