I'm working on an Elasticsearch application for searching person data, including names in both English and French. I've completed the data indexing process.
Current issues:
Case Sensitivity: Searching for "Francois" doesn't match documents containing "françois" or "François."
Special Characters: Names with accents (e.g., François) don't match searches without them (e.g., francois).
Removing non-ASCII characters: My current approach of stripping non-ASCII characters at search time prevents accurate matching for French names.
Desired Outcome:
I want to achieve case-insensitive and special character-insensitive matching for French names in my Elasticsearch search. This means:
Searching for "francois" should match documents containing "Francois", "françois", and potentially variations like "François" (depending on the approach).
Accents and other relevant special characters in French names should not affect search results.
My current setup does not cover this expected result:
If the input is francois, I want to match the records that contain françois as an exact match (right now it matches only when fuzziness is enabled).
Is there anything I am missing in my analyzers?
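
For illustration, here is a minimal sketch of the kind of analyzer that would normally give this behaviour: a custom analyzer that chains Elasticsearch's built-in lowercase and asciifolding token filters. The analyzer name name_folding and the field name are assumptions made up for this example, not taken from my actual mapping. The fold helper is only a rough local approximation of what those two filters do to accented Latin letters, so the expected matches can be checked without a cluster.

```python
import unicodedata

# Hypothetical index body: "name_folding" and the "name" field are
# illustrative names, not part of the original mapping.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "analyzer": {
                "name_folding": {
                    "type": "custom",
                    "tokenizer": "standard",
                    # Built-in token filters: lowercase first, then fold
                    # accented characters to their ASCII equivalents.
                    "filter": ["lowercase", "asciifolding"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            # With no separate search_analyzer, the same analyzer is
            # applied at index time and at search time.
            "name": {"type": "text", "analyzer": "name_folding"}
        }
    },
}


def fold(text: str) -> str:
    """Approximate lowercase + asciifolding for accented Latin letters."""
    # NFKD splits "ç" into "c" plus a combining cedilla, which is then dropped.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


if __name__ == "__main__":
    for name in ("Francois", "françois", "François"):
        print(name, "->", fold(name))
```

If the field is indexed and searched with the same folding analyzer, a plain match query for francois should hit françois without fuzziness, because both sides reduce to the same token. Existing documents would need to be reindexed after changing the analyzer.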