As you can see, the first token starts with 2022, so it should match the query with 2022, shouldn't it?
But the result of the query with 2022 doesn't include that file. Can someone explain the reason?
will return the expected result. Even 20 and 2 return the expected result.
It would help to provide more information to start with: the mappings, the exact query, and the version, and whether the document is still returned, just further down the result set than you expect.
It's because of the max_expansions value, which defaults to 50. When you send a query with a longer prefix (e.g. "202201"), fewer terms match the prefix, so you will see the results. You can increase max_expansions, but it can hurt performance.
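For reference, here is a minimal sketch of a request body that raises max_expansions for a match_phrase_prefix query. The field name "filename" and the helper build_prefix_query are assumptions made for illustration; the thread does not show the actual mapping or query.

```python
import json


def build_prefix_query(field, prefix, max_expansions=50):
    """Build a match_phrase_prefix request body as a plain dict.

    `field` is a hypothetical field name; 50 mirrors the documented default.
    """
    return {
        "query": {
            "match_phrase_prefix": {
                field: {
                    "query": prefix,
                    "max_expansions": max_expansions,
                }
            }
        }
    }


# Assumed field name "filename"; raise the cap from the default 50 to 100.
body = build_prefix_query("filename", "2022", max_expansions=100)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request. A larger cap means more terms are examined per query, which is where the performance cost comes from.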
See my screenshot: both queries show results because I have only one doc in my test index.
Thank you for your answer! Increasing max_expansions to 100 resolved the problem. But do you have any idea why the query with 2022 doesn't show the file 20220101_Legal Document_5678.pdf? From what I understood, the filename is an exact match for the prefix 2022, so the query shouldn't even need to be expanded.
Or does ES expand the query and compare it against the whole token (20220101_legal)?
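Just to answer my own question above: the "Match phrase prefix query" page of the Elasticsearch Guide [8.14] already explains how it works. ES expands the last term of the query into at most max_expansions indexed terms that start with that prefix, and only those terms are matched. So the token 20220101_legal has to be one of the terms generated by the expansion; if it is not among them, the document is not returned.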
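The effect can be illustrated with a small simulation. This is only a sketch of the idea, not Elasticsearch's actual implementation: it assumes the expansion walks a sorted term list and stops after max_expansions matches, and the "202200_NNN" filler terms are invented for the example.

```python
from bisect import bisect_left


def expand_prefix(sorted_terms, prefix, max_expansions=50):
    """Return up to max_expansions terms from sorted_terms that start with prefix."""
    start = bisect_left(sorted_terms, prefix)  # first term >= prefix
    expanded = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix) or len(expanded) >= max_expansions:
            break
        expanded.append(term)
    return expanded


# Hypothetical index: 60 terms that sort before the token we care about.
terms = sorted([f"202200_{i:03d}" for i in range(60)] + ["20220101_legal"])

print("20220101_legal" in expand_prefix(terms, "2022", 50))   # cap reached first
print("20220101_legal" in expand_prefix(terms, "2022", 100))  # cap large enough
print(expand_prefix(terms, "202201", 50))                     # longer prefix narrows the set
```

Under these assumptions, the prefix 2022 with a cap of 50 never reaches 20220101_legal, while either a cap of 100 or the longer prefix 202201 does, which is consistent with the behaviour described in this thread.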