I'm confused why the match query seen below is matching two documents rather than just one. I thought using the "and" operator would require all terms to be present in order for it to match.
When I hit the explain endpoint (GET people/_explain/2) with the id of the document I do not expect to be there, I see the description mentioning synonyms, which seems unexpected to me.
weight(Synonym(email:john email:john.smith email:smith) in 1) [PerFieldSimilarity]
Why is tom.smith@gmail.com showing up in the results?
Well, I found some documentation explaining why it's doing this, but I'm not sure what the best way forward is. I'd like the "and" operator to behave the way it normally does.
Note: All tokens are emitted in the same position, and with the same character offsets. This means, for example, that a match query for john-smith_123@foo-bar.com that uses this analyzer will return documents containing any of these tokens, even when using the and operator. Also, when combined with highlighting, the whole original token will be highlighted, not just the matching subset. For instance, querying the above email address for "smith" would highlight:
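
To make the quoted note concrete for my case, here is a rough Python sketch of how I understand the behaviour. It is a simplified model, not Elasticsearch or Lucene code: the function names `build_match_query` and `matches` are made up for illustration, and the document token sets are assumptions, since they depend on the analyzer config. The query tokens are the ones from the explain output above. The idea is that tokens sharing a position are grouped into one synonym clause (any one of them satisfies it), and the "and" operator only applies between positions.

```python
from collections import defaultdict

def build_match_query(tokens, operator="and"):
    """Group (term, position) pairs by position.

    Terms that share a position form one synonym group; the operator
    is only applied between groups, not inside them.
    """
    groups = defaultdict(set)
    for term, position in tokens:
        groups[position].add(term)
    return [groups[p] for p in sorted(groups)], operator

def matches(doc_terms, query):
    """A group is satisfied when the document contains any of its terms."""
    groups, operator = query
    hits = [bool(group & doc_terms) for group in groups]
    return all(hits) if operator == "and" else any(hits)

# Query tokens as reported by _explain: all three sit at the same position.
same_position = [("john", 0), ("john.smith", 0), ("smith", 0)]

# Assumed indexed terms for the two documents (hypothetical analyzer output).
doc1 = {"john.smith@gmail.com", "john", "john.smith", "smith"}
doc2 = {"tom.smith@gmail.com", "tom", "tom.smith", "smith"}

query = build_match_query(same_position, operator="and")
print(matches(doc1, query))  # single synonym group, "john" is enough
print(matches(doc2, query))  # "smith" alone satisfies the same group

# For contrast: the same terms at distinct positions make "and" strict.
strict = build_match_query([("john", 0), ("smith", 1)], operator="and")
print(matches(doc2, strict))
```

In this model the "and" query has only one clause to satisfy, so a document containing just `smith` matches. If the query-time tokens ended up at separate positions instead (for example through a different search-time analyzer, which I have not verified here), "and" would require every term.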