again sorry if this is newb. I am testing ES in an ecommerce scenario. The "AND" operator is naturally providing much more relevant results in most situations. BUT...
I'm going down through recent search terms used on the site and testing them in ES. The ones with several words are often not matching anything with the "AND" operator (especially when there's a grievous misspelling), even though there are one or more words that could match documents (if "OR" was used).
I am wondering: is it within the realm of possibility to trigger a second query on the terms using the "OR" operator if (and only if) a) the first "AND" query returned 0 hits and b) there are 2+ words entered? I am thinking the secondary query could have its own heading on the results page, like "we couldn't find exact matches for your search, but here are some partial matches".
I mean, I know it's technically possible, I am just wondering what the best practice is.
This is a useful approach for you to build on top of the Elasticsearch APIs, and I know of at least one search consultancy specializing in ecommerce deployments that regularly does this.
Most users don't understand Boolean logic (AND vs OR etc.), and one benefit of opting for AND by default is that the aggregations/facets that summarise the matches tend to make more sense. In my view there's a fundamental usability concern if you wed very fuzzy matching capabilities with a faceting system that gives precise numbers. Consider a search for "ice age" where OR is the default operator: you might report 3 matches in the movie department and 27 in the electrical department (fridges that have an ice maker). Defaulting to AND helps make these numbers more realistic and increases precision with some loss of recall.
Your idea of only resorting to OR when hits == 0 is one strategy for handling the worst possible recall scenario. You may want to consider invoking that rule at a slightly higher threshold, e.g. hits <= 2.
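To make the fallback rule concrete, here is a minimal sketch in Python. It assumes a single searchable field (hypothetically named "title") and a `run_search` function that you supply to send a request body to your cluster and return the parsed response; both names are placeholders, not part of any Elasticsearch API. The only Elasticsearch feature it relies on is the `operator` option of the `match` query.

```python
from typing import Any, Callable, Dict, Tuple

# run_search is a hypothetical callable: request body in, parsed response out.
SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def match_body(text: str, operator: str, field: str = "title") -> Dict[str, Any]:
    """Build a match query body with an explicit boolean operator."""
    return {"query": {"match": {field: {"query": text, "operator": operator}}}}


def total_hits(response: Dict[str, Any]) -> int:
    """Read hits.total, which is a plain int in older versions and an object in newer ones."""
    total = response["hits"]["total"]
    return total["value"] if isinstance(total, dict) else total


def search_with_fallback(
    text: str,
    run_search: SearchFn,
    max_hits_for_fallback: int = 0,
    field: str = "title",
) -> Tuple[str, Dict[str, Any]]:
    """Try AND first; retry with OR only for multi-word queries with too few hits."""
    strict = run_search(match_body(text, "and", field))
    multi_word = len(text.split()) >= 2
    if total_hits(strict) > max_hits_for_fallback or not multi_word:
        return "exact", strict
    relaxed = run_search(match_body(text, "or", field))
    return "partial", relaxed
```

The returned label ("exact" or "partial") is what the results page could use to decide whether to show the "here are some partial matches" heading. Setting `max_hits_for_fallback=2` gives the hits <= 2 variant.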
We can be more scientific about this though. One intriguing idea is to perform some analysis of popular queries in advance to discover if the ORed version shows signs of ambiguity like my ice age fridge and movie example. If you compare the tags/departments/click patterns of the AND form with the OR form it can reveal that the OR form clearly has different interpretations. Consider this example using the Elastic Graph API on StackOverflow data to show the difference in related article tags between "table AND width" and "table OR width": https://twitter.com/elasticmark/status/720934856750997504
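As a rough illustration of that comparison (not the Graph API approach from the tweet), one simple heuristic is to run a terms aggregation on a department field for both forms of the query and measure how much of the OR form's matches fall in departments the AND form never reached. The sketch below assumes you already have the two lists of aggregation buckets; the function names and the sample counts for the AND form are made up for the example.

```python
from typing import Any, Dict, List

Buckets = List[Dict[str, Any]]  # terms-aggregation buckets: {"key": ..., "doc_count": ...}


def bucket_shares(buckets: Buckets) -> Dict[str, float]:
    """Turn terms-aggregation buckets into {key: share of matching documents}."""
    total = sum(b["doc_count"] for b in buckets)
    if total == 0:
        return {}
    return {b["key"]: b["doc_count"] / total for b in buckets}


def or_form_drift(and_buckets: Buckets, or_buckets: Buckets) -> float:
    """Share of OR-form matches in departments absent from the AND-form results."""
    and_keys = set(bucket_shares(and_buckets))
    or_shares = bucket_shares(or_buckets)
    return sum(share for key, share in or_shares.items() if key not in and_keys)


# Hypothetical buckets loosely based on the "ice age" example above.
and_form = [{"key": "movies", "doc_count": 3}]
or_form = [
    {"key": "movies", "doc_count": 3},
    {"key": "electrical", "doc_count": 27},
]
drift = or_form_drift(and_form, or_form)
```

A drift close to 1 suggests the OR form is mostly pulling in a different interpretation of the query; a drift close to 0 suggests OR is just broadening the same one. Where to draw the line is something you would have to tune against your own data.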
Thanks for the insight, Mark. I'm gonna do a bunch more testing.