I am using the standard stopwords file for my index. But I would like to NOT remove stopwords when they are within double quotes, since that is exactly what the user is searching for. For example, if someone searches "To Be Or Not To Be", that literally is all stopwords. Is there any way to tell elasticsearch to consider those words and not toss them out when searching?
Hey @ryans :
To search for the exact phrase, including the stop words, you can't remove the stop words on either the indexing or the searching side. If you need to support that kind of query, you should not use the stop token filter.
Stop words will not affect the score that much, as they are present in nearly all documents with a high frequency. And queries like match can efficiently skip common terms dynamically, without further configuration.
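To make the suggestion above concrete, here is a minimal sketch of an index definition whose analyzer lowercases tokens but has no stop filter, plus a match_phrase query for the quoted phrase. The index name `quotes`, the field name `text`, and the analyzer name `keep_stopwords` are made up for illustration. The code only builds the JSON bodies; sending them to a cluster is left out.

```python
import json

# Hypothetical names, used only for illustration.
INDEX_NAME = "quotes"
FIELD = "text"


def build_index_body():
    """Index body with a custom analyzer that keeps stop words."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    "keep_stopwords": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Only lowercase; no "stop" filter, so "to", "be",
                        # "or", "not" are indexed like any other term.
                        "filter": ["lowercase"],
                    }
                }
            }
        },
        "mappings": {
            "properties": {
                FIELD: {"type": "text", "analyzer": "keep_stopwords"}
            }
        },
    }


def build_phrase_query(phrase):
    """Exact-phrase query; the field's analyzer is applied at search time too."""
    return {"query": {"match_phrase": {FIELD: phrase}}}


if __name__ == "__main__":
    print(f"PUT /{INDEX_NAME}")
    print(json.dumps(build_index_body(), indent=2))
    print(f"GET /{INDEX_NAME}/_search")
    print(json.dumps(build_phrase_query("To Be Or Not To Be"), indent=2))
```

Because the same analyzer runs at index and search time, the phrase query sees every word of the quote, which is what an all-stop-word phrase needs in order to match.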
Thanks for the feedback. I see the following from your link:
This option can be omitted as the Match query can skip blocks of documents efficiently, without any configuration, provided that the total number of hits is not tracked.
We do use track_total_hits for some of our queries. So does that mean we get no automatic benefit of this when using multi_match?
So does that mean we get no automatic benefit of this when using multi_match?
With track_total_hits, all hits need to be accounted for, independently of the scoring. So it will always have a performance impact, and in this case all matches that contain stopwords will need to be accounted for. That takes time, and match won't be able to use optimizations to end the search earlier.
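As a rough illustration of the trade-off described here, the sketch below builds a multi_match search body with track_total_hits set explicitly. The helper name `build_search_body` and the field names are assumptions for the example. track_total_hits accepts true, false, or an integer threshold, so queries that do not need an exact count can turn it off or cap it.

```python
import json


def build_search_body(text, fields, track_total_hits=False):
    """Build a multi_match search body.

    track_total_hits may be True (count every hit), False (no total),
    or an integer (count accurately only up to that many hits).
    """
    return {
        "track_total_hits": track_total_hits,
        "query": {"multi_match": {"query": text, "fields": list(fields)}},
    }


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    fields = ["title", "body"]

    # No total requested: the query is free to skip non-competitive documents.
    print(json.dumps(build_search_body("to be or not to be", fields), indent=2))

    # Exact total requested: every matching document has to be counted.
    print(json.dumps(build_search_body("to be or not to be", fields, True), indent=2))
```

An integer threshold is a middle ground: the total is exact only up to that number, so the search can still stop counting once the threshold is reached.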