Default match_all behavior for match query with no tokens after analysis

John_Daniels · November 21, 2012, 4:37pm

Hi all!

Currently I am working on an application which allows users to search on
multiple different fields using our own query language. This sometimes
requires us to combine a search for analyzed text with another query using
an "and" operator. For example, someone could search for:
some text AND color:green

We normally do this by combining a match query with another query for the
color using a boolean query. This is fine unless "some text" consists only
of stop words. For example, if someone
searches for:
is AND color:green
then we will get no results. While we can't do anything useful with a term
that only contains stop words, we would rather turn it into a match_all
than a match_none. Currently, a match query with only stop words
yields a Lucene BooleanQuery with no terms, which will never match any
documents. Ideally, we would like a query that yields a Lucene
MatchAllDocs query when it receives no tokens from the analyzer.
This can for example be achieved by running a field query with the
following query text:
+({possible stopword text here}) *
Unfortunately, this seems like somewhat of a hack and I'd rather not
construct Lucene query strings that are just going to be immediately parsed
if I can avoid it. I was wondering if there is some better way to get
similar semantics by directly using Elasticsearch queries. It's not a super
big deal but it would be nice to be able to implement behavior similar to
what Lucene query strings already do, but with different semantics for the
application.

So does anyone have any recommendations for making such a query without
using a Lucene query string? Or is that the best way to do this?

Thanks!

--

Chris_Male · November 21, 2012, 10:09pm

Hi John,

This problem comes up quite a lot, but unfortunately there aren't many
options at the moment. MatchQuery (the Query
produced by the 'match' query type) currently has hardcoded behaviour for
what to do when analysis strips all the terms from the input.

However, I have opened issue #2429 on elastic/elasticsearch ("Cannot change
MatchQuery behaviour with 0 terms") to change this so you can provide a flag
specifying what to do in this situation.
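
For readers arriving later: the flag proposed in that issue is exposed on the match query as `zero_terms_query`, which accepts `none` (the default) or `all`. Below is a minimal sketch in Python of a request body for the "is AND color:green" example. The text field name `body` is a hypothetical placeholder, and the sketch assumes `color` is indexed as an exact value so that a term query applies.

```python
import json


def text_and_color_query(text, color, text_field="body"):
    """Build a bool query that ANDs an analyzed-text match with a color clause.

    `text_field` ("body") is an assumed field name, not taken from the thread.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "match": {
                            text_field: {
                                "query": text,
                                # If analysis removes every token (for example,
                                # the text is only stop words), act like
                                # match_all instead of matching nothing.
                                "zero_terms_query": "all",
                            }
                        }
                    },
                    # Assumes `color` holds exact, non-analyzed values.
                    {"term": {"color": color}},
                ]
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(text_and_color_query("is", "green"), indent=2))
```

With this body, a search for "is" should be constrained only by the color clause, since the match clause falls back to matching all documents when no tokens survive analysis.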

--

sulemanmubarik · February 15, 2013, 8:57pm

Hi,
I have a question: why was ZeroTermsQuery added only to MatchQuery, and not to QueryStringQuery or MultiMatchQuery?
I am using MultiMatchQuery and I have the same problem. I can't change it to MatchQuery because I have to search multiple fields. Or is there a way to search multiple fields with MatchQuery?
Thanks
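
One possible approach to the multiple-fields question, offered as a sketch rather than a confirmed answer from the thread: wrap one match query per field in a bool query's `should` clauses, each carrying `zero_terms_query`. The field names `title` and `description` are hypothetical, and scoring will not be identical to what multi_match produces.

```python
def multi_field_match(text, fields):
    """Build a bool query with one match clause per field.

    Each clause falls back to matching all documents when analysis
    leaves no tokens, so a stop-word-only input does not empty the results.
    """
    should = [
        {"match": {field: {"query": text, "zero_terms_query": "all"}}}
        for field in fields
    ]
    # Require at least one field to match, mirroring "any of these fields".
    return {"bool": {"should": should, "minimum_should_match": 1}}


# Hypothetical field names, used only for illustration.
query = {"query": multi_field_match("is", ["title", "description"])}
```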
