Inconsistency with query_string results


#1

I encountered some strange behavior of the query_string query:

I have an index with one type "objects" that contains one document with a default-analyzed string field called "content" that is empty. If I search with

GET /dbc_xyz/objects/_search
{
  "version": true,
  "query": {
    "query_string": {
      "default_field": "content",
      "default_operator": "AND",
      "query": "Term"
    }
  }
}

Elasticsearch finds nothing, as expected. If I change the query string to "-Term", it finds the empty document, also as expected. However, when I tried to find out whether "-" takes precedence over OR, as is usually the case, I found that "-Term OR Termtwo" returned nothing. If I search with "(-Term) OR Termtwo", the empty document is found again. So I asked myself whether, quite unusually, "OR" takes precedence over "-". But when I tried "-(Term OR Termtwo)", the empty document was returned as well. Only the first variant, "-Term OR Termtwo", returned nothing.

I cannot make sense of this. Is this possibly a bug? I use Elasticsearch 1.5.2.

Best
Heiko


(Ivan Brusic) #2

If a clause only contains a negated term, then Elasticsearch will
implicitly create a match all query and the negated term (content:*
content:-Term) since Lucene does not handle purely negative clauses. The
match all trick has always existed in Lucene, Elasticsearch just does it in
the background.

I suspect the issue is with what Elasticsearch considers a purely negative
clause in its parser. Never checked. I would suggest always adding the
match all (*:* or fieldname:*) when using only negation explicitly to avoid
any ambiguity.
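
To make that suspicion concrete, here is a toy model in plain Python. It does
not call Elasticsearch or Lucene. The class names and fix_pure_negative() are
made up for illustration, and the clause structure assigned to each query
string is my assumption about what the parser produces, not something I have
verified against 1.5.2. The model only encodes two rules: a boolean query with
no required clause needs at least one optional clause to match, and a group
that contains nothing but negated clauses gets an implicit match all.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    text: str

    def matches(self, doc_terms):
        return self.text in doc_terms


@dataclass
class MatchAll:
    def matches(self, doc_terms):
        return True


@dataclass
class Bool:
    must: list = field(default_factory=list)
    should: list = field(default_factory=list)
    must_not: list = field(default_factory=list)

    def matches(self, doc_terms):
        # Any matching negated clause excludes the document.
        if any(q.matches(doc_terms) for q in self.must_not):
            return False
        # Every required clause has to match.
        if not all(q.matches(doc_terms) for q in self.must):
            return False
        if self.must:
            return True
        # No required clause: at least one optional clause has to match.
        # A purely negative query therefore matches nothing on its own.
        return any(q.matches(doc_terms) for q in self.should)


def fix_pure_negative(query):
    """Hypothetical stand-in for the implicit match all on negative-only groups."""
    if isinstance(query, Bool) and query.must_not and not query.must and not query.should:
        return Bool(must=[MatchAll()], must_not=query.must_not)
    return query


def run_examples():
    term, termtwo = Term("term"), Term("termtwo")
    empty_doc = set()  # the single document whose "content" field is empty

    # Assumed clause structure for each query string.
    queries = {
        "-Term": fix_pure_negative(Bool(must_not=[term])),
        "-Term OR Termtwo": fix_pure_negative(Bool(should=[termtwo], must_not=[term])),
        "(-Term) OR Termtwo": fix_pure_negative(
            Bool(should=[fix_pure_negative(Bool(must_not=[term])), termtwo])
        ),
        "-(Term OR Termtwo)": fix_pure_negative(
            Bool(must_not=[Bool(should=[term, termtwo])])
        ),
    }
    return {text: q.matches(empty_doc) for text, q in queries.items()}


for text, hit in run_examples().items():
    print(f"{text!r}: {'1 hit' if hit else 'no hits'}")
```

Under these assumptions only "-Term OR Termtwo" comes back empty. It is the one
variant where the negated term shares a group with a positive optional clause,
so the group is not purely negative, no match all is added, and the empty
document cannot satisfy the remaining optional clause.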

Cheers,

Ivan


#3

Hi, Ivan,

thanks for your reply. However, this is not the situation I'm concerned about, as "-Term" gives me exactly the result I expect. And the workaround you cited is logically equivalent, so it would not explain the differences in results. What confuses me is the case where the clause does not contain only a negated term. For example, if "-Term" yields one document (of one in total), I would expect to get the same result no matter what I "OR" with "-Term". But "-Term OR Termtwo" yields an empty result, which doesn't make sense. So I wondered which of the two operators ("-", i.e. negation, and "OR") takes precedence over the other, although I couldn't immediately see how this could make a difference. To my surprise, both possibilities, "(-Term) OR Termtwo" and (the most unusual) "-(Term OR Termtwo)", yielded the same single document. To me this looks like a bug, since the result is logically wrong.

Cheers
Heiko

