Weird behavior of 'query_string' query. Possible false positive result


(Lakomkin Egor) #1

I need to do a phrase search with the following requirements:

  1. the order of the words does not matter
  2. I need to specify the possible gap between matched words
  3. I need to consider matches with stopwords inside the phrase

I have found that 'query_string' can deal with all of those.

My question: given the document "three four world that awesome those hello five", I expect a search for "hello world" not to match when I set the slop (the allowed gap between matched words) to 0. Yet the document is returned.

I have attached a script that reproduces this false positive. Script is here
Maybe I am doing something wrong?


(Zachary Tong) #2

The query_string syntax for phrases is to wrap the phrase in quotes. Your query is just looking for the terms individually (no phrase), which is why it's matching.

Try this instead:

"query": "\"hello world\"",

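For reference, here is a minimal sketch of what the full request body might look like once the phrase is quoted. It is written in Python only to show how the inner quotes end up escaped in the JSON; the field name `body` is an assumption (the thread does not say which field is searched), and `phrase_slop` is set explicitly to 0 to mirror the original question.

```python
import json

# Hypothetical field name; the thread does not say which field is searched.
FIELD = "body"


def phrase_query_string(phrase, slop=0):
    """Build a query_string request body that treats `phrase` as a phrase."""
    # Escape backslashes and double quotes so the phrase cannot end early.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return {
        "query": {
            "query_string": {
                "default_field": FIELD,
                # The surrounding double quotes are what make this a phrase.
                "query": '"' + escaped + '"',
                # Maximum number of positions allowed between phrase terms.
                "phrase_slop": slop,
            }
        }
    }


body = phrase_query_string("hello world", slop=0)
print(json.dumps(body, indent=2))
```

Without the inner quotes, the same body would search for the terms "hello" and "world" individually, which is why the document in the original post matched.
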
(Doug Turnbull) #3

Also curious why query_string would be what you wanted? query_string is usually good for users that have an expectation of searching with a traditional search syntax (title:dog AND body:cat).

Would a match_phrase query be more specific, carry fewer surprises, and be easier to tune?
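
As a rough sketch of that alternative, the request body below uses match_phrase with its `slop` parameter. The field name `body` and the slop value are assumptions for illustration. As far as I know, swapping two adjacent terms counts as a slop of 2, so a slop of at least 2 is needed if word order should not matter.

```python
import json

# Hypothetical field name; adjust to the field that is actually indexed.
FIELD = "body"


def match_phrase_body(field, phrase, slop=0):
    """Build a match_phrase request body with an explicit slop."""
    return {
        "query": {
            "match_phrase": {
                field: {
                    "query": phrase,
                    # 0 means the terms must be adjacent and in order.
                    "slop": slop,
                }
            }
        }
    }


strict = match_phrase_body(FIELD, "hello world", slop=0)
relaxed = match_phrase_body(FIELD, "hello world", slop=2)
print(json.dumps(relaxed, indent=2))
```

Here the phrase is plain text rather than query syntax, so no quoting or escaping of the input is needed.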


(Zachary Tong) #4

++

I try to avoid query_string as much as possible myself; there are a lot of ways to trip yourself (and your users) up. Using the match family as much as possible tends to work better, as does constructing more complicated queries yourself (e.g. bool combinations) rather than relying on the user to construct them in a single line.
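
To illustrate one such bool combination, here is a sketch (not the only way to do it) in which the application builds the query from the user's raw text: a `match` clause requires all terms to be present, and a `match_phrase` clause in `should` boosts documents where the terms appear close together. The field name `body` and the slop of 2 are assumptions.

```python
import json


def build_search_body(user_text, field="body", slop=2):
    """Combine match and match_phrase clauses in a single bool query."""
    return {
        "query": {
            "bool": {
                # Every term from the user's text must appear in the field.
                "must": [
                    {"match": {field: {"query": user_text, "operator": "and"}}}
                ],
                # Optional clause: documents with the terms near each other
                # score higher but are not required to match it.
                "should": [
                    {"match_phrase": {field: {"query": user_text, "slop": slop}}}
                ],
            }
        }
    }


search_body = build_search_body("hello world")
print(json.dumps(search_body, indent=2))
```

Because the user's text is passed as a plain value, characters that are operators in query_string syntax are not interpreted as query syntax.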

