Default_operator 'AND' with query_string does not work properly

I am having trouble getting default_operator AND to work with query_string on multiple fields (Elasticsearch v7.2).

Here is the indexed document, followed by my query:

POST testeur/_doc/document
{
  "content": "attention dans la revision",
  "main_title" : "fichier dans la voiture"
}

GET testeur/_search
{
  "query": {
    "query_string": {
      "query": "attention voiture",
      "default_operator": "AND",
      "fields": ["content","main_title"]
    }
  }
}

The request above returns no results.

However, the document is retrieved with the request below, which specifies the AND operator directly in the query field:

GET testeur/_search
{
  "query": {
    "query_string": {
      "query": "attention AND voiture",
      "fields": ["content","main_title"]
    }
  }
}

I guess I am doing something wrong with the default_operator field. Unfortunately, I have no clue what is wrong despite reading the query_string documentation.

This part of the query_string documentation about default_operator is especially confusing:

AND
For example, a query string of capital of Hungary is interpreted as capital
AND of AND Hungary.

That is exactly what I intend to do.

Could you provide any help, please?

Interesting.

@jpountz WDYT? The behavior does indeed look inconsistent. I guess that behind the scenes we do something like: content:(attention AND voiture) main_title:(attention AND voiture), right?

Thank you for your reply. As I am still stuck on this problem, I am posting again to ask for more assistance.

I found some information on the internet. Here is the link in case someone encounters the same issue: Link

The way Elasticsearch handles default_operator AND in a multi-field query_string request does not seem consistent with the documentation.

Hello,
sorry about the wait.

I have given this a try and used the validate query API with explain option activated:

GET http://localhost:9200/testeur/_validate/query?q=attention%20voiture&default_operator=AND&explain=true

{
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "valid": true,
  "explanations": [
    {
      "index": "testeur",
      "valid": true,
      "explanation": "(content.keyword:attention voiture | main_title.keyword:attention voiture | (+content:attention +content:voiture) | (+main_title:attention +main_title:voiture))"
    }
  ]
}

GET http://localhost:9200/testeur/_validate/query?q=attention%20AND%20voiture&explain=true

{
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "valid": true,
  "explanations": [
    {
      "index": "testeur",
      "valid": true,
      "explanation": "+(content.keyword:attention | main_title.keyword:attention | content:attention | main_title:attention) +(content.keyword:voiture | main_title.keyword:voiture | content:voiture | main_title:voiture)"
    }
  ]
}

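This explains why the two queries return different results. With default_operator=AND, the query is expanded per field: both terms must match within the same field, and the per-field clauses are then combined as alternatives. With an explicit AND in the query string, the query is expanded per term: each term must match in at least one field, and all of the per-term clauses are required. In your document the two terms sit in different fields, so only the second form matches.

The difference is very subtle, and I don't even know which of the two is the right behaviour. I guess it's debatable and depends on what users expect. Note that an AND or OR operator alone is generally not enough to describe how terms should be combined when searching for multiple terms against multiple fields, especially when using AND. Do you mean that all terms need to appear in all fields? Or that at least one of the terms has to appear in all the fields? Or that all the terms need to appear in at least one of the fields? This is more complicated than plain boolean logic. Unless you need all the power of the query_string query, I would recommend looking at the match query, which offers options to fine-tune this behaviour.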

Cheers
Luca
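
For anyone landing here later, the gap between the two explanation strings above can be reproduced with a small toy model. This is only a sketch in plain Python, not Elasticsearch code: the helper names (tokens, per_field_and, per_term_and) are made up for illustration, the .keyword subfields are left out, and a lowercase-and-split tokenizer is assumed to be close enough to the analyzer for this particular document.

```python
# Toy model of the two "explanation" strings; not Elasticsearch code.
doc = {
    "content": "attention dans la revision",
    "main_title": "fichier dans la voiture",
}
terms = ["attention", "voiture"]


def tokens(text):
    # Assumption: lowercasing and splitting on whitespace approximates
    # the analyzer closely enough for this example document.
    return set(text.lower().split())


def per_field_and(doc, terms):
    # Shape of the default_operator=AND explanation:
    # at least one single field must contain every term.
    return any(all(t in tokens(value) for t in terms) for value in doc.values())


def per_term_and(doc, terms):
    # Shape of the explicit "attention AND voiture" explanation:
    # every term must appear in at least one field.
    return all(any(t in tokens(value) for value in doc.values()) for t in terms)


print(per_field_and(doc, terms))  # False: no single field holds both terms
print(per_term_and(doc, terms))   # True: each term is found in some field
```

If the per-term behaviour is what you want without writing AND in the query string, it may be worth checking whether multi_match with type cross_fields and operator and fits your case; I have not tested it against this index.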
