Hello, I'm upgrading my project from ES 2.3 to ES 6.1, and I'm confused about how QueryString behaves when default_operator and fields are used together.
My goal is to search the fields city.name and zipcode.raw with a single query such as New York 10001. But the same query returns different results in 2.3 and 6.1, so I used the _validate API to debug.
Here is the debug request in ES 2.3:
curl -XGET 'localhost:9200/myindex/_validate/query?explain&pretty' -d'
{
  "query": {
    "bool": {
      "must": [{
        "query_string": {
          "query": "New York 10001",
          "fields": ["city.name", "zipcode.raw"],
          "default_operator": "AND"
        }
      }]
    }
  }
}'
I got the following result (which is what I want, and which matches my understanding of the docs):
"explanation" : "filtered(+(+(city.name:new | zipcode.raw:New) +(city.name:york | zipcode.raw:York) +(city.name:10001 | zipcode.raw:10001)))->cache(org.elasticsearch.index.search.nested.NonNestedDocsFilter@e17eb713)"
Now I run the same query in 6.1:
curl -XGET 'localhost:19200/buwox-index/_validate/query?explain=true&pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [{
        "query_string": {
          "query": "New York 10001",
          "fields": ["city.name", "zipcode.raw"],
          "default_operator": "AND"
        }
      }]
    }
  }
}'
I got the following result:
"explanation" : "+(+((+city.name:new +city.name:york +city.name:10001) | zipcode.raw:New York 10001)) #DocValuesFieldExistsQuery [field=_primary_term]"
After several tests, I found that to get what I want, I have to write the query like this:
curl -XGET 'localhost:19200/buwox-index/_validate/query?explain=true&pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [{
        "query_string": {
          "query": "New AND York AND 10001",
          "fields": ["city.name", "zipcode.raw"],
          "default_operator": "OR"
        }
      }]
    }
  }
}'
With this query I get the same result as in 2.3:
"explanation" : "+(+(+(city.name:new | zipcode.raw:New) +(city.name:york | zipcode.raw:York) +(city.name:10001 | zipcode.raw:10001))) #DocValuesFieldExistsQuery [field=_primary_term]"
So here is my question. According to the QueryString docs, if the query contains no explicit operator, it uses the one defined in default_operator, right? And with multiple fields, the fields are combined with an OR clause.
So why can't I leave out the explicit operators and set default_operator to AND to achieve this? It seems that, when I don't use an explicit operator, default_operator also changes how the multiple fields are combined.
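To make the difference between the two explain outputs concrete, here is a small Python sketch that reproduces both shapes. It is only a model of what I observe, not how Elasticsearch actually builds the query: ANALYZERS, field_clause, per_term and per_field are hypothetical names, and lowercasing plus whitespace splitting is an assumed stand-in for the real analyzer on city.name.

```python
# Toy model of the two query shapes seen in the explain outputs.
# Assumption: city.name is analyzed (lowercased, split on whitespace),
# zipcode.raw is not analyzed (the input is kept as one term).
ANALYZERS = {
    "city.name": lambda text: text.lower().split(),
    "zipcode.raw": lambda text: [text],
}

FIELDS = ["city.name", "zipcode.raw"]


def field_clause(field, text):
    """Build the clause one field produces for a piece of query text."""
    tokens = ANALYZERS[field](text)
    if len(tokens) == 1:
        return f"{field}:{tokens[0]}"
    # Several tokens in one field: all of them are required (AND).
    return "(" + " ".join(f"+{field}:{token}" for token in tokens) + ")"


def per_term(query, fields):
    """2.3-like shape: split the query into terms first, OR each term
    across the fields, then require every term group."""
    groups = []
    for term in query.split():
        alternatives = " | ".join(field_clause(f, term) for f in fields)
        groups.append(f"+({alternatives})")
    return " ".join(groups)


def per_field(query, fields):
    """6.1-like shape: hand the whole query text to each field, then OR
    the per-field clauses."""
    return "(" + " | ".join(field_clause(f, query) for f in fields) + ")"


if __name__ == "__main__":
    print(per_term("New York 10001", FIELDS))
    print(per_field("New York 10001", FIELDS))
```

In the first shape every term may match in either field, so a document with city.name "New York" and zipcode.raw "10001" matches. In the second shape, zipcode.raw would have to contain the whole string New York 10001, which is why I no longer get my results.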
I don't think this is related to my mapping, but just in case, here are my mappings for 2.3 and 6.1.
My mapping in 2.3:
...
"zipcode": {
  "type": "integer",
  "index": "not_analyzed",
  "fields": {
    "raw": {
      "type": "string",
      "index": "not_analyzed"
    }
  }
},
"city": {
  "properties": {
    "name": {
      "type": "string",
      "fields": {
        "raw": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    },
    "slug": {
      "type": "string",
      "index": "not_analyzed",
      "include_in_all": false
    }
  }
}
...
My mapping in 6.1:
...
"analysis": {
  "analyzer": {
    "default": {
      "type": "custom",
      "tokenizer": "standard",
      "filter": [ "asciifolding", "lowercase", "geowords" ]
    }
  },
  "filter": {
    "geowords": {
      "type" : "word_delimiter_graph",
      "split_on_case_change" : false,
      "preserve_original" : false
    }
  }
}
...
"zipcode": {
  "type": "integer",
  "index": true,
  "fields" : { "raw" : { "type" : "keyword", "index" : true } }
},
"city": {
  "properties" : {
    "name" : { "type": "text", "fields" : { "raw" : { "type" : "keyword", "index" : true } } },
    "slug" : { "type": "keyword", "index": true }
  }
}
...
Can someone help? Thanks.