Issue with explain on numeric fields


(Morus Walter) #1

Hi,

We use explain to understand the scoring of complex queries.

When trying to upgrade from 1.5.2 to 1.7.2 I noticed a regression regarding the explanation of hits on fields of type 'short' (I suspect other numeric fields might have the same issue).
I checked 1.6.0 and it has the same issue.

Where I used to have entries like
...
"description"=>"ConstantScore(field:[13 TO 13]), product of:"
...
I now get
"description"=>"ConstantScore(field:`\b\u0000\u0000\u0000\r), product of:"

(I am not sure whether both samples refer to the same field value.)
This makes it quite hard to see which value was matched.

The field is of type 'short', and the query part creating this hit is something like
{
  "terms": {
    "field": [
      "14",
      "13"
    ],
    "boost": 10,
    "minimum_should_match": 1,
    "disable_coord": true
  }
},

Is there anything I should/could do differently to avoid the issue or is this just a bug?

I am not sure whether this is on the ES side or in Lucene, but ES 1.5.2, 1.6.0 and 1.7.2 all seem to be based on the same Lucene version (4.10.4).

best
Morus


(Mark Harwood) #2

This was the source of the change in behaviour: https://github.com/elastic/elasticsearch/issues/10646

This was changed to improve query execution speed (using a single TermQuery instead of an unnecessary NumericRangeQuery) and a side-effect of this is that the "explain" syntax is less readable.
Explain is a low-level breakdown of query execution at the Lucene index level, so it does not always reflect the syntactic sugar that higher-level APIs provide, e.g. Elasticsearch date range expressions such as "now -1d". It describes the rewritten form of the query that actually executes against the Lucene index. This may include byte representations that are stored in the index but bear little resemblance to the original search criteria.


(Morus Walter) #3

Hi Mark,

I see.

Thanks for your explanation. Of course efficient query execution is more relevant than nice looking explains.

For my limited use cases (0..16000) it turned out to be not too difficult to decode the bytes into readable numbers, so I'll do that.
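
For anyone wanting to do the same, here is a minimal Python sketch of such a decoder. It assumes the term is a Lucene 4.x prefix-coded int (shorts appear to be indexed as ints): one leading byte of 0x60 plus the precision shift, followed by the sign-flipped value in 7-bit groups, most significant first. The function name decode_prefix_coded_int is hypothetical and the layout is my reading of Lucene's NumericUtils, so treat it as an illustration rather than a reference implementation.

```python
# Assumed from Lucene 4.x NumericUtils: first byte of an int term is 0x60 + shift.
SHIFT_START_INT = 0x60


def decode_prefix_coded_int(term):
    """Decode a prefix-coded int term (as shown by explain) into a signed 32-bit int."""
    codes = [ord(ch) for ch in term]
    if not codes or any(c > 0x7F for c in codes):
        raise ValueError("expected a non-empty string of 7-bit characters")

    # The first character carries the precision shift (0 for full-precision terms).
    shift = codes[0] - SHIFT_START_INT
    if not 0 <= shift <= 31:
        raise ValueError("not a prefix-coded int term")

    # Remaining characters hold 7 bits each, most significant group first.
    sortable = 0
    for c in codes[1:]:
        sortable = (sortable << 7) | c
    sortable = (sortable << shift) & 0xFFFFFFFF

    # Undo the sign-bit flip that makes terms sort correctly, then
    # reinterpret the 32-bit pattern as a signed value.
    value = sortable ^ 0x80000000
    if value >= 0x80000000:
        value -= 1 << 32
    return value


# The term from the explain output in the first post:
print(decode_prefix_coded_int("`\b\u0000\u0000\u0000\r"))  # 13
```

Under these assumptions the term quoted in the first post decodes to 13, which is consistent with the old [13 TO 13] output.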

merci
Morus

