health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open dblog-2017.01.19 Lis_GE3nTVutbUZMJYXDZw 10 1 1976053299 4 5.3tb 2.6tb
Problem Description:
When searching a keyword field traceId with a term query, for example:
> POST /dblog-2017.01.19/_search
> {
>   "query": {
>     "term": {
>       "traceId": {
>         "value": "6226230315557965000"
>       }
>     }
>   },
>   "_source": "traceId"
> }
It returned no hits.
With a prefix query (or a wildcard query), the doc can be matched when the queried value is 15 characters or fewer, like below:
If I index only a few docs to a test index, the same TermQuery matches the doc correctly.
I am not sure if this is a bug or a constraint of Elasticsearch/Lucene. Since the traceId field has very high cardinality and the index holds almost 2 billion docs, these seemed like contributing factors.
ES/Lucene should support very high cardinality fields just fine, so this sounds like a possible bug.
Are many of your id values affected, or just a small subset?
Can you figure out which shard this document was routed to and zip up that entire Lucene index and post somewhere? I can pull it down and try to dig into it.
High cardinality and doc count are red herrings. After probing a bit more today, it looks like the JSON parser used by Elasticsearch rounds a JSON long value when converting it to a string.
The screenshot below illustrates the problem. The traceId in _source is already rounded, which is why the attempt to filter by traceId from _source always failed. However, the terms aggregation still shows the correct key.
I tested ES 5.1.2 and the problem remains. Please be aware that traceId in the source JSON is not a string but a number, i.e. { "traceId": 1026314602185330712 }. It looks like precision is lost when this long number is parsed.
I just used curl for the same request and can confirm the result is good.
The previous tests I did were in the Dev console (Sense) within Kibana. So this now looks to be a problem with Kibana (or the JS library it uses for converting JSON long numbers).
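The rounding is easy to reproduce outside Kibana. A minimal sketch, using Python only as a stand-in: a Python float is the same IEEE-754 double that JavaScript uses for every number, so coercing the long through a float shows exactly what a JS-based console would store:

```python
import json

# JSON itself has no integer-size limit, but a JavaScript client keeps
# every number as an IEEE-754 double, so integers above 2**53 - 1 get
# silently rounded. A Python float is the same 64-bit double.
raw = '{"traceId": 1026314602185330712}'   # the example id from this thread

exact = json.loads(raw)["traceId"]         # Python ints are arbitrary-precision
rounded = int(float(exact))                # what a double-based parser keeps

print(exact)                               # 1026314602185330712
print(rounded)                             # nearest representable double
print(exact == rounded)                    # False
```

Any id whose last few digits fall between representable doubles is affected, which is why some documents matched and others did not.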
Ahh, I see ... so something in Kibana's dev console is maybe truncating the long value. I assume a workaround here is for you to make this long value a string instead?
Can you open a Kibana issue, linking to this discussion? Thanks.
Yes, I've already made traceId a string to work around the problem. But I think Kibana needs to fix this, as it affects not only the Dev console but all other functionality where a long value is used for filtering and graphing. I'll raise the issue with the Kibana team via GitHub.
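For anyone hitting the same issue, here is a sketch of the string workaround (field name taken from this thread; everything else is illustrative). A JSON string round-trips byte-for-byte through any client, regardless of how that client represents numbers:

```python
import json

# Workaround sketch: index and query the id as a JSON string, so no
# double-based JSON parser can round it anywhere along the way.
doc_as_string = '{"traceId": "1026314602185330712"}'   # safe everywhere

trace_id = json.loads(doc_as_string)["traceId"]
assert trace_id == "1026314602185330712"               # exact, no rounding possible

# The keyword field can then be matched with the same term query as before:
query = {"query": {"term": {"traceId": {"value": trace_id}}}}
print(json.dumps(query))
```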
Thanks very much for helping me find the root cause!