Discover Tab filters data incorrectly


(Pavithra K C) #1

Hello all,

I am trying to filter data by TRADENUMBER in the Discover tab, but the results are incorrect: documents with other TRADENUMBER values are returned too.

I ran the same query directly against Elasticsearch and got the same incorrect results.

TRADENUMBER - number
Version - docker.elastic.co/kibana/kibana-oss:6.2.2

I have the same setup on a different version of Elasticsearch and Kibana, also running on Docker.
Version: docker.elastic.co/kibana/kibana:5.6.2

On that instance, the same query returns the correct result (1 hit).

Please see the screenshot below.

It looks very strange to me.

The other difference between the two instances is:

The newer 6.2.2 instance uses time-based indices, while the 5.6.2 instance uses a regular index.

Please advise.


(Lee Drengenberg) #2

It looks like your filter is a string like "615,160,670", which would be analyzed and split by the commas into 3 terms. So your results are any TRADENUMBER which contains "615" or "160" or "670".

Is your TRADENUMBER a number type (when you look at the index pattern in the Management tab)?


(Lee Drengenberg) #3

I could be wrong there. When I add a filter by clicking on a percentage value, I get this, which appears to be searching for a string:

But if I edit the filter, I get the decimal number:

And one step further, under Edit Query DSL, I see this:

Can you check your filter and see if it's really doing the numeric query?


(Pavithra K C) #4

Hello LeeDr,

Please find the screenshot attached.


(Lee Drengenberg) #5

The index in Elasticsearch doesn't know whether it's time-based or not. It's just data. It's Kibana that lets you decide whether to specify a time field when you create the index pattern. If you select a time field for the index pattern, then Discover can show the Date Histogram.

On your 6.2.2 instance, can you disable that filter and use the query bar to query TRADENUMBER:615160670 and tell us if that gives the correct results?


(Pavithra K C) #6

I removed the time filter and selected this month's index. It still does not return the correct results. Please see the screenshot below.


(Matt Bargar) #7

I suspect TRADENUMBER may be mapped incorrectly in some indices. Could you share the output of these requests:

GET _field_caps?fields=TRADENUMBER

GET twofour_volume-*/_mapping/_doc/field/TRADENUMBER


(Pavithra K C) #8

Hello Matt,

Here is the output of both requests:

  1. GET _field_caps?fields=TRADENUMBER
    {
    "fields": {
    "TRADENUMBER": {
    "float": {
    "type": "float",
    "searchable": true,
    "aggregatable": true,
    "indices": [
    "twofour_markup-2014.04",
    "twofour_markup-2014.05",
    "twofour_markup-2014.06",
    "twofour_markup-2014.07",
    "twofour_markup-2014.08",
    "twofour_markup-2014.09",
    "twofour_markup-2014.10",
    "twofour_markup-2014.11",
    "twofour_markup-2014.12",
    "twofour_markup-2015.01",
    "twofour_markup-2015.02",
    "twofour_markup-2015.03",
    "twofour_markup-2015.04",
    "twofour_markup-2015.05",
    "twofour_markup-2015.06",
    "twofour_markup-2015.07",
    "twofour_markup-2015.08",
    "twofour_markup-2015.09",
    "twofour_markup-2015.10",
    "twofour_markup-2015.11",
    "twofour_markup-2015.12",
    "twofour_markup-2016.01",
    "twofour_markup-2016.02",
    "twofour_markup-2016.03",
    "twofour_markup-2016.04",
    "twofour_markup-2016.05",
    "twofour_markup-2016.06",
    "twofour_markup-2016.07",
    "twofour_markup-2016.08",
    "twofour_markup-2016.09",
    "twofour_markup-2016.10",
    "twofour_markup-2016.11",
    "twofour_markup-2016.12",
    "twofour_markup-2017.01",
    "twofour_markup-2017.02",
    "twofour_markup-2017.03",
    "twofour_markup-2017.04",
    "twofour_markup-2017.05",
    "twofour_markup-2017.06",
    "twofour_markup-2017.07",
    "twofour_markup-2017.08",
    "twofour_markup-2017.09",
    "twofour_markup-2017.10",
    "twofour_markup-2017.11",
    "twofour_markup-2017.12",
    "twofour_markup-2018.01",
    "twofour_markup-2018.02",
    "twofour_markup-2018.03",
    "twofour_markup-2018.04",
    "twofour_markup-2018.05",
    "twofour_volume-2014.04",
    "twofour_volume-2014.05",
    "twofour_volume-2014.06",
    "twofour_volume-2014.07",
    "twofour_volume-2014.08",
    "twofour_volume-2014.09",
    "twofour_volume-2014.10",
    "twofour_volume-2014.11",
    "twofour_volume-2014.12",
    "twofour_volume-2015.01",
    "twofour_volume-2015.02",
    "twofour_volume-2015.03",
    "twofour_volume-2015.04",
    "twofour_volume-2015.05",
    "twofour_volume-2015.06",
    "twofour_volume-2015.07",
    "twofour_volume-2015.08",
    "twofour_volume-2015.09",
    "twofour_volume-2015.10",
    "twofour_volume-2015.11",
    "twofour_volume-2015.12",
    "twofour_volume-2016.01",
    "twofour_volume-2016.02",
    "twofour_volume-2016.03",
    "twofour_volume-2016.04",
    "twofour_volume-2016.05",
    "twofour_volume-2016.06",
    "twofour_volume-2016.07",
    "twofour_volume-2016.08",
    "twofour_volume-2016.09",
    "twofour_volume-2016.10",
    "twofour_volume-2016.11",
    "twofour_volume-2016.12",
    "twofour_volume-2017.01",
    "twofour_volume-2017.02",
    "twofour_volume-2017.03",
    "twofour_volume-2017.04",
    "twofour_volume-2017.05",
    "twofour_volume-2017.06",
    "twofour_volume-2017.07",
    "twofour_volume-2017.08",
    "twofour_volume-2017.09",
    "twofour_volume-2017.10",
    "twofour_volume-2017.11",
    "twofour_volume-2017.12",
    "twofour_volume-2018.01",
    "twofour_volume-2018.02",
    "twofour_volume-2018.03",
    "twofour_volume-2018.04",
    "twofour_volume-2018.05"
    ]
    },
    "long": {
    "type": "long",
    "searchable": true,
    "aggregatable": true,
    "indices": [
    "twofour_trades-2017.08",
    "twofour_trades-2017.09",
    "twofour_trades-2018.01",
    "twofour_trades-2018.04",
    "twofour_trades-2018.05"
    ]
    }
    }
    }
    }

  2. GET twofour_volume-*/_mapping/_doc/field/TRADENUMBER

{}

This returned an empty object, so I ran the following query instead:
GET twofour_volume-*/_mapping/field/TRADENUMBER

https://gist.github.com/pavithrachandrakasu/306330b8f6377cacb5841ab8d5720a3a
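
For readers who want to spot this kind of type split quickly, here is a minimal sketch that lists which type a field has in each index, given a _field_caps response. The `field_caps` dictionary is a trimmed, hypothetical stand-in for the output above (one index per type), and `types_by_index` is an illustrative helper, not part of any Elasticsearch client.

```python
# Trimmed, hypothetical stand-in for the _field_caps response shown above.
field_caps = {
    "fields": {
        "TRADENUMBER": {
            "float": {"type": "float", "indices": ["twofour_volume-2018.05"]},
            "long": {"type": "long", "indices": ["twofour_trades-2018.05"]},
        }
    }
}


def types_by_index(response, field):
    """Map each index name to the type the field has in that index."""
    result = {}
    for type_name, caps in response["fields"].get(field, {}).items():
        # "indices" may be missing or null when all indices share one type.
        for index in caps.get("indices") or []:
            result[index] = type_name
    return result


mapping = types_by_index(field_caps, "TRADENUMBER")
for index, type_name in sorted(mapping.items()):
    print(f"{index}: {type_name}")
```

With the real response, every twofour_markup-* and twofour_volume-* index would be reported as float and every twofour_trades-* index as long.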

(Matt Bargar) #9

Hmmm, I don't see anything out of the ordinary there. Could you try the Explain API on one of the incorrectly matching docs with the query from the filter?


(Pavithra K C) #10

I cannot find an option like _doc. Please see the screenshot below:


(Pavithra K C) #11

I tried the following to work out how to use the explain parameter:

I took the "_type" and "_id" from the query above and used them in the query below.

I am not sure whether this is right, but I am sending it anyway. The details part is completely empty.


(Matt Bargar) #12

The IDs of the docs in those two screenshots are different. Could you post the search hit for the doc that you're doing the _explain on in the second screenshot? Also, in your search request, please request the doc_values for the TRADENUMBER field. You can do that like this:

GET /<index-name>/_search
{
  "query": {
    "term": {
      "TRADENUMBER": 615160640
    }
  },
  "docvalue_fields": ["TRADENUMBER"]
}
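
Once the response comes back, the value in _source can be compared with the doc value, which is returned under "fields" as an array. A rough sketch of that comparison follows; the `hit` dictionary is a hypothetical example with illustrative values, and `source_vs_docvalue` is a made-up helper name.

```python
# Hypothetical search hit shaped like a response to the request above.
# The ID and values are illustrative, not copied from the thread.
hit = {
    "_id": "example-id",
    "_source": {"TRADENUMBER": 615160670},
    "fields": {"TRADENUMBER": [615160640.0]},
}


def source_vs_docvalue(hit, field):
    """Return (source value, first doc value) for a single-valued field."""
    return hit["_source"][field], hit["fields"][field][0]


src, doc = source_vs_docvalue(hit, "TRADENUMBER")
mismatch = float(src) != float(doc)
print(src, doc, "MISMATCH" if mismatch else "ok")
```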

(Pavithra K C) #13

Sorry, I used the wrong ID. Please refer to the image below:

I think the results are the same.

Here are the results of the other query:


(Matt Bargar) #14

Hi @pavithrakc, I checked with the ES team and it seems I missed the obvious. As you can see in your last screenshot, the value in _source and the doc_values (same as the indexed value you're searching on) are different. Since TRADENUMBER is mapped as a float in this index, this is likely a floating-point rounding issue. A float can represent every whole number exactly only up to 16,777,216 (2^24), and your value is larger than that.
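
The rounding can be reproduced outside Elasticsearch. This is a small sketch that assumes the float type behaves like an IEEE 754 single-precision value; `to_float32` is an illustrative helper that round-trips a number through 4 bytes using Python's standard `struct` module.

```python
import struct


def to_float32(value):
    """Round-trip a number through IEEE 754 single precision (4 bytes)."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


print(to_float32(2**24))      # 16777216.0, still exact
print(to_float32(2**24 + 1))  # 16777216.0, the first integer that is lost
print(to_float32(615160670))  # 615160640.0, the value from this thread
```

At this magnitude adjacent float values are 64 apart, so a trade number can only be stored as the nearest multiple of 64.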

You can fix this by mapping the field as a different type. You have a few options with different tradeoffs.

If you don't actually need these to be floating point numbers, you could map them as keyword or long. Keywords will be faster if you tend to do single point lookups on this field; longs will be faster if you tend to do more range queries.

If you do need these to be floating point numbers, you could change it to a double, which supports larger numbers, or a scaled float, which gives you control over the level of precision needed.

You can read more about the numeric datatypes available in ES in the Elasticsearch documentation.
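
As a rough illustration of those options, the sketch below builds the "properties" fragment for each candidate type. The `OPTIONS` table and `properties_for` helper are hypothetical names, and the `scaling_factor` of 100 is an assumed value chosen only for the example. Note that the type of an existing field cannot be changed in place, so the new mapping would have to go into new indices (for example through an index template) followed by a reindex.

```python
import json

# Candidate field definitions for TRADENUMBER, one per option discussed above.
OPTIONS = {
    "keyword": {"type": "keyword"},
    "long": {"type": "long"},
    "double": {"type": "double"},
    # scaled_float requires a scaling_factor; 100 is an assumption (2 decimals).
    "scaled_float": {"type": "scaled_float", "scaling_factor": 100},
}


def properties_for(option):
    """Build the 'properties' fragment of a mapping for the chosen option."""
    return {"properties": {"TRADENUMBER": OPTIONS[option]}}


print(json.dumps(properties_for("long"), indent=2))
```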


(Lee Drengenberg) #15

Now I see that the clue I missed from the very first screenshot is that all the TRADENUMBER values shown by that Discover filter are within a few numbers of each other. If you round them down to 7 or 8 digits, they are all the same, which indicates the loss of precision.
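
To see how neighbouring trade numbers end up indistinguishable, here is a short sketch that groups a run of consecutive integers by the single-precision value they are stored as. The range of numbers is made up for illustration, and `to_float32` is a hypothetical helper built on the standard `struct` module.

```python
import struct


def to_float32(value):
    """Round-trip a number through IEEE 754 single precision (4 bytes)."""
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]


# Illustrative run of consecutive trade numbers around the one in this thread.
trade_numbers = range(615160660, 615160681)

buckets = {}
for n in trade_numbers:
    buckets.setdefault(to_float32(n), []).append(n)

for stored, originals in sorted(buckets.items()):
    print(f"{stored:.0f} <- {originals[0]}..{originals[-1]} ({len(originals)} values)")
```

Every number in a bucket matches the same term query on a float field, which is why the filter returns several different TRADENUMBER values.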

