Different Results Based on Aggregation Size


(Shane Daniel) #1

I'm getting different results depending on the selected Size.

Steps to reproduce in Kibana:

  1. Create vertical bar chart
  2. metrics > Y-Axis > Aggregation: Average
  3. metrics > Y-Axis > Field: select a numerical field
  4. buckets > X-Axis > Aggregation: Terms
  5. buckets > X-Axis > Field: select a string field
  6. buckets > X-Axis > Order by: select metric: Average
  7. buckets > X-Axis > Order: Descending
  8. buckets > X-Axis > Size: 10
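
For readers who prefer to see the query, the steps above correspond roughly to a terms aggregation ordered by an average sub-aggregation. The following is only a sketch: the field names `category` and `price` and the aggregation names `by_term` and `avg_metric` are placeholders chosen for illustration, and the request Kibana actually generates uses its own aggregation IDs and extra parameters.

```python
import json


def build_terms_avg_request(term_field, metric_field, size):
    """Build a search body: terms buckets ordered by average, descending.

    The aggregation names "by_term" and "avg_metric" are illustrative only.
    """
    return {
        "size": 0,  # return no hits, only aggregation results
        "aggs": {
            "by_term": {
                "terms": {
                    "field": term_field,
                    "size": size,  # number of buckets (bars) to return
                    "order": {"avg_metric": "desc"},
                },
                "aggs": {
                    "avg_metric": {"avg": {"field": metric_field}},
                },
            }
        },
    }


if __name__ == "__main__":
    # "category" and "price" are hypothetical field names.
    print(json.dumps(build_terms_avg_request("category", "price", 10), indent=2))
```

Only the `size` value changes between the three runs described below.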

For my data the first bar has a value of 1,918.84. It is using a document count of 5 to calculate this result.

If I change buckets > X-Axis > Size: 30, the first bar has a value of 1,373.61. It is using a document count of 7 to calculate this result.

If I change buckets > X-Axis > Size: 0, the first bar has a value of 962.51. It is using a document count of 10 to calculate this result. I believe this to be the correct value.

There are 10 documents with the value associated with the first bar, out of 13,902 total document hits.

I'm concerned as the returned values are quite inaccurate, and I cannot set Size: 0 as there are too many bars. Is there a recommended solution?


(Jim Unger) #2

Can you provide me with a sample of your data with which I can reproduce this problem?


(Shane Daniel) #3

Thanks for the response Jim. Unfortunately I can't share the particular data set I'm working with.

I tried to create a similar scenario using http://demo.elastic.co/packetbeat

Time filter: From: 2016-07-21 23:00:00.000 To: 2016-07-22 00:00:59.999

Create vertical bar chart (index: packetbeat-*)

metrics > Y-Axis > Aggregation: Average
metrics > Y-Axis > Field: responsetime
metrics > +Add metric
metrics > Y-Axis (2) > Aggregation: Count

buckets > X-Axis > Aggregation: Terms
buckets > X-Axis > Field: query
buckets > X-Axis > Order: Top
buckets > X-Axis > Size: 3
buckets > X-Axis > Order By: metric: Average responsetime

Here the first bar, test.users.find() has a count of 8 and an avg responsetime of 91.125.
Also the third bar, "GET /static/img/paper_fibers.png HTTP/1.1" has a count of 1 and an avg responsetime of 59.

Changing,
buckets > X-Axis > Size: 100

Now test.users.find() has a count of 9 and an avg responsetime of 84.667.
Also, "GET /static/img/paper_fibers.png HTTP/1.1" now has a count of 2 and an avg responsetime of 44.5
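
A quick arithmetic check on the numbers above: since an average is a sum divided by a count, the value of the extra document that appears at Size: 100 can be recovered from the two (count, average) pairs. This is a rough sketch that assumes the displayed averages are rounded to three decimals, so the result is rounded to the nearest integer; the function name is made up for illustration.

```python
def implied_extra_sum(count_small, avg_small, count_large, avg_large):
    """Sum of the values that were missing from the smaller result."""
    return count_large * avg_large - count_small * avg_small


# test.users.find(): count 8, avg 91.125 -> count 9, avg 84.667
users_find = round(implied_extra_sum(8, 91.125, 9, 84.667))

# paper_fibers.png request: count 1, avg 59 -> count 2, avg 44.5
paper_fibers = round(implied_extra_sum(1, 59, 2, 44.5))

print(users_find, paper_fibers)
```

In both cases the document that only shows up at the larger Size has a lower responsetime than the average reported at Size: 3, which is why the averages drop as the counts rise.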


(Jim Unger) #4

@Shane_Daniel,

I agree that the results that you mentioned do sound suspect, and would like to investigate it further, but need to be able to reproduce the issue.

I tried to recreate what you described above, but could not reproduce it. I assume that's because the demo data changed since you set this up. (see image to make sure I didn't screw anything up)


(Shane Daniel) #5

@BigFunger,

Yes, it looks like the data I used in the example has been purged.

I was able to resolve my issue by setting X-Axis > JSON Input > {"shard_size":0}

For other readers' information, the description of the shard_size parameter can be found here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_shard_size_2
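
To illustrate why shard_size matters here, below is a small simulation, not Elasticsearch code. It assumes the usual model of a terms aggregation: each shard ranks terms by its own local average and returns only its top shard_size terms, and the coordinating node merges whatever it receives. The data and function names are invented for the example.

```python
from collections import defaultdict


def shard_top_terms(docs, shard_size):
    """One shard's top `shard_size` terms, ranked by local average, descending."""
    stats = defaultdict(lambda: [0.0, 0])  # term -> [sum, count]
    for term, value in docs:
        stats[term][0] += value
        stats[term][1] += 1
    ranked = sorted(stats.items(), key=lambda kv: kv[1][0] / kv[1][1], reverse=True)
    return ranked[:shard_size]


def merged_stats(shards, shard_size):
    """Merge the per-shard results into term -> (doc_count, average)."""
    merged = defaultdict(lambda: [0.0, 0])
    for docs in shards:
        for term, (total, count) in shard_top_terms(docs, shard_size):
            merged[term][0] += total
            merged[term][1] += count
    return {term: (count, total / count) for term, (total, count) in merged.items()}


# Hypothetical data: term "A" ranks first on shard 0 but last on shard 1.
SHARDS = [
    [("A", 100), ("A", 100), ("B", 50), ("C", 10)],
    [("B", 90), ("C", 80), ("A", 20), ("A", 20)],
]

if __name__ == "__main__":
    for shard_size in (1, 2, 3):
        print(shard_size, merged_stats(SHARDS, shard_size)["A"])
```

With a small shard_size, the documents for "A" on the second shard never reach the merge step, so both its count and its average are off; once every shard returns all of its terms, the count is complete. Note that the value 0 meant "no limit" in the Elasticsearch version used in this thread; newer releases may not accept it, so check the documentation for your version.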


(Jim Unger) #6

@Shane_Daniel,

Thanks for posting your solution! Wish I could have been more help.

