Compare accuracy between rollup and normal indices

cmallios · February 13, 2019, 11:37am

I am trying to run some reports to compare rollup vs normal indices, but I am getting different results.

I am using version 6.5, and my rollup job config is:
delay: 1d
rollupindex: test-index
index pattern: $productName-$dataset-201806
datehistogram:{field:timestamp, Interval: 24h, timezone: utc}
terms:[response.raw, itemTitle.raw, groupTitle.raw, seriesTitle.raw]
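For readers who want the shorthand above spelled out, here is a minimal sketch of how it might map onto a rollup job body, written as a Python dict. The field names under `groups` follow my understanding of the 6.x rollup job format. The post does not give `cron`, `page_size`, or the resolved index pattern, so those values are placeholders and purely hypothetical.

```python
import json

# Sketch of a rollup job body built from the shorthand config in the post.
# Values marked "assumed" do NOT come from the post.
job = {
    # Stands in for $productName-$dataset-201806; the real names are not given.
    "index_pattern": "myproduct-mydataset-201806",  # assumed placeholder
    "rollup_index": "test-index",
    "cron": "0 0 * * * ?",  # assumed schedule
    "page_size": 1000,      # assumed
    "groups": {
        "date_histogram": {
            "field": "timestamp",
            "interval": "24h",
            "delay": "1d",
            "time_zone": "UTC",
        },
        "terms": {
            "fields": [
                "response.raw",
                "itemTitle.raw",
                "groupTitle.raw",
                "seriesTitle.raw",
            ]
        },
    },
}

print(json.dumps(job, indent=2))
```

Only the `groups` section reflects what was actually stated; whether a `metrics` section was configured is not mentioned in the post.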

and my rollup query is:
GET test-index/_rollup_search
{
  "size": 0,
  "aggregations": {
    "seriesTitle": {
      "terms": {
        "field": "seriesTitle.raw",
        "size": 3,
        "collect_mode": "breadth_first",
        "order": {
          "_count": "desc"
        }
      },
      "aggregations": {
        "groupTitle": {
          "terms": {
            "field": "groupTitle.raw",
            "size": 3,
            "order": {
              "_count": "desc"
            },
            "collect_mode": "breadth_first"
          },
          "aggregations": {
            "itemTitle": {
              "terms": {
                "field": "itemTitle.raw",
                "missing": "[No item title]",
                "size": 3,
                "collect_mode": "breadth_first",
                "order": {
                  "_count": "desc"
                }
              }
            }
          }
        }
      }
    }
  }
}

If I apply exactly the same query to the normal index using the _search endpoint, I get a different result (the difference is very noticeable).
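One way to make "different" concrete is to flatten both responses into bucket paths and list only the paths whose counts disagree. The sketch below assumes both responses have already been fetched and parsed into dicts with the usual `aggregations` key. The helper names `flatten_terms` and `diff_counts` are hypothetical, and the sample responses are made up for illustration.

```python
def flatten_terms(node, agg_names, prefix=()):
    """Map each nested bucket path (a tuple of keys) to its doc_count."""
    if not agg_names:
        return {}
    counts = {}
    for bucket in node.get(agg_names[0], {}).get("buckets", []):
        path = prefix + (bucket["key"],)
        counts[path] = bucket["doc_count"]
        # Recurse into the next aggregation level nested inside this bucket.
        counts.update(flatten_terms(bucket, agg_names[1:], path))
    return counts


def diff_counts(rollup_resp, search_resp, agg_names):
    """Return {path: (rollup_count, search_count)} for every path that differs."""
    rollup = flatten_terms(rollup_resp["aggregations"], agg_names)
    normal = flatten_terms(search_resp["aggregations"], agg_names)
    return {
        path: (rollup.get(path), normal.get(path))
        for path in set(rollup) | set(normal)
        if rollup.get(path) != normal.get(path)
    }


# Made-up responses, for illustration only (not data from this thread).
rollup_resp = {"aggregations": {"seriesTitle": {"buckets": [
    {"key": "s1", "doc_count": 10,
     "groupTitle": {"buckets": [{"key": "g1", "doc_count": 4}]}},
]}}}
search_resp = {"aggregations": {"seriesTitle": {"buckets": [
    {"key": "s1", "doc_count": 10,
     "groupTitle": {"buckets": [{"key": "g1", "doc_count": 7}]}},
]}}}

mismatches = diff_counts(rollup_resp, search_resp, ["seriesTitle", "groupTitle"])
print(mismatches)
```

For the query in this post, the aggregation names to pass would be `["seriesTitle", "groupTitle", "itemTitle"]`. A path that appears in only one response shows up with `None` on the other side.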

Regarding the rollup, I thought we could use any subset of the configured terms fields, in any order. So far I have also seen no restriction saying that using it is mandatory (either at the first level or at any other).
Have I misunderstood something?

cmallios · February 14, 2019, 5:05pm

Below is part of a response from the rollup endpoint, in which I have only replaced the keys with random values. The sum of the children's doc_count values plus sum_other_doc_count is nowhere near the parent's doc_count, while the parent's doc_count itself matches the expected count.

{
  "key_as_string": "2018-06-01T00:00:00.000Z",
  "key": 1527811200000,
  "doc_count": 1786083,
  "seriesDoi": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 554401,
    "buckets": [
      {
        "key": "key1",
        "doc_count": 5
      },
      {
        "key": "key2",
        "doc_count": 5
      },
      {
        "key": "key3",
        "doc_count": 4
      },
      {
        "key": "key4",
        "doc_count": 2
      },
      {
        "key": "key5",
        "doc_count": 2
      },
      {
        "key": "key6",
        "doc_count": 2
      },
      {
        "key": "key7",
        "doc_count": 2
      },
      {
        "key": "key8",
        "doc_count": 2
      },
      {
        "key": "key9",
        "doc_count": 1
      },
      {
        "key": "key10",
        "doc_count": 1
      }
    ]
  }
}
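The mismatch described above can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the numbers from the response fragment; the helper name `unaccounted` is hypothetical.

```python
# Numbers copied from the response fragment above (keys are the anonymised ones).
bucket = {
    "doc_count": 1786083,
    "seriesDoi": {
        "sum_other_doc_count": 554401,
        "buckets": [
            {"key": "key1", "doc_count": 5},
            {"key": "key2", "doc_count": 5},
            {"key": "key3", "doc_count": 4},
            {"key": "key4", "doc_count": 2},
            {"key": "key5", "doc_count": 2},
            {"key": "key6", "doc_count": 2},
            {"key": "key7", "doc_count": 2},
            {"key": "key8", "doc_count": 2},
            {"key": "key9", "doc_count": 1},
            {"key": "key10", "doc_count": 1},
        ],
    },
}


def unaccounted(parent, agg_name):
    """Parent doc_count minus (listed child counts + sum_other_doc_count)."""
    agg = parent[agg_name]
    listed = sum(child["doc_count"] for child in agg["buckets"])
    return parent["doc_count"] - (listed + agg["sum_other_doc_count"])


gap = unaccounted(bucket, "seriesDoi")
print(gap)  # 1231656
```

The ten listed buckets add up to 26, so children plus `sum_other_doc_count` come to 554427, leaving 1231656 of the parent's 1786083 documents unaccounted for. A nonzero gap is not by itself proof of a bug, since documents with no value for the field fall into no terms bucket, but a gap of this size is what the post is pointing at.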

system · March 14, 2019, 5:05pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
