Hi, I'd like to get doc_count from a terms aggregation. Here is the example I used.
PUT sample
{
  "mappings": {
    "properties": {
      "updateType": {"type": "keyword"},
      "date": {"type": "date", "format": "epoch_millis"}
    }
  }
}
POST _bulk
{"index":{"_index":"sample", "_id":"1"}}
{"updateType":"insert", "date": 1664755200000}
{"index":{"_index":"sample", "_id":"2"}}
{"updateType":"insert", "date": 1664755201000}
{"index":{"_index":"sample", "_id":"3"}}
{"updateType":"update", "date": 1664755202000}
In this case, when I run a terms aggregation on updateType against the raw data, the result is:
GET sample/_search
{
  "size": 0,
  "aggs": {
    "types": {
      "terms": {
        "field": "updateType",
        "size": 10
      }
    }
  }
}
"aggregations" : {
  "types" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "insert",
        "doc_count" : 2
      },
      {
        "key" : "update",
        "doc_count" : 1
      }
    ]
  }
}
With a date histogram at a 1d interval, the per-type counts are the same:
GET sample/_search
{
  "size": 0,
  "aggs": {
    "times": {
      "date_histogram": {
        "field": "date",
        "interval": "1d"
      },
      "aggs": {
        "types": {
          "terms": {
            "field": "updateType",
            "size": 10
          }
        }
      }
    }
  }
}
"aggregations" : {
  "times" : {
    "buckets" : [
      {
        "key_as_string" : "1664755200000",
        "key" : 1664755200000,
        "doc_count" : 3,
        "types" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            {
              "key" : "insert",
              "doc_count" : 2
            },
            {
              "key" : "update",
              "doc_count" : 1
            }
          ]
        }
      }
    ]
  }
}
This is fine so far. The problem is that when I roll up this index with a terms aggregation on 'updateType' and a 1-day interval, and then search the rolled-up index, the result looks like this:
"aggregations" : {
  "types" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "insert",
        "doc_count" : 1
      },
      {
        "key" : "update",
        "doc_count" : 1
      }
    ]
  }
}
This is because the rolled-up index holds one document for 'insert' and one for 'update' within the 1-day interval, so each bucket counts only 1 document.
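As a rough illustration of this effect, here is a minimal Python sketch that simulates a rollup in memory using the three sample documents. It is not how the rollup plugin is actually implemented; `roll_up` and the `source_docs` field are hypothetical names used only for this example.

```python
from collections import Counter

DAY_MS = 24 * 60 * 60 * 1000  # 1-day interval in milliseconds

# The three raw documents from the bulk request.
raw_docs = [
    {"updateType": "insert", "date": 1664755200000},
    {"updateType": "insert", "date": 1664755201000},
    {"updateType": "update", "date": 1664755202000},
]


def roll_up(docs, interval_ms):
    """Collapse docs into one summary doc per (time bucket, updateType)."""
    groups = Counter()
    for doc in docs:
        bucket_start = doc["date"] - doc["date"] % interval_ms
        groups[(bucket_start, doc["updateType"])] += 1
    # "source_docs" is a hypothetical field holding the original count.
    return [
        {"date": bucket, "updateType": update_type, "source_docs": count}
        for (bucket, update_type), count in groups.items()
    ]


rolled = roll_up(raw_docs, DAY_MS)

# Counting rolled-up documents per term: what doc_count reports.
docs_per_type = Counter(d["updateType"] for d in rolled)

# Summing the stored counts per term: the number I actually want.
source_per_type = Counter()
for d in rolled:
    source_per_type[d["updateType"]] += d["source_docs"]

print(dict(docs_per_type))
print(dict(source_per_type))
```

In this sketch, counting the summary documents loses the original totals, while summing a stored per-group count recovers them.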
However, I want the actual number, the doc_count. That is the important value to me.
I searched a lot and found a workable alternative, using this rollup configuration:
"dimensions": [
  {
    "date_histogram": {
      "source_field": "date",
      "fixed_interval": "60m"
    }
  },
  {
    "terms": {
      "source_field": "updateType"
    }
  }
],
"metrics": [
  {
    "source_field": "date",
    "metrics": [
      {
        "value_count": {}
      },
      {
        "sum": {}
      }
    ]
  },
  {
    "source_field": "updateType",
    "metrics": [
      {
        "value_count": {}
      }
    ]
  }
]
I added a value_count metric on 'updateType', which is allowed even though the field type is not numeric.
Then I sent this request:
GET example_rollup2/_search
{
  "size": 0,
  "aggs": {
    "types": {
      "terms": {
        "field": "updateType",
        "size": 10
      },
      "aggs": {
        "count": {
          "value_count": {
            "field": "updateType"
          }
        }
      }
    }
  }
}
The response looks like this:
"aggregations" : {
  "types" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "insert",
        "doc_count" : 1,
        "count" : {
          "value" : 2
        }
      },
      {
        "key" : "update",
        "doc_count" : 1,
        "count" : {
          "value" : 1
        }
      }
    ]
  }
}
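For a client that consumes this response directly, the original counts can be read from count.value instead of doc_count. A minimal Python sketch, assuming the response has already been parsed into a dict; `counts_by_term` is a hypothetical helper, not part of any client library.

```python
# Parsed form of the response shown above (wrapped in an outer object).
response = {
    "aggregations": {
        "types": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "insert", "doc_count": 1, "count": {"value": 2}},
                {"key": "update", "doc_count": 1, "count": {"value": 1}},
            ],
        }
    }
}


def counts_by_term(resp, terms_agg="types", metric_agg="count"):
    """Map each terms bucket key to its value_count sub-aggregation value."""
    buckets = resp["aggregations"][terms_agg]["buckets"]
    return {b["key"]: int(b[metric_agg]["value"]) for b in buckets}


counts = counts_by_term(response)
print(counts)
```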
The problem seems completely solved, but it's not.
I also want to visualize this rolled-up index in Kibana.
But a Kibana visualization has no way to run a metrics aggregation inside the buckets of a terms aggregation.
So I'm back to square one.
The question is: how can I get the terms aggregation doc_count in a rolled-up index?