Single-value metrics aggregations

#1

Hi Elastic,

I want to learn how Elasticsearch does the simplest single-value metrics aggregation (count, sum, avg, max, min) on large data. Does each shard aggregate first and the final node aggregates the shard-level aggregates? Or each shard sends all matched documents to one node and the node does aggregation? If there is link to the code, I can also read that. Thank you!

#2

Here is an example query.

{
    'aggs' : {
        'ok_rate' : {
            'avg' : {
                'script' : {
                    'source' : 'doc.status.value == 200 ? 1 : 0'
                }
            }
        }
    },
    'query': {
        'bool' : {
            'filter' : [{
                'range' : {
                    '@timestamp' : {
                        'gte' : time,
                        'lte' : time,
                        'format' : 'epoch_second'
                    }
                }
            }]
        }
    },
    "size" : 0
}
(Mark Harwood) #3

Hi cecoke,

No, that's not a scalable approach. Each shard returns interim aggregation state which is merged by a coordinating node to produce the final result. Example interim state for average is here: https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/search/aggregations/metrics/InternalAvg.java#L90

#4

Thank you so much for your help, Mark!

Closing.