Hi, I'm speccing out a new datastore for aggregate calculations behind near-real-time dashboards, and I'm evaluating Elasticsearch.
I've read here that the median is usually approximated: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-percentile-aggregation.html#search-aggregations-metrics-percentile-aggregation-approximation
I'm using routing on my dataset, and I was wondering whether there is a way to make the median more accurate in Elasticsearch, especially given that the number of documents will be in the thousands and all on the same node. (For large orgs, the number of documents may reach a few hundred thousand at most, but that is very rare.) The high relative error in the median is a bit worrying.
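For reference, the query I have in mind looks roughly like the sketch below. The field name `value`, the aggregation name `median_value`, and the compression of 200 are placeholders for illustration only. `tdigest.compression` is the setting the linked page describes for trading memory for accuracy; I haven't verified how much it helps at my data sizes.

```python
import json


def median_agg_body(field, compression=None):
    """Build a search body that requests only the 50th percentile of `field`."""
    percentiles = {"field": field, "percents": [50]}
    if compression is not None:
        # Higher compression keeps more centroids: more memory, less error.
        percentiles["tdigest"] = {"compression": compression}
    # size=0 skips returning hits, since only the aggregation is needed.
    return {"size": 0, "aggs": {"median_value": {"percentiles": percentiles}}}


# Placeholder field name and compression value, not real settings of mine.
body = median_agg_body("value", compression=200)
print(json.dumps(body, indent=2))
```

I'd send this body to the index's `_search` endpoint with the `routing` query parameter set to the org's routing key, so that only the one shard is searched.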
(Note: if anyone has suggestions for an alternative aggregation datastore, I'd take those too!)
Thanks!
Patrick