genichm
(Genady Mironenko)
November 13, 2018, 7:00am
1
Is it true that Elasticsearch can return inaccurate aggregations when the cluster is under heavy load?
It has to do with scale, not load. Some types of aggregations are approximate, because calculating exact values quickly at large scale is not possible. Examples are the cardinality aggregation, the significant terms aggregation, and also the terms aggregation (in cases of high cardinality).
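To make the terms aggregation case concrete, here is a minimal sketch in plain Python (no Elasticsearch client involved) of how merging per-shard top lists can lose or undercount terms. The shard contents and the function names are made up for illustration; `shard_size` is named after the real terms aggregation parameter, but the merge logic here is a simplification and not Elasticsearch's actual implementation.

```python
from collections import Counter


def approximate_top_terms(shards, size, shard_size):
    """Merge only the top `shard_size` terms reported by each shard."""
    merged = Counter()
    for shard in shards:
        # Each shard reports only its local top terms, not its full term list.
        for term, count in shard.most_common(shard_size):
            merged[term] += count
    return merged.most_common(size)


def exact_top_terms(shards, size):
    """Sum every term from every shard (accurate, but costly at scale)."""
    total = Counter()
    for shard in shards:
        total.update(shard)
    return total.most_common(size)


# Hypothetical data: "c" is never the top term on any single shard,
# yet it is the most frequent term overall.
shards = [
    Counter({"a": 10, "c": 9, "b": 1}),
    Counter({"b": 10, "c": 9, "a": 1}),
    Counter({"d": 10, "c": 9, "a": 1}),
]

exact = exact_top_terms(shards, size=2)
narrow = approximate_top_terms(shards, size=2, shard_size=1)
wide = approximate_top_terms(shards, size=2, shard_size=2)

print("exact :", exact)
print("narrow:", narrow)  # "c" is missing entirely
print("wide  :", wide)    # "c" is found, but "a" is undercounted
```

With `shard_size=1` the globally most frequent term never reaches the coordinating step, and even with `shard_size=2` the count for "a" misses the documents on shards where it was not in the local top list. None of this depends on load; it follows from how much each shard sends back.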
Have a look at the following threads that discuss this:
Hi,
We're about to start a POC with Elasticsearch.
Prior to posting here, I read so many blogs/answers/topics/... regarding the fact ES is fast, but can return incorrect or approximate results...
What we need to be able to achieve is complex aggregations and "stats" in general on a fairly big amount of data.
We of course need averages, Min, Max, counts, group by, etc. to be precise & exact. We can't miss out a few records.
Is this going to be a problem with ES?
Thanks a lot for your help
…
It's a general issue with distributed analytics
I always say it comes down to a pick-2-of-3 conundrum between Fast, Accurate and Big. It's about trade-offs.
If you wanted BA you'd use something like Hadoop and stream all term frequencies in a series of map-reduce phases until you got your answer. Accurate, but takes a while and may end up being the answer to yesterday's question.
We do BF which means we do things fast but often using sketches of data from shards which can lose a…