Memory Usage in (Date) Histogram Facet


(Otis Gospodnetić) #1

Hi,

Reading "Memory Considerations" section at the bottom of
http://www.elasticsearch.org/guide/reference/api/search/facets/histogram-facet.html ....
and want to confirm:

Just because you use the interval functionality to group values from a
field into buckets does not mean the amount of memory needed is reduced.
For example, if there are 1000 distinct values in field X and you then do a
histogram facet on X that groups values into 100 buckets, the memory needed
to hold this is still the same as if no interval were used for bucketing.

In other words, even if you "bucketize", it is worth reducing the number of
distinct values in a field by, for example, rounding timestamps down to
minute, hour, or even day precision.
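To illustrate the rounding idea, here is a minimal sketch (assuming millisecond epoch timestamps, which is how Elasticsearch stores dates internally; the function names are my own, not part of any API) of truncating values before indexing so the field holds far fewer distinct terms:

```python
def round_to_minute(epoch_ms: int) -> int:
    """Truncate a millisecond epoch timestamp to minute precision,
    so many raw timestamps collapse into one distinct field value."""
    return epoch_ms - (epoch_ms % 60_000)

def round_to_hour(epoch_ms: int) -> int:
    """Truncate a millisecond epoch timestamp to hour precision."""
    return epoch_ms - (epoch_ms % 3_600_000)

ts = 1_357_048_271_123  # an arbitrary millisecond timestamp
print(round_to_minute(ts))  # 1357048260000
print(round_to_hour(ts))    # 1357045200000
```

Indexing the truncated value (instead of, or alongside, the raw one) is what actually shrinks the field-data footprint, since memory scales with distinct terms, not with the query-time interval.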

Is the above all correct?

Thanks,
Otis

Performance Monitoring for ES -
http://sematext.com/spm/elasticsearch-performance-monitoring


(Drew Raines) #2

Otis Gospodnetic wrote:

Reading "Memory Considerations" section at the bottom of
http://www.elasticsearch.org/guide/reference/api/search/facets/histogram-facet.html
.... and want to confirm:

Just because you use the interval functionality to group values from a
field in buckets does not reduce the amount of memory needed.

[...]

In other words, even if you "bucketize", it is worth reducing the
number of distinct values in a field by, for example, rounding up
timestamps to minutes or hours or even days.

Is the above all correct?

You are correct. Regardless of the interval, FieldDataLoader.load()
will still populate, for every Lucene segment, a sparse matrix sized
by the maximum number of docs in the segment times the number of
unique terms in the field across all documents. And it gets worse
once those segments are merged into fewer, larger ones.
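A toy sketch of why the interval doesn't help (this is an illustration of the principle, not Elasticsearch internals): field data must be loaded for every distinct value, while interval bucketing only groups those values at query time.

```python
# 1000 distinct millisecond timestamps, one second apart
values = [t * 1000 for t in range(1000)]

# What field data must hold in memory: every distinct value
distinct_loaded = len(set(values))

# What a histogram facet with interval=100000 reports: far fewer buckets
buckets = len({v - (v % 100_000) for v in values})

print(distinct_loaded, buckets)  # 1000 10
```

Memory is proportional to `distinct_loaded`, so even though the facet returns only 10 buckets, all 1000 values are resident; reducing distinct values at index time is the only way to shrink that.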

https://github.com/elasticsearch/elasticsearch/issues/1531
https://github.com/elasticsearch/elasticsearch/issues/1683

-Drew
