I am struggling a bit to understand which ES operations use up which part
of memory.
From reading the documentation I know that faceting and sorting take up
heap memory and also that some filters are cached in JVM heap. However,
what other operations take up JVM heap memory?
When faceting on a particular field, are all of the values of that field
loaded into JVM heap, or only the distinct values of the field? I would
believe it is the latter, as fields of high cardinality require more JVM
heap.
What operations use OS memory? I believe retrieved documents and the ES
index are kept in OS memory. Can anyone confirm this? What other operations
go to OS memory?
Finally, I would also be interested if there are any posts or tutorials
that describe the complete lifecycle of a query, from submission of the
query to return of the result set.
Fielddata (sorting, facets, some scripting), filter cache (filters), and
id_cache (parent-child) are probably the ones that would affect memory
usage significantly.
OS memory would generally be file system caches for the underlying Lucene
indexes.
About your other question, the default query execution is called
query_then_fetch.
The basic idea is that the search (query) is scattered in parallel to all
shards of the index being searched (one copy of each shard, either the
primary or a replica). A small amount of result data, essentially document
IDs and scores, is gathered back from each shard, reduced, and sorted. With
the final reduced list in hand, ES goes back to the shards that hold those
hits and pulls out (fetches) the additional data required to return the
final hits to the caller.
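
To make the two phases concrete, here is a rough Python sketch of the
scatter, reduce, and fetch steps described above. It is only a toy model:
the in-memory "shards" and the function names are made up for illustration
and are not ES APIs, and real ES does this across nodes with sort values,
paging, and more.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical in-memory "shard": doc id -> (score, stored source document).
Shard = Dict[str, Tuple[float, dict]]


def query_phase(shard: Shard, size: int) -> List[Tuple[float, str]]:
    """Return only lightweight (score, doc_id) pairs for this shard's top hits."""
    pairs = ((score, doc_id) for doc_id, (score, _source) in shard.items())
    return heapq.nlargest(size, pairs)


def query_then_fetch(shards: List[Shard], size: int) -> List[dict]:
    # Query phase: scatter to every shard and gather the small per-shard lists.
    candidates = []
    for shard_no, shard in enumerate(shards):
        for score, doc_id in query_phase(shard, size):
            candidates.append((score, shard_no, doc_id))

    # Reduce: merge the per-shard lists into the global top `size` hits.
    top = heapq.nlargest(size, candidates)

    # Fetch phase: go back only to the shards that own a winning hit
    # and load the full document for each one.
    return [
        {"_id": doc_id, "_score": score, "_source": shards[shard_no][doc_id][1]}
        for score, shard_no, doc_id in top
    ]
```

The point of the split is that full documents are only loaded for the hits
that survive the reduce step, not for every candidate on every shard.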
On Wednesday, March 26, 2014 10:41:44 AM UTC-4, Uli Bethke wrote: