Node tasks and their memory requirements

Taylor_Fort · October 11, 2013, 11:15am

We have an 18-node setup with an 18-shard index and a single replica. When we
test traffic against the ES cluster, we notice gradual jumps in
filter_cache_size until it hits the memory threshold, and then we start to
see filter cache evictions (~10 per query). Currently, nothing in our configs
assigns a node task, so every node is handling queries, searches, and
indexing, but we're noticing that nodes become unresponsive when the GC
heavily cleans up memory. Note: we rely heavily on facets and boolean filters
(the nature of our query requirements).

I'm referencing this guide page for node tasks...
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-node.html

Questions...

What is the memory impact on a node that handles only queries (i.e.
if I set data: false)?

Is the filter cache size primarily tied to the node that does the
scatter-and-gather search, the node that holds the actual data, or both?
Does adding a replica on a data node essentially add n + 1 requirements for
my filter cache size?

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

spinscale · October 14, 2013, 7:43am

Hey,

On to your questions (and hopefully a bit beyond):

A non-data node (so a master node or a client node) handles the reduce
phase of a search. If you search over 5 shards, each shard returns a
certain number of results, which are then reduced on the node where the
search originated. So if you do deep paging (which you should not do), you
can put a non-data node under memory pressure as well. Apart from that,
memory requirements are much lower on these nodes, as they do not need
fielddata or filter caches.

Filter caches are per Lucene segment, and Lucene segments require data to
be stored, so the filter cache matters on the nodes that hold the data, not
on the node doing the scatter and gather.
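
To make the deep-paging point concrete, here is a minimal Python sketch of the reduce phase described above. It is a toy model, not Elasticsearch code: the function names and the integer "scores" are made up for illustration. It assumes each shard returns its own top (offset + size) hits, so the node that merges them receives that many entries per shard.

```python
import heapq
from itertools import islice


def shard_top_hits(scores, start, size):
    """A shard cannot know which of its hits make the global page, so it
    returns its own best (start + size) scores, highest first."""
    return heapq.nlargest(start + size, scores)


def reduce_on_coordinator(per_shard_scores, start, size):
    """Merge the sorted per-shard lists and cut out the requested page.

    Returns the page and the number of entries the coordinating node
    had to receive from the shards to build it.
    """
    shard_lists = [shard_top_hits(s, start, size) for s in per_shard_scores]
    held = sum(len(hits) for hits in shard_lists)
    merged = heapq.merge(*shard_lists, reverse=True)
    page = list(islice(merged, start, start + size))
    return page, held


# Hypothetical data: 5 shards with 1000 distinct integer scores each.
shards = [[shard + 5 * k for k in range(1000)] for shard in range(5)]

first_page, first_held = reduce_on_coordinator(shards, start=0, size=10)
deep_page, deep_held = reduce_on_coordinator(shards, start=990, size=10)

print(first_held)  # 50 entries for the first page
print(deep_held)   # 5000 entries for a page starting at offset 990
```

Both calls return a page of 10 hits, but the deep page forces the merging node to take in 100 times as many entries, which is where the memory pressure on a non-data node would come from.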

Do you monitor your system? Can you identify the reasons for the long GC
runs? I guess it is most likely fielddata, so you should monitor the size
for each field you are faceting on; you might want to filter by frequency
or regex to decrease the data loaded into memory. You can also increase the
filter cache size if you think it is not enough.
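
As an illustration of the frequency/regex filtering idea (this sketch is not from the original reply), the Python snippet below builds a field mapping with a fielddata filter. The option names (`frequency.min`, `frequency.max`, `min_segment_size`, `regex.pattern`) follow the fielddata filtering options as documented for Elasticsearch releases of that period, so treat them as an assumption and check them against the version you run. The field name `tag`, the helper function, and the thresholds are hypothetical.

```python
import json


def fielddata_filter(min_freq=None, max_freq=None,
                     min_segment_size=None, pattern=None):
    """Build the "fielddata" section of a string field mapping.

    Only the options that are passed in end up in the result.
    """
    frequency = {}
    if min_freq is not None:
        frequency["min"] = min_freq
    if max_freq is not None:
        frequency["max"] = max_freq
    if min_segment_size is not None:
        frequency["min_segment_size"] = min_segment_size

    filt = {}
    if frequency:
        filt["frequency"] = frequency
    if pattern is not None:
        filt["regex"] = {"pattern": pattern}
    return {"filter": filt}


# Hypothetical field "tag": load only terms that appear in between 0.1%
# and 10% of the documents of a segment, and skip small segments.
tag_mapping = {
    "tag": {
        "type": "string",
        "fielddata": fielddata_filter(
            min_freq=0.001, max_freq=0.1, min_segment_size=500
        ),
    }
}

print(json.dumps(tag_mapping, indent=2))
```

The printed JSON would go under `properties` in a put-mapping request. Terms outside the frequency band are not loaded into fielddata, so they also stop showing up in facets on that field.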

Hope this helps. Otherwise feel free to ask for more details

--Alex

otisg · October 15, 2013, 4:13am

Hi,

To add a note to Alex's mention of system monitoring: you may want to peek
at the JVM Memory report in SPM, which will tell you what's going on with
the different memory pools. Stick that on a dashboard next to the GC graph
and you should be able to see the correlation pretty clearly, which will
tell you which pool you could make bigger. If you want, you can email your
graphs to this list directly from SPM; look for the little ambulance icon
above the graphs.

We've been using G1 a lot lately.
See How to Tune Java Garbage Collection - Sematext

Otis
