Node tasks and their memory requirements


(Taylor Fort) #1

We have an 18-node setup with an 18-shard index and a single replica. When we
test traffic against the ES cluster, we notice gradual jumps in
filter_cache_size until it hits the memory threshold, and then we start to
see filter cache evictions (~10 per query). Currently, nothing in our configs
assigns a node task, so every node is handling
queries, searches, and indexing, but we're noticing that nodes
become unresponsive while the GC is heavily cleaning up memory. Note: we
heavily use facets and boolean filters (nature of our query requirements).

I'm referencing this guide page for node tasks...
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-node.html

Questions...

  1. What is the memory impact on a node that is handling only queries (i.e.
    if I set it to data: false)?

  2. Is the filter cache size primarily related to the node that does the
    scatter and gather search, the node that has the actual data, or both? Is
    adding a replica on a data node essentially adding n + 1 requirements for
    my filter cache size?

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Alexander Reelsen) #2

Hey,

To your questions (and hopefully a bit beyond):

  1. A non-data node (so a master node or a client node) handles the
    reduce phase of a search. So if you search over 5 shards, you get
    back a certain number of results from each shard, which are then reduced on
    the node where the search originated. So if you do deep paging (which
    you should not do), you could also put a non-data node under memory
    pressure. Apart from that, memory requirements are much lower on these nodes,
    as they do not need fielddata or filter caches.

  2. Filter caches are per Lucene segment, and Lucene segments require data to
    be stored, so the filter cache lives on the nodes that hold the data, not
    on the node that does the scatter and gather.
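
The reduce phase described in point 1 can be sketched roughly as follows. This is only an illustration in Python, not Elasticsearch's actual implementation: the function names `reduce_top_hits` and `hits_buffered` and the `(score, doc_id)` hit format are made up for this example. It assumes every shard returns its own top `from + size` hits, already sorted by descending score.

```python
import heapq
from itertools import islice


def reduce_top_hits(shard_results, from_, size):
    """Merge per-shard hit lists and return the requested page.

    Each entry in shard_results is a list of (score, doc_id) tuples,
    sorted by descending score (hypothetical format for this sketch).
    """
    # heapq.merge expects ascending order, so merge on the negated score.
    merged = heapq.merge(*shard_results, key=lambda hit: -hit[0])
    return list(islice(merged, from_, from_ + size))


def hits_buffered(num_shards, from_, size):
    """Hits the originating node receives if each shard sends from + size."""
    return num_shards * (from_ + size)


shards = [
    [(0.9, "a"), (0.5, "b")],
    [(0.8, "c"), (0.1, "d")],
]
page = reduce_top_hits(shards, from_=1, size=2)

first_page = hits_buffered(num_shards=5, from_=0, size=10)
deep_page = hits_buffered(num_shards=5, from_=10000, size=10)
```

Under these assumptions, the first page over 5 shards means 50 hits arrive at the originating node, while a page starting at offset 10000 means 50050, which is why deep paging can hurt even a node that holds no data.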

Do you monitor your system? Can you identify the reasons for the long GC
runs? I guess it is most likely fielddata, so you should monitor the size
of each field you are faceting on; you might want to filter by frequency
or regex to decrease the data loaded into memory. You can also increase the
filter cache size if you think it is not enough; see:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html#field-data
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#fielddata-filters
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-fielddata.html#_filtering_by_frequency
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-cache.html#filter
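
To see which faceted fields take the most fielddata memory, something along these lines might help. It is a rough Python sketch that works on an already-fetched nodes stats response. The JSON layout (`nodes` -> node id -> `indices` -> `fielddata` -> `fields` -> field name -> `memory_size_in_bytes`) is an assumption here and should be checked against the nodes stats page linked above for your version. The function name, node ids, field names, and numbers are all invented for illustration.

```python
from collections import Counter


def fielddata_by_field(nodes_stats):
    """Sum per-field fielddata memory across nodes, largest first.

    Assumes the layout described above; missing keys count as zero.
    """
    totals = Counter()
    for node in nodes_stats.get("nodes", {}).values():
        fields = node.get("indices", {}).get("fielddata", {}).get("fields", {})
        for name, stats in fields.items():
            totals[name] += stats.get("memory_size_in_bytes", 0)
    return totals.most_common()


# Invented sample data, only to show the assumed shape.
sample = {
    "nodes": {
        "node-1": {"indices": {"fielddata": {"fields": {
            "category": {"memory_size_in_bytes": 100},
            "tags": {"memory_size_in_bytes": 300},
        }}}},
        "node-2": {"indices": {"fielddata": {"fields": {
            "category": {"memory_size_in_bytes": 50},
            "tags": {"memory_size_in_bytes": 200},
        }}}},
    }
}

ranking = fielddata_by_field(sample)
```

The fields at the top of the ranking are the first candidates for frequency or regex filtering.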

Hope this helps. Otherwise feel free to ask for more details :)

--Alex

On Fri, Oct 11, 2013 at 1:15 PM, Taylor Fort taylor.fort@gmail.com wrote:



(Otis Gospodnetić) #3

Hi,

To add a note to Alex's mention of system monitoring: you may want to peek
at the JVM Memory report in SPM, which will tell you what's going on with
the different memory pools. Stick that on a Dashboard next to the GC graph and
you should be able to see the correlation pretty clearly, and that will tell you
which pool you could make bigger. If you want, you can email your graphs
to this list directly from SPM; look for the little ambulance icon above the
graphs.

We've been using G1 a lot lately.
See http://blog.sematext.com/2013/06/24/g1-cms-java-garbage-collector/

Otis

ELASTICSEARCH Performance Monitoring - http://sematext.com/spm/index.html
Search Analytics - http://sematext.com/search-analytics/index.html

On Monday, October 14, 2013 3:43:20 AM UTC-4, Alexander Reelsen wrote:


