How to prevent ES from garbage collecting during several minutes

Hello,

We have an elastic search cluster storing logs with the following
configuration:

  • One index per day, we keep only the last 15 days
  • Each index is ~8Gb (slightly increasing)
  • The cluster is composed of 2 nodes that are 2 VMs with 6GB of RAM each
  • Nodes runs ES 0.20.2 and have the following config:
    cluster.name: eslogcluster
    node.name: "eslogcluster_2"
    index.number_of_shards: 2
    index.number_of_replicas: 1
    path.data: /usr/local/share/elasticsearch/data
    index.cache.field.max_size: 2500
    index.cache.field.expire: 10m
    index.cache.field.type: soft
  • JVM arguments are the following:
    /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
    -Delasticsearch-service
    -Des.path.home=/usr/local/share/elasticsearch-0.20.2
    -Xss256k
    -XX:+UseParNewGC
    -XX:+UseConcMarkSweepGC
    -XX:CMSInitiatingOccupancyFraction=75
    -XX:+UseCMSInitiatingOccupancyOnly
    -XX:+HeapDumpOnOutOfMemoryError
    -Djava.awt.headless=true
    -Xms3072m
    -Xmx3072m
    [...]
  • the health of the cluster:
    "cluster_name" : "eslogcluster",
    "status" : "green",
    "timed_out" : false,
    "number_of_nodes" : 3,
    "number_of_data_nodes" : 2,
    "active_primary_shards" : 30,
    "active_shards" : 60,
    "relocating_shards" : 0,
    "initializing_shards" : 0,
    "unassigned_shards" : 0

Our problem is that sometimes ES starts garbage collecting lot during a few
minutes. While it is happening, we cannot do any query.
We get logs like:
[2013-02-25 13:56:38,861][INFO ][monitor.jvm ]
[eslogcluster_1] [gc][ConcurrentMarkSweep][340948][4742] duration [5.1s],
collections [1]/[5.6s], total [5.1s]/[33.6m], memory
[2.9gb]->[2.7gb]/[2.9gb], all_pools {[Code Cache]
[9.3mb]->[9.3mb]/[48mb]}{[Par Eden Space]
[266.2mb]->[71.1mb]/[266.2mb]}{[Par Survivor Space]
[33.2mb]->[0b]/[33.2mb]}{[CMS Old Gen] [2.6gb]->[2.6gb]/[2.6gb]}{[CMS Perm
Gen] [32.8mb]->[32.8mb]/[166mb]}
[2013-02-25 13:56:59,043][INFO ][monitor.jvm ]
[eslogcluster_1] [gc][ConcurrentMarkSweep][340953][4746] duration [5.1s],
collections [1]/[5.6s], total [5.1s]/[33.9m], memory
[2.7gb]->[2.8gb]/[2.9gb], all_pools {[Code Cache]
[9.3mb]->[9.3mb]/[48mb]}{[Par Eden Space]
[128.5mb]->[144.8mb]/[266.2mb]}{[Par Survivor Space]
[0b]->[0b]/[33.2mb]}{[CMS Old Gen] [2.6gb]->[2.6gb]/[2.6gb]}{[CMS Perm Gen]
[32.8mb]->[32.8mb]/[166mb]}
[2013-02-25 13:57:05,711][INFO ][monitor.jvm ]
[eslogcluster_1] [gc][ConcurrentMarkSweep][340955][4747] duration [5.1s],
collections [1]/[5.6s], total [5.1s]/[34m], memory
[2.9gb]->[2.7gb]/[2.9gb], all_pools {[Code Cache]
[9.3mb]->[9.3mb]/[48mb]}{[Par Eden Space]
[266.2mb]->[126mb]/[266.2mb]}{[Par Survivor Space]
[33.2mb]->[0b]/[33.2mb]}{[CMS Old Gen] [2.6gb]->[2.6gb]/[2.6gb]}{[CMS Perm
Gen] [32.8mb]->[32.8mb]/[166mb]}
...

Do you know what could be done to improve this? Using "index.cache.field.*"
parameters helped quite a lot to make to problem less frequent but it keeps
happening. Must we add more RAM? Should we increase java heap size? (ES is
not alone on the machines, we set heap size to only 3G to be sure it does
not swap)

Thanks

--
Quentin Barbe

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Can you give an example how your facet and facet queries look like, and
how much queries per second you want to run?

Just from looking at the memory numbers in the GC log, you are simply
tight on resources, the JVM heap is full.

Jörg

Am 25.02.13 16:51, schrieb Quentin Barbe:

memory [2.9gb]->[2.7gb]/[2.9gb],

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hello Jörg,

Thank you for your answer. I don't think we use facet at all: we use
logstash to insert data and mostly kibana to read them (we don't use the
"terms" feature of kibana which I believe uses facet).
The queries we run:

  • We have about 14 000 000 document per day, so let's say an average of 160
    insertions per seconds in the last index. It is done by logstash so I don't
    know exactly what it looks like.
  • We have some monitoring service that query the content of the last
    indexes with query like:
    {"filter":{"bool":{"must":[{"range":{"@timestamp":{"gte":1361810627448,"lte":1361811827448}}},{"term":{"filename":"XXX"}}],"must_not":[],"should":[]}}}
    There must be something like 10/20 such a query per minute.
  • Manual requests in Kibana. Those are rarer and are generally more complex
    and involve serveral indexes (we the "last X days" filter of Kibana). There
    might be a correlation between when we run those more complex queries and
    when ES starts GCing (it does not always happen at the same time, but it is
    often the case). Example Kibana request: login:"XXXX" AND
    @message:"numberRemove" AND @fields.nbEdit:[2 TO *] (on 7 days, so 7
    indexes for example)

What do you think?

Thanks

--
Quentin

Le lundi 25 février 2013 17:40:32 UTC+1, Jörg Prante a écrit :

Can you give an example how your facet and facet queries look like, and
how much queries per second you want to run?

Just from looking at the memory numbers in the GC log, you are simply
tight on resources, the JVM heap is full.

Jörg

Am 25.02.13 16:51, schrieb Quentin Barbe:

memory [2.9gb]->[2.7gb]/[2.9gb],

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

I understand you have 2 nodes on VM (or is it 3?), with 3 GB heap each,
with 100-200 dps on the indexing side and ~1 qps at the query side, with
range queries (on millisecond level) and large cache for bool filter.

My assumption is that, after a while, search is competing harder with
indexing for the heap especially when large segment merging comes up.
But I don't know the avg query response time to be sure, I guess
response times suffer, too. Maybe it was just a single heavy query that
caused heap shortage.

If it's not only a single query, without having good knowledge of long
running Kibana workloads and queries, I would simply recommend to add
nodes. It is easy to do, and workload is better distributed.

If this does not relief the situation, you could look into either into
bulk indexing response times - idea is to reduce memory consumption by
controlling the volume if bulk response is getting slower over time, but
if you must index logs in realtime, you have no choice or you lag behind
too much - or into the field cache configuration. I'm afraid there is
not much to improve at the field cache side, since query load is not
high, maybe Kibana index fields and filtering can be remodeled, for a
bit better memory usage, I'm not sure.

My personal experience is rather sad with free form query submission.
One false query with extremely large footprint exceeding the cluster
capacity, and you can cause a node GC to go near to a halt. Maybe also
previous queries got stuck and resources went accumulated over time.
Then, nodes should be restarted to reset the heap usage turbulence.
Isolating and blocking such potentially "dangerous" queries would be a
good idea.

Jörg

Am 25.02.13 18:21, schrieb Quentin Barbe:

Hello Jörg,

Thank you for your answer. I don't think we use facet at all: we use
logstash to insert data and mostly kibana to read them (we don't use
the "terms" feature of kibana which I believe uses facet).
The queries we run:

  • We have about 14 000 000 document per day, so let's say an average
    of 160 insertions per seconds in the last index. It is done by
    logstash so I don't know exactly what it looks like.
  • We have some monitoring service that query the content of the last
    indexes with query like:
    {"filter":{"bool":{"must":[{"range":{"@timestamp":{"gte":1361810627448,"lte":1361811827448}}},{"term":{"filename":"XXX"}}],"must_not":[],"should":[]}}}
    There must be something like 10/20 such a query per minute.
  • Manual requests in Kibana. Those are rarer and are generally more
    complex and involve serveral indexes (we the "last X days" filter of
    Kibana). There might be a correlation between when we run those more
    complex queries and when ES starts GCing (it does not always happen at
    the same time, but it is often the case). Example Kibana request:
    login:"XXXX" AND @message:"numberRemove" AND @fields.nbEdit:[2 TO *]
    (on 7 days, so 7 indexes for example)

What do you think?

Thanks

--
Quentin

Le lundi 25 février 2013 17:40:32 UTC+1, Jörg Prante a écrit :

Can you give an example how your facet and facet queries look
like, and
how much queries per second you want to run?

Just from looking at the memory numbers in the GC log, you are simply
tight on resources, the JVM heap is full.

Jörg

Am 25.02.13 16:51, schrieb Quentin Barbe:
> memory [2.9gb]->[2.7gb]/[2.9gb],

--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Okay, thank you very much, we'll try to add a 3rd node, to see if it
happens again.

Regarding the "dangerous query" blocking, do you know any way to block
query? Is there a way to ask ES not to use to much memory for a single
query? (just fail the query instead for example)

--
Quentin

Le lundi 25 février 2013 19:06:25 UTC+1, Jörg Prante a écrit :

I understand you have 2 nodes on VM (or is it 3?), with 3 GB heap each,
with 100-200 dps on the indexing side and ~1 qps at the query side, with
range queries (on millisecond level) and large cache for bool filter.

My assumption is that, after a while, search is competing harder with
indexing for the heap especially when large segment merging comes up.
But I don't know the avg query response time to be sure, I guess
response times suffer, too. Maybe it was just a single heavy query that
caused heap shortage.

If it's not only a single query, without having good knowledge of long
running Kibana workloads and queries, I would simply recommend to add
nodes. It is easy to do, and workload is better distributed.

If this does not relief the situation, you could look into either into
bulk indexing response times - idea is to reduce memory consumption by
controlling the volume if bulk response is getting slower over time, but
if you must index logs in realtime, you have no choice or you lag behind
too much - or into the field cache configuration. I'm afraid there is
not much to improve at the field cache side, since query load is not
high, maybe Kibana index fields and filtering can be remodeled, for a
bit better memory usage, I'm not sure.

My personal experience is rather sad with free form query submission.
One false query with extremely large footprint exceeding the cluster
capacity, and you can cause a node GC to go near to a halt. Maybe also
previous queries got stuck and resources went accumulated over time.
Then, nodes should be restarted to reset the heap usage turbulence.
Isolating and blocking such potentially "dangerous" queries would be a
good idea.

Jörg

Am 25.02.13 18:21, schrieb Quentin Barbe:

Hello Jörg,

Thank you for your answer. I don't think we use facet at all: we use
logstash to insert data and mostly kibana to read them (we don't use
the "terms" feature of kibana which I believe uses facet).
The queries we run:

  • We have about 14 000 000 document per day, so let's say an average
    of 160 insertions per seconds in the last index. It is done by
    logstash so I don't know exactly what it looks like.
  • We have some monitoring service that query the content of the last
    indexes with query like:

{"filter":{"bool":{"must":[{"range":{"@timestamp":{"gte":1361810627448,"lte":1361811827448}}},{"term":{"filename":"XXX"}}],"must_not":[],"should":[]}}}

There must be something like 10/20 such a query per minute.

  • Manual requests in Kibana. Those are rarer and are generally more
    complex and involve serveral indexes (we the "last X days" filter of
    Kibana). There might be a correlation between when we run those more
    complex queries and when ES starts GCing (it does not always happen at
    the same time, but it is often the case). Example Kibana request:
    login:"XXXX" AND @message:"numberRemove" AND @fields.nbEdit:[2 TO *]
    (on 7 days, so 7 indexes for example)

What do you think?

Thanks

--
Quentin

Le lundi 25 février 2013 17:40:32 UTC+1, Jörg Prante a écrit :

Can you give an example how your facet and facet queries look 
like, and 
how much queries per second you want to run? 

Just from looking at the memory numbers in the GC log, you are 

simply

tight on resources, the JVM heap is full. 

Jörg 

Am 25.02.13 16:51, schrieb Quentin Barbe: 
> memory [2.9gb]->[2.7gb]/[2.9gb], 

--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearc...@googlegroups.com <javascript:>.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Unfortunately I know no method to precompute the resources a query may
require, but you asked a very important question for a wide range of
future improvements. I think it is possible to develop a query cost
estimator for ES so erratic resource consumption can be detected early.

Jörg

Am 25.02.13 19:20, schrieb Quentin Barbe:

Regarding the "dangerous query" blocking, do you know any way to block
query? Is there a way to ask ES not to use to much memory for a single
query? (just fail the query instead for example)

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

As Jörg said:

Just from looking at the memory numbers in the GC log, you are simply
tight on resources, the JVM heap is full.

3GB is not much, especially with the typical use cases for Kibana (eg
sorting on timestamps can use a lot of memory)

You can scale up and out with Elasticsearch. I'd recommend starting by
giving it more RAM.

Also, you want to make sure that you are never swapping as it kills
garbage collection on the JVM. Set swapiness to zero, use
bootstrap.mlockall (plus the associated ulimit -l settings) or turn swap
off completely.

clint

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.