Problem with excessive heap space usage

hi,

we have 6 servers and 14 shards in the cluster; the index size is 26GB, and with
1 replica the total size is 52GB. We're on ES v1.4.0, java version "1.7.0_65"

we use servers with 14GB of RAM (m3.xlarge), and the heap is set to 7GB

around a week ago we started facing the following issue:

random servers in the cluster, roughly once every day or two, hit the heap size
limit (java.lang.OutOfMemoryError: Java heap space in the log), and the cluster
fails - it goes red or yellow

we tried adding more servers to the cluster - even 8 - but then it's only a
matter of time before we hit the problem again, so it looks like no matter how
many servers are in the cluster, it will still hit the limit after some time

before we started facing the problem we were running smoothly with 3 servers
we also set indices.fielddata.cache.size: 40% but it didn't help

also, there are possible workarounds to decrease heap usage:

  1. reboot some server - then the heap drops under 70% and for some time
    the cluster is ok

or

  2. decrease the number of replicas to 0, and then back to 1 (a sketch of
    doing this via the settings API is just below)
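
For example, with "myindex" standing in for the real index name:

    curl -XPUT 'http://localhost:9200/myindex/_settings' -d '
    { "index" : { "number_of_replicas" : 0 } }'

    # wait for the cluster to settle, then restore the replica:
    curl -XPUT 'http://localhost:9200/myindex/_settings' -d '
    { "index" : { "number_of_replicas" : 1 } }'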

but I don't like to use those workarounds

how can it happen that the whole index fits into RAM and yet we still run out
of heap?

thanks a lot for any help


What sort of data is it, what sort of queries are you running and how often
are they run?


I have similar problems with Java garbage collection. In my case, GC throughput
is lower than the rate at which garbage is generated.
In my opinion, the best way is to generate less garbage. These are the largest
elasticsearch allocation problems:

  1. using "_source" field to store document source inside elastic -
    lucene generates a lot of garbage when allocating heap byte buffers for
    read/write document sources. Try not to use it, and get data from primary
    storage like sql database
  2. http/tcp buffers - try increase transport settings, like:
    http.max_chunk_size: 128kb, max_composite_buffer_components: 65536 (this is
    a netty related settings for better reuse allocated buffers and prevent
    copying composite buffers to simple heap byte buffers )
  3. field data an filter caches - try set less memory size for it
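
For point 3, a minimal elasticsearch.yml sketch - the percentages here are just
placeholders to tune against your own heap:

    # cap the heap-resident caches
    indices.fielddata.cache.size: 30%
    indices.cache.filter.size: 10%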


We have contact profiles (20+ fields, containing nested documents) indexed, and
their social profiles (10+ fields) indexed as child documents of the contact
profile.
We run complex bool match queries, delete by query, delete children by query,
and faceting queries on contact profiles.
index rate: 14.31 op/s
remove by query rate: 13.41 op/s (such a high value is because we delete all
child docs before reindexing the parent and then index the children again -
a sketch of that delete is below)
search rate: 2.53 op/s
remove by ids: 0.15 op/s
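
For reference, the "delete children" step is basically a delete-by-query per
parent - a sketch only, where the index/type names and the contact_id field
holding the parent's id are placeholders:

    curl -XDELETE 'http://localhost:9200/contacts/social_profile/_query' -d '
    {
      "query": {
        "term": { "contact_id": "<parent-contact-id>" }
      }
    }'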

We started to face this trouble under ES 1.2, but only after we started to
index and delete child documents (no search requests yet at that point). On
ES 1.4 we have the same issue.


anyone?


It could be a number of things. Check your various ES caches - are they full?
Is that correlated with an increase in GC activity and the eventual OOM? Then
check your queries - are they big? Expensive aggregations? (The other day I saw
one of our clients using agg queries 10K lines in size.) I could keep asking
questions... share everything you've got to get help here.
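
For example, node-level cache and GC numbers can be pulled with something like:

    curl 'http://localhost:9200/_nodes/stats/indices,jvm?pretty'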

Otis

Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/


Finally found the query causing the cluster crashes. After I commented out the
code issuing it, the cluster has been ok for a few days. Before that it was
crashing about once a day on average.

The query looks like:

{
  "sort": [
    {
      "user_last_contacted.ct": {
        "nested_filter": {
          "term": {
            "user_last_contacted.owner_id": "542b2b7fb0bc2244056fd90f"
          }
        },
        "order": "desc",
        "missing": "_last"
      }
    }
  ],
  "query": {
    "filtered": {
      "filter": {
        "term": {
          "company_id": "52c0e0b7e0534664db9dfb9a"
        }
      },
      "query": {
        "match_all": {}
      }
    }
  },
  "explain": false,
  "from": 0,
  "size": 100
}

The mapping looks like:

        "contact": {
            "_all": {
                "type": "string",
                "enabled": true,
                "analyzer": "default_full",
                "index": "analyzed"
            },
            "_routing": {
                "path": "company_id",
                "required": true
            },
            "_source": {
                "enabled": false
            },
            "include_in_all": true,
            "dynamic": false,
            "properties": {
                "user_last_contacted": {
                    "include_in_all": false,
                    "dynamic": false,
                    "type": "nested",
                    "properties": {
                        "ct": {
                            "include_in_all": false,
                            "index": "not_analyzed",
                            "type": "date"
                        },
                        "owner_id": {
                            "type": "string"
                        }
                    }
                }...

This query causes elasticsearch to eat more and more heap until it finally
crashes with java.lang.OutOfMemoryError: Java heap space. It's also clear that
ES doesn't free the heap after such queries. This started to occur after the
upgrade from ES 0.90.
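
For what it's worth, sorting loads field data for the sort field into the heap,
so one thing worth checking is how much fielddata that field holds per node -
something like:

    curl 'http://localhost:9200/_cat/fielddata?v&fields=user_last_contacted.ct'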
