Elasticsearch: java.lang.OutOfMemoryError: Java heap space


(Steven) #1

When I query the past 7 days of data in Kibana (one dashboard containing 9 visualization panels, all using aggregations with a descending size of about 10 to 20), I get these errors:

    [2017-09-21T03:08:24,180][WARN ][o.e.m.j.JvmGcMonitorService] [master-2] [gc][382351] overhead, spent [777ms] collecting in the last [1s]
    [2017-09-21T03:08:26,324][WARN ][o.e.m.j.JvmGcMonitorService] [master-2] [gc][382352] overhead, spent [2s] collecting in the last [2.1s]
    [2017-09-21T03:08:27,381][WARN ][o.e.m.j.JvmGcMonitorService] [master-2] [gc][382353] overhead, spent [809ms] collecting in the last [1s]
    [2017-09-21T03:08:27,566][WARN ][o.e.i.b.request          ] [request] New used memory 7480688384 [6.9gb] for data of [<reused_arrays>] would be larger than configured breaker: 6400612761 [5.9gb], breaking
    [2017-09-21T03:08:28,381][INFO ][o.e.m.j.JvmGcMonitorService] [master-2] [gc][382354] overhead, spent [307ms] collecting in the last [1s]
    [2017-09-21T03:10:21,764][WARN ][o.e.m.j.JvmGcMonitorService] [master-2] [gc][382466] overhead, spent [2s] collecting in the last [2.3s]
    [2017-09-21T03:10:22,811][WARN ][o.e.m.j.JvmGcMonitorService] [master-2] [gc][382467] overhead, spent [681ms] collecting in the last [1s]
    [2017-09-21T03:10:23,812][INFO ][o.e.m.j.JvmGcMonitorService] [master-2] [gc][382468] overhead, spent [385ms] collecting in the last [1s]
    [2017-09-21T03:10:49,816][WARN ][o.e.m.j.JvmGcMonitorService] [master-2] [gc][382494] overhead, spent [559ms] collecting in the last [1s]
    [2017-09-21T03:10:51,542][WARN ][o.e.m.j.JvmGcMonitorService] [master-2] [gc][382495] overhead, spent [1.6s] collecting in the last [1.7s]
    [2017-09-21T03:10:52,720][WARN ][o.e.m.j.JvmGcMonitorService] [master-2] [gc][382496] overhead, spent [942ms] collecting in the last [1.1s]
    [2017-09-21T03:10:57,807][WARN ][o.e.m.j.JvmGcMonitorService] [master-2] [gc][382497] overhead, spent [2.9s] collecting in the last [3s]
    java.lang.OutOfMemoryError: Java heap space
    Dumping heap to java_pid21870.hprof ...
    Heap dump file created [10780134241 bytes in 61.448 secs]
    [2017-09-21T03:12:15,804][ERROR][o.e.x.m.c.i.IndexStatsCollector] [master-2] collector [index-stats] timed out when collecting data
    [2017-09-21T03:12:16,026][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [master-2] fatal error in thread [elasticsearch[master-2][search][T#5]], exiting
    java.lang.OutOfMemoryError: Java heap space
            at org.elasticsearch.common.util.PageCacheRecycler$1.newInstance(PageCacheRecycler.java:99) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.util.PageCacheRecycler$1.newInstance(PageCacheRecycler.java:96) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.recycler.DequeRecycler.obtain(DequeRecycler.java:53) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.recycler.AbstractRecycler.obtain(AbstractRecycler.java:33) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.recycler.DequeRecycler.obtain(DequeRecycler.java:28) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.recycler.FilterRecycler.obtain(FilterRecycler.java:39) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.recycler.Recyclers$3.obtain(Recyclers.java:119) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.recycler.FilterRecycler.obtain(FilterRecycler.java:39) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.util.PageCacheRecycler.bytePage(PageCacheRecycler.java:147) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.util.AbstractBigArray.newBytePage(AbstractBigArray.java:112) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.util.BigByteArray.<init>(BigByteArray.java:44) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.util.BigArrays.newByteArray(BigArrays.java:464) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.util.BigArrays.resize(BigArrays.java:488) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.common.util.BigArrays.grow(BigArrays.java:502) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.search.aggregations.metrics.cardinality.HyperLogLogPlusPlus.ensureCapacity(HyperLogLogPlusPlus.java:197) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.search.aggregations.metrics.cardinality.HyperLogLogPlusPlus.collect(HyperLogLogPlusPlus.java:232) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.search.aggregations.metrics.cardinality.CardinalityAggregator$OrdinalsCollector.postCollect(CardinalityAggregator.java:280) ~[elasticsearch-5.4.0.jar:5.4.0]
            at org.elasticsearch.search.aggregations.metrics.cardinality.CardinalityAggregator.postCollectLastCollector(CardinalityAggregator.java:120) ~[elasticsearch-5.4.0.jar:5.4.0]
            at ...
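
The `JvmGcMonitorService` lines above each report how much of a sampling window was spent in garbage collection. As a rough sketch (the helper names `to_seconds` and `gc_overhead` are made up for illustration, and the regex only assumes the `spent [X] collecting in the last [Y]` wording seen in this log), the overhead ratio can be pulled out like this:

```python
import re

# Matches the "spent [X] collecting in the last [Y]" part of a GC monitor line.
# Durations in this log use either "ms" or "s" as the unit.
GC_PATTERN = re.compile(
    r"spent \[([\d.]+)(ms|s)\] collecting in the last \[([\d.]+)(ms|s)\]"
)


def to_seconds(value: str, unit: str) -> float:
    """Convert a duration such as ("777", "ms") or ("2.9", "s") to seconds."""
    return float(value) / 1000.0 if unit == "ms" else float(value)


def gc_overhead(line: str):
    """Return the fraction of the window spent in GC, or None if no match."""
    match = GC_PATTERN.search(line)
    if match is None:
        return None
    spent = to_seconds(match.group(1), match.group(2))
    window = to_seconds(match.group(3), match.group(4))
    return spent / window


if __name__ == "__main__":
    sample = (
        "[2017-09-21T03:10:57,807][WARN ][o.e.m.j.JvmGcMonitorService] "
        "[master-2] [gc][382497] overhead, spent [2.9s] collecting in the last [3s]"
    )
    print(f"GC overhead: {gc_overhead(sample):.0%}")
```

For the last warning before the crash (2.9s out of 3s), almost the entire window went to garbage collection, which is consistent with a heap that is nearly full.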

After that, some of my data nodes threw 'java.lang.OutOfMemoryError: Java heap space' and then went out of service.
My ES cluster stores service logs, with one index per day.
Mem: total 31G
jvm.options:
-Xms16g
-Xmx16g
There are 10 data nodes, 25 indices, and 164 shards.
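
The request circuit breaker line in the log gives two byte counts that can be checked against these heap settings. The sketch below is only arithmetic on the logged values. It assumes that `indices.breaker.request.limit` is left at its default, which in 5.x is documented as 60% of the JVM heap; if the setting was changed on this cluster, the implied heap size does not hold.

```python
GIB = 1024 ** 3

# Values copied from the request breaker line logged on master-2.
ATTEMPTED_BYTES = 7_480_688_384      # "New used memory ... [6.9gb]"
BREAKER_LIMIT_BYTES = 6_400_612_761  # "configured breaker: ... [5.9gb]"

# Assumption: the request breaker limit is at its default of 60% of heap.
ASSUMED_REQUEST_BREAKER_FRACTION = 0.60


def implied_heap_gib(limit_bytes: int, fraction: float) -> float:
    """Heap size (GiB) that would produce this breaker limit."""
    return limit_bytes / fraction / GIB


def overshoot_gib(attempted_bytes: int, limit_bytes: int) -> float:
    """How far (GiB) the request would have exceeded the breaker limit."""
    return (attempted_bytes - limit_bytes) / GIB


if __name__ == "__main__":
    heap = implied_heap_gib(BREAKER_LIMIT_BYTES, ASSUMED_REQUEST_BREAKER_FRACTION)
    over = overshoot_gib(ATTEMPTED_BYTES, BREAKER_LIMIT_BYTES)
    print(f"implied heap on this node: {heap:.1f} GiB")
    print(f"request exceeded the limit by: {over:.1f} GiB")
```

Under that assumption the node that logged the breaker message would have a heap of roughly 10 GiB rather than 16 GiB, so it may be worth checking whether every node really runs with the same `jvm.options`.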

Can anyone help?


(Christian Dahlqvist) #2

What kind of visualisations and aggregation types do you have on the dashboard? Are any of these configured in a way that would make them generate a lot of buckets?


(Steven) #3

The visualisations look like this, and there are about 9 visualisations similar to the one below:


(Christian Dahlqvist) #4

Based on the stack trace it looks like you have a cardinality aggregation. How is this configured? What is the output of the cluster stats API?
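
For context on why the configuration matters: the Elasticsearch documentation describes the memory cost of a cardinality aggregation as roughly `precision_threshold * 8` bytes, with a default threshold of 3000 and a maximum of 40000. Each bucket that carries a cardinality metric keeps its own counter, so the cost scales with the number of buckets. The sketch below is a back-of-the-envelope estimate only; the bucket counts are hypothetical and are not taken from this cluster.

```python
GIB = 1024 ** 3

# Documented defaults for the cardinality aggregation in Elasticsearch 5.x.
DEFAULT_PRECISION_THRESHOLD = 3000
MAX_PRECISION_THRESHOLD = 40000


def estimate_cardinality_bytes(bucket_count: int,
                               precision_threshold: int = DEFAULT_PRECISION_THRESHOLD) -> int:
    """Rough upper bound: about precision_threshold * 8 bytes per bucket.

    Thresholds above the supported maximum are capped at that maximum.
    """
    effective = min(precision_threshold, MAX_PRECISION_THRESHOLD)
    return bucket_count * effective * 8


if __name__ == "__main__":
    # Hypothetical bucket counts, chosen only to show how the estimate scales.
    for buckets in (20, 10_000, 300_000):
        estimate = estimate_cardinality_bytes(buckets)
        print(f"{buckets:>7} buckets -> {estimate / GIB:.3f} GiB per shard")
```

A handful of buckets costs next to nothing, but a parent aggregation that produces hundreds of thousands of buckets can push the estimate into the gigabyte range on a single shard.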


(Steven) #5

The cluster stats API output looks like this:

    {
      "_nodes": {
        "total": 11,
        "successful": 11,
        "failed": 0
      },
      "cluster_name": "cluster-es",
      "timestamp": 1505987366333,
      "status": "green",
      "indices": {
        "count": 25,
        "shards": {
          "total": 164,
          "primaries": 73,
          "replication": 1.2465753424657535,
          "index": {
            "shards": {
              "min": 2,
              "max": 10,
              "avg": 6.56
            },
            "primaries": {
              "min": 1,
              "max": 5,
              "avg": 2.92
            },
            "replication": {
              "min": 1,
              "max": 2,
              "avg": 1.24
            }
          }
        },
        "docs": {
          "count": 98633103,
          "deleted": 782416
        },
        "store": {
          "size": "207.6gb",
          "size_in_bytes": 222970379288,
          "throttle_time": "0s",
          "throttle_time_in_millis": 0
        },
        "fielddata": {
          "memory_size": "117.9mb",
          "memory_size_in_bytes": 123664240,
          "evictions": 0
        },
        "query_cache": {
          "memory_size": "173.1mb",
          "memory_size_in_bytes": 181511864,
          "total_count": 1771845,
          "hit_count": 803967,
          "miss_count": 967878,
          "cache_size": 20000,
          "cache_count": 47599,
          "evictions": 27599
        },
        "completion": {
          "size": "0b",
          "size_in_bytes": 0
        },
        "segments": {
          "count": 2203,
          "memory": "510.3mb",
          "memory_in_bytes": 535147110,
          "terms_memory": "394.6mb",
          "terms_memory_in_bytes": 413871655,
          "stored_fields_memory": "26.8mb",
          "stored_fields_memory_in_bytes": 28119184,
          "term_vectors_memory": "0b",
          "term_vectors_memory_in_bytes": 0,
          "norms_memory": "17.5mb",
          "norms_memory_in_bytes": 18393216,
          "points_memory": "5.7mb",
          "points_memory_in_bytes": 6074275,
          "doc_values_memory": "65.5mb",
          "doc_values_memory_in_bytes": 68688780,
          "index_writer_memory": "48.7mb",
          "index_writer_memory_in_bytes": 51169715,
          "version_map_memory": "189.2kb",
          "version_map_memory_in_bytes": 193810,
          "fixed_bit_set": "0b",
          "fixed_bit_set_memory_in_bytes": 0,
          "max_unsafe_auto_id_timestamp": 1505984088388,
          "file_sizes": {}
        }
      },
      "nodes": {
        "count": {
          "total": 11,
          "data": 9,
          "coordinating_only": 0,
          "master": 2,
          "ingest": 11
        },
        "versions": [
          "5.4.0"
        ],
        "os": {
          "available_processors": 88,
          "allocated_processors": 88,
          "names": [
            {
              "name": "Linux",
              "count": 11
            }
          ],
          "mem": {
            "total": "345.6gb",
            "total_in_bytes": 371108069376,
            "free": "26.2gb",
            "free_in_bytes": 28156194816,
            "used": "319.3gb",
            "used_in_bytes": 342951874560,
            "free_percent": 8,
            "used_percent": 92
          }
        },
        "process": {
          "cpu": {
            "percent": 6
          },
          "open_file_descriptors": {
            "min": 514,
            "max": 573,
            "avg": 552
          }
        },
        "jvm": {
          "max_uptime": "4.7d",
          "max_uptime_in_millis": 406926141,
          "versions": [
            {
              "version": "1.8.0_131",
              "vm_name": "OpenJDK 64-Bit Server VM",
              "vm_version": "25.131-b11",
              "vm_vendor": "Oracle Corporation",
              "count": 11
            }
          ],
          "mem": {
            "heap_used": "68.5gb",
            "heap_used_in_bytes": 73629830552,
            "heap_max": "167.2gb",
            "heap_max_in_bytes": 179621593088
          },
          "threads": 1051
        },
        "fs": {
          "total": "10tb",
          "total_in_bytes": 11095911620608,
          "free": "9.8tb",
          "free_in_bytes": 10842620862464,
          "available": "9.3tb",
          "available_in_bytes": 10332199518208
        },
        "network_types": {
          "transport_types": {
            "netty4": 11
          },
          "http_types": {
            "netty4": 11
          }
        }
      }
    }
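
A few ratios make this output easier to read. The sketch below fetches `GET /_cluster/stats` and derives them from the same fields shown above. The base URL is an assumption (a node reachable on `localhost:9200` without authentication), and `summarize` is a hypothetical helper name.

```python
import json
import urllib.request

GIB = 1024 ** 3


def fetch_cluster_stats(base_url: str = "http://localhost:9200") -> dict:
    """Call the cluster stats API. The base URL is an assumption."""
    with urllib.request.urlopen(f"{base_url}/_cluster/stats") as response:
        return json.load(response)


def summarize(stats: dict) -> dict:
    """Derive a few ratios from a cluster stats response."""
    jvm_mem = stats["nodes"]["jvm"]["mem"]
    indices = stats["indices"]
    return {
        # Cluster-wide heap in use at the moment of the call.
        "heap_used_ratio": jvm_mem["heap_used_in_bytes"] / jvm_mem["heap_max_in_bytes"],
        # Average on-disk size per shard copy, in GiB.
        "avg_store_gib_per_shard": (
            indices["store"]["size_in_bytes"] / indices["shards"]["total"] / GIB
        ),
        "data_nodes": stats["nodes"]["count"]["data"],
    }


if __name__ == "__main__":
    summary = summarize(fetch_cluster_stats())
    for name, value in summary.items():
        print(f"{name}: {value}")
```

With the numbers posted above, heap usage is around 41% and the average shard is a little over 1 GiB, so the cluster is not under memory pressure at rest. The stats are a snapshot, though, and would not show a spike caused by a single expensive search.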

(Steven) #6

Any suggestions?


(Steven) #7

Solved by removing the aggregations that contained lots of buckets.


(Christian Dahlqvist) #8

What type of aggregation did you have that created lots of buckets? How was it configured?

The reason I am asking is that it may be useful for other users as a reference and example.


(Steven) #9

1. Get a unique count.
2. Search for each unique key in the logs and get another unique key.
3. Order by the count from step one.

Important: when you want the aggregations ordered by a metric, you must do it in bar charts.
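
For reference, one plausible reading of the three steps above is a terms aggregation on one key, with a cardinality sub-aggregation on a second key, ordered by that cardinality. The sketch below builds such a request body. The field names (`client_id`, `session_id`, `@timestamp`) and the aggregation names are hypothetical, since the thread does not show the actual visualisation settings.

```python
import json


def build_request(key_field: str, other_field: str,
                  top_n: int = 20, precision_threshold: int = 3000) -> dict:
    """Terms aggregation ordered by a cardinality sub-aggregation.

    All field and aggregation names are placeholders for illustration.
    """
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "query": {"range": {"@timestamp": {"gte": "now-7d"}}},
        "aggs": {
            "by_key": {
                "terms": {
                    "field": key_field,
                    "size": top_n,
                    # Order the buckets by the unique count computed below.
                    "order": {"unique_count": "desc"},
                },
                "aggs": {
                    "unique_count": {
                        "cardinality": {
                            "field": other_field,
                            "precision_threshold": precision_threshold,
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    body = build_request("client_id", "session_id")
    print(json.dumps(body, indent=2))
```

Because the buckets are ordered by the sub-aggregation, each shard has to compute the cardinality for every distinct value of the key field before it can pick the top `size` buckets. On a high-cardinality key over 7 days of logs, that is likely what produced the large number of buckets and the growth in `HyperLogLogPlusPlus` seen in the stack trace.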

