JVM > 90% - Small Indexes, High Shard Count

Hello,

My team is having some issues with our Elasticsearch cluster: the JVM heap keeps growing and we keep having to provide more RAM for what is a fairly simple use case (described below). Would it be wise to scale out to more instances to keep up with the reads, or is there an issue we have missed?

Use case: We batch load a new index every hour to be read by our front-end unit. The index is 20 MB in size and is read 3,000 to 4,000 times per second.
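
For context, the hourly load is essentially a fresh index plus a bulk write. A minimal sketch of the idea (the index name, type, and documents here are placeholders, not our real ones):

    PUT /scores-2015-09-25-14

    POST /scores-2015-09-25-14/score/_bulk
    { "index": { "_id": "1" } }
    { "date": "2015-09-25T14:00:00Z", "url": "http://example.com/a", "score": 42 }
    { "index": { "_id": "2" } }
    { "date": "2015-09-25T14:05:00Z", "url": "http://example.com/b", "score": 17 }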

Cluster size:
Nodes: 13, Indices: 1293, Shards: 12728, Data: 199.52 GB, CPU: 2423%, Memory: 308.73 GB / 374.25 GB
3 master nodes - 32 GB RAM, 4 cores
10 slave nodes - 64 GB RAM, 8 cores

      "jvm" : {
    "timestamp" : 1443189668624,
    "uptime_in_millis" : 169701260,
    "mem" : {
      "heap_used_in_bytes" : 30704178392,
      "heap_used_percent" : 89,
      "heap_committed_in_bytes" : 34290008064,
      "heap_max_in_bytes" : 34290008064,
      "non_heap_used_in_bytes" : 126976704,
      "non_heap_committed_in_bytes" : 129069056,
      "pools" : {
        "young" : {
          "used_in_bytes" : 189203944,
          "max_in_bytes" : 558432256,
          "peak_used_in_bytes" : 558432256,
          "peak_max_in_bytes" : 558432256
        },
        "survivor" : {
          "used_in_bytes" : 4907360,
          "max_in_bytes" : 69730304,
          "peak_used_in_bytes" : 69730304,
          "peak_max_in_bytes" : 69730304
        },
        "old" : {
          "used_in_bytes" : 30510067088,
          "max_in_bytes" : 33661845504,
          "peak_used_in_bytes" : 30778888464,
          "peak_max_in_bytes" : 33661845504
        }
      }
    },
    "threads" : {
      "count" : 103,
      "peak_count" : 129
    },
    "gc" : {
      "collectors" : {
        "young" : {
          "collection_count" : 11403,
          "collection_time_in_millis" : 640445
        },
        "old" : {
          "collection_count" : 3068,
          "collection_time_in_millis" : 317938
        }
      }
    },
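
(That is the "jvm" section of the node stats output; something like the following returns it, along with a quick per-node heap overview:)

    GET /_nodes/stats/jvm
    GET /_cat/nodes?v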

What sort of data is it? Are you using parent/child or nested documents? What sort of queries do you run?

Simple data set. Four columns: id, date, url, score.

We query based on date, url, and score, and have no parent/child relationships.
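
A representative query is something along these lines (the index name and values here are placeholders; the real queries vary):

    GET /scores-2015-09-25-14/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "url": "http://example.com/a" } },
            { "range": { "date": { "gte": "2015-09-25T14:00:00Z" } } },
            { "range": { "score": { "gte": 10 } } }
          ]
        }
      }
    }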

We continue to see high JVM heap usage and are looking into persistent connections and possible per-request overhead.

front end unit > nginx proxy > elasticsearch

I have done some more digging to find the problem. I think the high JVM heap has to do with the number of persistent connections on our system.

I added the keepalive parameter to nginx, and at the Linux OS level I now see only 450 connections instead of the 4,500 before. However, I continue to see the http "total_opened" counters rise on every node.
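
(Those counters come from the http section of the node stats API, which reports both current_open and total_opened for each node:)

    GET /_nodes/stats/http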

  • "total_opened" : 7452541
  • "total_opened" : 1770832
  • "total_opened" : 1770971
  • "total_opened" : 1770846
  • "total_opened" : 1735536
  • "total_opened" : 1770768
  • "total_opened" : 1770788

I even went as far as shutting nginx off for 15 minutes, yet the counters continued to rise. Can anyone provide some insight as to why?

Opened is a cumulative count rather than a current active one.

I discovered the issue with our setup. Each shard is a Lucene index, and Lucene has overhead when loaded into RAM, on the order of 20 to 30%.

Since we create a new index every hour, we keep piling that Lucene overhead into memory for every shard we create: 10 shards with 3 replicas = 30 new shards every hour, each with its own overhead. With almost 13,000 shards across the cluster, it adds up quickly.
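
Since each hourly index is only about 20 MB, one way to rein this in is to create it with far fewer shards. As an illustration only (the exact numbers depend on the read load):

    PUT /scores-2015-09-25-15
    {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 2
      }
    }

A single 20 MB shard plus a couple of replicas still lets several nodes serve the reads, while adding 3 shards an hour instead of 30.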

Just in case anyone runs into this issue in the future.