GC takes a lot of time

We use Elasticsearch to store log data.
Currently we have 16 data nodes and 3 master nodes.
Each data node has a 24-core CPU, 12 x 2.6T disks, and 32G of memory.
We give about half of the memory to Elasticsearch.
We use index names like bussness-2018.01.01. When the shard count grew to 27000, GC started behaving like this:

    [2018-01-08T13:32:13,437][WARN ][o.e.m.j.JvmGcMonitorService] [es-data-m-2] [gc][1295560] overhead, spent [15.4s] collecting in the last [16.2s]
    [2018-01-08T13:32:53,044][INFO ][o.e.m.j.JvmGcMonitorService] [es-data-m-2] [gc][old][1295585][122299] duration [15.3s], collections [2]/[15.4s], total [15.3s]/[12.2h], memory [13.7gb]->[12.4gb]/[13.8gb], all_pools {[young] [1.1gb]->[2.4mb]/[1.1gb]}{[survivor] [149.7mb]->[0b]/[149.7mb]}{[old] [12.4gb]->[12.4gb]/[12.5gb]}
    [2018-01-08T13:32:53,044][WARN ][o.e.m.j.JvmGcMonitorService] [es-data-m-2] [gc][1295585] overhead, spent [15.3s] collecting in the last [15.4s]
    [2018-01-08T13:33:16,008][WARN ][o.e.m.j.JvmGcMonitorService] [es-data-m-2] [gc][old][1295593][122300] duration [14.4s], collections [1]/[15.9s], total [14.4s]/[12.2h], memory [13.6gb]->[12.7gb]/[13.8gb], all_pools {[young] [1gb]->[253.4mb]/[1.1gb]}{[survivor] [131.8mb]->[0b]/[149.7mb]}{[old] [12.4gb]->[12.4gb]/[12.5gb]}
    [2018-01-08T13:33:16,023][WARN ][o.e.m.j.JvmGcMonitorService] [es-data-m-2] [gc][1295593] overhead, spent [14.9s] collecting in the last [15.9s]
    [2018-01-08T13:33:33,060][WARN ][o.e.m.j.JvmGcMonitorService] [es-data-m-2] [gc][old][1295596][122301] duration [14.7s], collections [1]/[15s], total [14.7s]/[12.2h], memory [13.4gb]->[12.2gb]/[13.8gb], all_pools {[young] [859.2mb]->[2.5mb]/[1.1gb]}{[survivor] [113.3mb]->[0b]/[149.7mb]}{[old] [12.5gb]->[12.1gb]/[12.5gb]}
    [2018-01-08T13:33:33,060][WARN ][o.e.m.j.JvmGcMonitorService] [es-data-m-2] [gc][1295596] overhead, spent [14.7s] collecting in the last [15s]
    [2018-01-08T13:35:32,803][WARN ][o.e.m.j.JvmGcMonitorService] [es-data-m-2] [gc][old][1295700][122316] duration [15.8s], collections [1]/[16.2s], total [15.8s]/[12.2h], memory [13.4gb]->[12.5gb]/[13.8gb], all_pools {[young] [985.1mb]->[163.6mb]/[1.1gb]}{[survivor] [125mb]->[0b]/[149.7mb]}{[old] [12.4gb]->[12.4gb]/[12.5gb]}
    [2018-01-08T13:35:32,804][WARN ][o.e.m.j.JvmGcMonitorService] [es-data-m-2] [gc][1295700] overhead, spent [15.8s] collecting in the last [16.2s]
    [2018-01-08T13:35:49,828][WARN ][o.e.m.j.JvmGcMonitorService] [es-data-m-2] [gc][old][1295703][122317] duration [14.7s], collections [1]/[15s], total [14.7s]/[12.2h], memory [13gb]->[12.1gb]/[13.8gb], all_pools {[young] [448.8mb]->[27mb]/[1.1gb]}{[survivor] [116.6mb]->[0b]/[149.7mb]}{[old] [12.4gb]->[12.1gb]/[12.5gb]}

Then we deleted some indices to bring it down to 17000, and everything was OK.
But two days later, the same problem occurred.
We had to merge some small indices into bigger ones, bringing the number of shards down to 9000.
It went well for about 3 days, then the GC problems occurred again.
We deleted some indices to get it to 7000, and now it's OK.
How should we improve the GC performance? We can't keep deleting indices.
And this is the JVM of a freshly restarted data node.

{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "myesdb",
  "nodes" : {
    "pw4f0oWfT9KGSGi84AKm4w" : {
      "timestamp" : 1515421426980,
      "name" : "es-data-m-9",
      "transport_address" : "10.83.56.10:9300",
      "host" : "10.83.56.10",
      "ip" : "10.83.56.10:9300",
      "roles" : [
        "data",
        "ingest"
      ],
      "jvm" : {
        "timestamp" : 1515421426981,
        "uptime_in_millis" : 11791868,
        "mem" : {
          "heap_used_in_bytes" : 10351075064,
          "heap_used_percent" : 69,
          "heap_committed_in_bytes" : 14875361280,
          "heap_max_in_bytes" : 14875361280,
          "non_heap_used_in_bytes" : 145224888,
          "non_heap_committed_in_bytes" : 152145920,
          "pools" : {
            "young" : {
              "used_in_bytes" : 988229576,
              "max_in_bytes" : 1256259584,
              "peak_used_in_bytes" : 1256259584,
              "peak_max_in_bytes" : 1256259584
            },
            "survivor" : {
              "used_in_bytes" : 154718320,
              "max_in_bytes" : 157024256,
              "peak_used_in_bytes" : 157024256,
              "peak_max_in_bytes" : 157024256
            },
            "old" : {
              "used_in_bytes" : 9208127168,
              "max_in_bytes" : 13462077440,
              "peak_used_in_bytes" : 13088728336,
              "peak_max_in_bytes" : 13462077440
            }
          }
        },
        "threads" : {
          "count" : 272,
          "peak_count" : 314
        },
        "gc" : {
          "collectors" : {
            "young" : {
              "collection_count" : 2845,
              "collection_time_in_millis" : 168440
            },
            "old" : {
              "collection_count" : 981,
              "collection_time_in_millis" : 202767
            }
          }
        },
        "buffer_pools" : {
          "direct" : {
            "count" : 274,
            "used_in_bytes" : 815798639,
            "total_capacity_in_bytes" : 815798638
          },
          "mapped" : {
            "count" : 22171,
            "used_in_bytes" : 1962403489862,
            "total_capacity_in_bytes" : 1962403489862
          }
        },
        "classes" : {
          "current_loaded_count" : 12300,
          "total_loaded_count" : 12511,
          "total_unloaded_count" : 211
        }
      }
    }
  }
}
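
(Output like the above comes from the JVM section of the nodes stats API; the HTTP port here is an assumption:)

    # JVM stats for a single node
    curl -s 'http://10.83.56.10:9200/_nodes/es-data-m-9/stats/jvm?pretty'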

27K shards sounds like a lot of shards (almost 1.7K per data node). Maybe you have too many shards given the size of your data.

  1. How many indices do you have?
  2. How much data are you storing in each of your daily indices?

Have a look at this blog post for some general guidelines.
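
To answer those, the _cat APIs give a quick overview (host and port here are placeholders):

    # one line per index: primary/replica counts, doc count, store size
    curl -s 'http://localhost:9200/_cat/indices?v&h=index,pri,rep,docs.count,store.size'

    # one line per shard, handy for spotting shards that are far too small or far too large
    curl -s 'http://localhost:9200/_cat/shards?v&h=index,shard,prirep,store'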


Now we have 7000 shards across about 368 indices. We store about 2.5T of data every day, and we use ES 5.5.1.

What is your average shard size? How much heap have you got configured for each data node? Are you forcemerging indices no longer being written to (assuming you have time-based indices) in order to reduce the number of segments?
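
For reference, a force merge of an index that is no longer being written to looks roughly like this (the date in the index name is just an example; only run it on indices that no longer receive writes):

    # merge each shard of an old daily index down to a single segment
    curl -s -XPOST 'http://localhost:9200/bussness-2018.01.01/_forcemerge?max_num_segments=1'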

The shard sizes vary from 2G to 400G, depending on how much log data each service produces; most of them are less than 100G. I use the reindex API to merge the small indices into a big one and then delete the small ones. I start Elasticsearch with -Xms14G -Xmx14G.
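
That reindex step is roughly the following (the source pattern and destination name are made up for illustration):

    # copy several small daily indices into one consolidated index;
    # the small source indices can be deleted afterwards
    curl -s -XPOST 'http://localhost:9200/_reindex' -H 'Content-Type: application/json' -d '
    {
      "source": { "index": "smallservice-2018.01.*" },
      "dest":   { "index": "smallservice-2018.01" }
    }'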

If you are using time-based indices, you might want to look into using the rollover index API to create new indices only when they reach a particular age and/or size. This can save you from having to reindex a lot of data and still allows you to handle peaky traffic without certain indices/shards getting too large. This is also described in this blog post.
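
A minimal sketch of rollover with the 5.x conditions (alias and index names are made up; on 5.5 the supported conditions are max_age and max_docs):

    # bootstrap: create the first index behind a write alias
    curl -s -XPUT 'http://localhost:9200/logs-bussness-000001' -H 'Content-Type: application/json' -d '
    {
      "aliases": { "logs-bussness-write": {} }
    }'

    # call this periodically (e.g. from cron): a new index is only created
    # when a condition is met, and the alias then points to the new index
    curl -s -XPOST 'http://localhost:9200/logs-bussness-write/_rollover' -H 'Content-Type: application/json' -d '
    {
      "conditions": {
        "max_age": "1d",
        "max_docs": 500000000
      }
    }'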

100G per shard sounds way too big. A size between 20GB and 50GB per shard is much more reasonable. So you have a mix of shards that are too small (lots of 2GB shards) and shards that are too big (lots of 100GB+ shards).

Indices with lots of data should probably be configured with more primary shards, while indices receiving less data should be configured with fewer primary shards (see the template sketch below).
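
A sketch using the 5.x index template syntax (template names, patterns and shard counts are only examples to adapt per service):

    # high-volume services: more primary shards per daily index
    curl -s -XPUT 'http://localhost:9200/_template/big_services' -H 'Content-Type: application/json' -d '
    {
      "template": "bigservice-*",
      "settings": { "number_of_shards": 6, "number_of_replicas": 1 }
    }'

    # low-volume services: a single primary shard is enough
    curl -s -XPUT 'http://localhost:9200/_template/small_services' -H 'Content-Type: application/json' -d '
    {
      "template": "smallservice-*",
      "settings": { "number_of_shards": 1, "number_of_replicas": 1 }
    }'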

Also look into the rollover index API like Christian suggested.

Thanks a lot, that really helps. I will do some research on these articles. One more question: we use Filebeat to write data directly to ES. Does Filebeat support the rollover index API?

The rollover index API is something you set up and call on the ES side directly: whenever an index grows too big (or gets too old), it gets rolled over. Filebeat has no notion of that feature at all; it is completely transparent to it.
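
In practice that means Filebeat just keeps indexing into the same alias name; rollover only changes which concrete index sits behind it. For example (alias name and document type are made up):

    # which concrete index is currently behind the write alias?
    curl -s 'http://localhost:9200/_cat/aliases/logs-bussness-write?v'

    # writes through the alias always land in that current index
    curl -s -XPOST 'http://localhost:9200/logs-bussness-write/log' -H 'Content-Type: application/json' -d '
    { "@timestamp": "2018-01-08T13:32:13Z", "message": "example log line" }'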

Oh, I see, I just looked at the API. Thanks a lot, I will close the issue for now. Again, thank you very much, guys.

If you are not already, you can use curator to manage the rollover process.

It looks like you are using ingest nodes. If so, I would generally also recommend using dedicated nodes for this.

OK, cool, thanks again. I will try Curator and set up dedicated ingest nodes too. So happy to now know what to do. :rofl:
