Unable to migrate indices from old 6.1.2 elasticsearch node to 7.8 elasticsearch node

cmiles · August 5, 2020, 10:07pm

Hello,

We have a relatively old ELK stack that we're attempting to migrate off. It's been neglected and left running to collect some logs from our F5 load balancers. I'm attempting to migrate the daily f5 indices to our new 7.8 ELK stack however the old 6.1.2 ELK is in a bad state and timing out on both reindex API and logstash elasticsearch input. I believe it has too many shards and the JVM garbage collector is causing it to crash whenever I initiate a migration.

How can I get the 6.1.2 ELK stack in a better state so I can initiate the index migration? Looking into reducing shards however I'm not sure if that'll fix it. Here is some information regarding the ES health:

    [root@prod-elastic tmp]#  curl -XGET '10.21.93.121:9200/_cluster/health?pretty'
{
  "cluster_name" : "elasticsearch",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 4221,
  "active_shards" : 4221,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 4220,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 50.00592346878332
}

I'm seeing the GC message every few seconds in the elasticsearch.log:

    [prod-elastic] [gc][923] overhead, spent [11.7s] collecting in the last [11.9s]

I can start a reindex API request from my 7.8 cluster, however that looks like it stops working ~halfway through. It also causes the 6.1.2 cluster to become unresponsive to any API queries.

Both of the ES nodes are a VM running on the same vcenter. Here's the stats for the 6.1.2 VM:
CPU: 128
Memory: 128
JVMHeap: 12gb

Any suggestions on what I can do to migrate the indices off this cluster? We have around 3 years of historical data we'd like to keep. I can provide more details as requested.
Thanks!

Steve_Mushero · August 6, 2020, 6:20am

Well, you have a lot of shards and a smallish heap - step one is raising heap to 32GB or so to stop GC and other issues - you have 128GB of RAM - even though it's slower, you can also go much higher if it keeps you from crashing (like 64GB but should not be needed; 32GB is a lot for just metadata).

If that's not stable, close the older indices you don't need, then reindex/migrate the rest and then close those, open others, etc. in a rolling manner to keep index/shard count under control.

cmiles · August 26, 2020, 11:13pm

Thanks, bumping the JVM heap up temporarily allowed us to migrate the indices off. We were able to consolidate them to reduce shard count.

system · September 23, 2020, 11:13pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Can't move ES indexes to another disk Elasticsearch	3	479	December 21, 2020
Migrate indices from elasticsearsh 6.8 to 7.17 Elasticsearch	3	637	March 16, 2023
ElasticSearch timeouts after upgrade to 6.5.1 Elasticsearch	19	1636	January 11, 2019
Migration of Elasticsearch indices (58 GB) from 1.x to 2.x taking huge time Elasticsearch	3	470	February 28, 2017
Need help: Can't migrate indices between clusters :( Elasticsearch	3	321	July 6, 2017

Unable to migrate indices from old 6.1.2 elasticsearch node to 7.8 elasticsearch node

Related topics