Need some help understanding why retrieving all snapshots is taking forever

Hi all,
We are using Elasticsearch to store our system's logs via Filebeat. Our cluster runs Elasticsearch 6.8, and its cluster health looks like this:

{
  "cluster_name" : "elk",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 24,
  "number_of_data_nodes" : 21,
  "active_primary_shards" : 4284,
  "active_shards" : 8568,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

I have set up an S3 snapshot repository. A Python cron job snapshots indices older than 30 days to S3 and then deletes them from the cluster. Lately, when I try to list the snapshots with the command
curl http://localhost:9200/_cat/snapshots/repo_name?pretty
it takes forever to return (>10m). Most of the time I just give up, and my cron job is failing because it times out on its calls to get the status of the snapshot. Any suggestions or help would be much appreciated. Thanks

Elasticsearch will reply eventually, but with large snapshot repositories it may indeed take minutes (sometimes hours) to list them all, especially on such an old version. 6.8 hasn't seen any enhancements for over 5 years now and went EOL ages ago. You need to upgrade as a matter of urgency.

Thanks a lot David. We have plans to upgrade. I was wondering if there is anything I can do in the meantime to reduce the latency of retrieving the snapshots. Perhaps the only way is to upgrade.

I don't remember any workarounds but it's been so long since I've even looked at the 6.8 code, sorry.
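
Not a confirmed fix, but one thing that may be worth trying from the cron job: instead of listing every snapshot through _cat/snapshots, ask for the one snapshot the job just created by name, so Elasticsearch does not have to read the metadata of every snapshot in the S3 repository. The sketch below assumes the job knows the snapshot name; the helper names (snapshot_url, fetch_json, snapshot_state) and the example snapshot name are made up for illustration. It also shows the verbose=false flag of the get snapshot API for the case where a full listing is really needed. I believe that flag exists on 6.8, but treat that as an assumption and check it against the 6.8 docs; whether either approach is fast enough on a repository this size is untested.

```python
import json
import urllib.parse
import urllib.request

ES_URL = "http://localhost:9200"  # host and port from the curl command above
REPO = "repo_name"                # placeholder repository name from the question


def snapshot_url(repo, snapshot, verbose=None, base=ES_URL):
    """Build a get-snapshot URL for one named snapshot (or "_all")."""
    path = "/_snapshot/{}/{}".format(
        urllib.parse.quote(repo, safe=""),
        urllib.parse.quote(snapshot, safe=""),
    )
    if verbose is not None:
        # verbose=false asks for a lighter listing (assumed available on 6.8)
        path += "?verbose=" + ("true" if verbose else "false")
    return base + path


def fetch_json(url, timeout=60):
    """GET a URL and decode the JSON body; raises on HTTP errors or timeout."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def snapshot_state(repo, snapshot):
    """Return the state (e.g. SUCCESS, IN_PROGRESS) of one named snapshot."""
    body = fetch_json(snapshot_url(repo, snapshot))
    # Assumes the 6.x response shape: {"snapshots": [{"state": ...}]}
    return body["snapshots"][0]["state"]


# Hypothetical usage inside the cron job (snapshot name is illustrative):
#   state = snapshot_state(REPO, "logs-2024.01.01")
#   names = [s["snapshot"] for s in
#            fetch_json(snapshot_url(REPO, "_all", verbose=False))["snapshots"]]
```

Nothing here talks to the cluster until snapshot_state or fetch_json is called, so the URL building can be checked on its own before pointing it at production.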