I tried restoring a snapshot in Elasticsearch and used the _cat/recovery API to monitor how long the restore took in the cluster. On checking, I found that the restore is reported on a per-shard basis. For example, an index with 5 shards shows a separate restore time for each shard. Are these shards restored in parallel with each other, or sequentially, with the first shard restored, then the second, and so on?
It depends on several factors, including the number of shards, their allocation (same node or different nodes), and the settings for per-node concurrent recoveries, which in turn depend on the version of Elasticsearch we are talking about. In other words, it's parallel to a certain degree, but not to the point where too many concurrent recoveries would significantly degrade the cluster.
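One way to see how parallel a particular restore actually was is to take the per-shard start and stop times reported by _cat/recovery and count how many recoveries overlapped at the busiest moment. The following is a minimal sketch in Python, not an official tool. The function name `max_concurrent_recoveries` and the sample timestamps are made up for illustration, and it assumes you have already extracted (start, stop) pairs in milliseconds for each shard from your own _cat/recovery output.

```python
from typing import Iterable, Tuple


def max_concurrent_recoveries(intervals: Iterable[Tuple[int, int]]) -> int:
    """Return the largest number of shard recoveries that overlapped in time.

    Each interval is a (start_millis, stop_millis) pair for one shard.
    """
    events = []
    for start, stop in intervals:
        events.append((start, 1))   # a recovery begins
        events.append((stop, -1))   # a recovery ends
    # Sort by timestamp; at equal timestamps, ends (-1) sort before starts (+1),
    # so back-to-back recoveries are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))

    running = 0
    peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


# Hypothetical timings for a 5-shard index (milliseconds since restore began).
sample = [(0, 400), (10, 380), (390, 800), (405, 790), (810, 1200)]
print(max_concurrent_recoveries(sample))  # prints 2
```

A peak of 1 would mean the shards were restored strictly one after another. A peak equal to the number of shards would mean they all ran at once. Anything in between suggests a concurrency limit (or shard placement) was throttling the restore.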