I am trying to restore a 50GB Elasticsearch dump file into a new Elasticsearch cluster running with 3 replicas on a Kubernetes cluster.
Elasticsearch is running the 7.10.2 OSS image and uses managed-nfs-storage as the storage class. The restore started without any issues, but the indices were in a yellow state from the very beginning.
Later, when I try to check the state of the indices, the API call continuously fails with a client request timeout error.
I don't see any errors in the Elasticsearch pod logs, but I noticed some read/write errors from the NFS client deployed on the worker nodes.
NFS client errors observed in /var/log/messages on the worker nodes:
Jun 21 14:57:48 cstream8-node kernel: NFS: __nfs4_reclaim_open_state: Lock reclaim failed!
Jun 21 14:57:53 cstream8-node kernel: __nfs4_reclaim_open_state: 603 callbacks suppressed
Jun 21 14:57:53 cstream8-node kernel: NFS: __nfs4_reclaim_open_state: Lock reclaim failed!
Read/write errors from the nfsiostat command executed on the worker nodes:
Is there any limitation on the data size of the dump being restored?
--> Not sure about the upper limit, but 50GB is a really small data set; you should be able to restore it easily, depending on how powerful your data nodes are.
Is there any way to restore such a huge dump without hitting any such errors?
--> Are you trying to query the cluster where the restore is in progress?
Can you provide more context on what you are trying to restore and how? It is not clear.
Did you create a dump file using elasticdump or a similar tool? If yes, this is not an officially supported way of restoring data into an Elasticsearch cluster; the official and recommended way to restore data into a cluster is the Snapshot and Restore APIs.
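For reference, the supported flow looks roughly like the sketch below. The repository name (`my_backup`), snapshot name (`snapshot_1`), host/port, and location path are all placeholders, and the `location` must be under a path listed in `path.repo` on every node:

```shell
# Register a shared-filesystem snapshot repository (placeholder names/paths)
curl -X PUT "localhost:9200/_snapshot/my_backup" \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/mnt/snapshots"}}'

# Take a snapshot on the source cluster
curl -X PUT "localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=false"

# Restore it on the target cluster (its repository must point at the same files)
curl -X POST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore"
```

The restore runs in the background; the POST returning does not mean the data is fully recovered yet.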
NFS is supported, but it is not recommended for data paths, only for snapshot repositories.
Need more context, as it is not clear what you are trying to restore and whether this is an Elasticsearch issue or an infrastructure issue.
Not sure about the upper limit, but 50GB is a really small data set; you should be able to restore it easily, depending on how powerful your data nodes are.
-> Can you please tell me how to check if the data nodes are really capable of handling this request?
Are you trying to query the cluster where the restore is in progress?
-> Yes, I am trying to list the indices while the restore is in progress, because it is taking a very long time.
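If the client keeps timing out while the cluster is busy recovering, one option is to raise the client-side timeout and ask only for the shards that are actually in flight, rather than listing every index. A sketch (host/port are placeholders; `-m` is curl's maximum time in seconds):

```shell
# Show only shards that are actively recovering
curl -m 120 "localhost:9200/_cat/recovery?v&active_only=true"

# Cluster health with an explicit server-side wait instead of polling
curl -m 120 "localhost:9200/_cluster/health?wait_for_status=yellow&timeout=60s"
```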
Did you create a dump file using elasticdump or a similar tool?
-> No, I am not using any tools for the dump and restore; instead, I used the Elasticsearch APIs to take the dump and restore it.
NFS is supported, but it is not recommended for data paths, only for snapshot repositories.
-> Actually, I am using NFS storage for both the snapshot repositories and the data paths of each Elasticsearch replica.
Need more context, as it is not clear what you are trying to restore and whether this is an Elasticsearch issue or an infrastructure issue.
-> I also suspect the NFS storage, since there are read/write errors appearing in the NFS statistics.
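On the earlier question of how to check whether the data nodes can cope, a rough starting point is the node stats APIs (host/port are placeholders; the column names are standard `_cat/nodes` fields in 7.x):

```shell
# Per-node heap, RAM, CPU, load, and disk usage at a glance
curl "localhost:9200/_cat/nodes?v&h=name,heap.percent,ram.percent,cpu,load_1m,disk.used_percent"

# Detailed per-node filesystem, JVM, and OS statistics
curl "localhost:9200/_nodes/stats/fs,jvm,os?pretty"
```

Sustained high load, heap pressure, or nearly full disks during the restore would point to undersized nodes or slow storage.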
This version is very old, long past EOL. You must upgrade as a matter of urgency.
This error is not coming from Elasticsearch; it's a timeout imposed by something else, and it does not mean the restore has failed. I don't remember whether anything changed in the few years since 7.10 was released, but in supported versions the restore will carry on in the background and eventually complete successfully.
Elasticsearch requires the filesystem to act as if it were backed by a local disk, but this means that it will work correctly on properly-configured remote block devices (e.g. a SAN) and remote filesystems (e.g. NFS) as long as the remote storage behaves no differently from local storage. [...] The performance of an Elasticsearch cluster is often limited by the performance of the underlying storage, so you must ensure that your storage supports acceptable performance. Some remote storage performs very poorly, especially under the kind of load that Elasticsearch imposes, so make sure to benchmark your system carefully before committing to a particular storage architecture.
i.e. you're free to use NFS if you want, but it's on you to make sure it is properly configured and performs adequately. If it doesn't, you need someone with NFS expertise to help you (and this is probably not the best forum to find such help).
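One common way to benchmark storage before committing to it is fio. A minimal sketch, assuming fio is installed on the worker node and `/mnt/nfs` is the NFS-backed data path (adjust size, path, and mix to approximate your workload):

```shell
# Mixed random read/write load against the NFS mount, bypassing the page cache
fio --name=es-sim --directory=/mnt/nfs \
    --rw=randrw --rwmixread=70 --bs=4k \
    --size=1g --numjobs=4 --iodepth=16 \
    --ioengine=libaio --direct=1 --group_reporting
```

Compare the resulting IOPS and latency against a local disk on the same node; a large gap would support the storage-bottleneck theory.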
I have no further information beyond the docs link in my previous post. You will need to find some NFS experts to help you, and this isn't the right forum for that.