Usage of GPFS file system on 10 node cluster

es1024 · February 1, 2016, 8:40pm

We are setting up a 10 node ES cluster for one of our clients and will be loading/querying ES data in the terabytes. The current system was configured with a GPFS file system, and I have some concerns about the network traffic this will generate. Does anyone have experience supporting very large clusters running on GPFS or other shared file systems? Is this a good idea? We do not have local storage available. Are there any other options (NFS, etc.)? Sorry, I'm not a storage guy, but I would appreciate any help!

Thanks

warkolm · February 1, 2016, 8:43pm

Do you mean GPFS as described on Wikipedia?

If so, I wouldn't. Running a distributed system on a distributed FS is asking for slowness.

es1024 · February 1, 2016, 9:07pm

I'm not sure we're going to have a choice about the file system, although I'd like to be able to tell my customer why it's not a good choice.

If we go this route, are there any configurations or anything else that would help with performance?

warkolm · February 2, 2016, 3:05am

You have to wait for a query to go from node 1 to node N in your cluster. Node N then needs to go over the network again to location A on the clustered filesystem to collect the data and bring it back for whatever work you need to do.

And if you lose a node, ES will try to reallocate its shards, which also impacts the FS, because it a) realises it has lost some part of the overall store, b) has to deal with ES reallocating (i.e. lots of IO), and c) is rebalancing itself at the same time.
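To put a rough number on point b), here is a back-of-envelope sketch in Python. It assumes shards are spread evenly, that at least one replica exists so a surviving copy can be read, and that every byte re-replicated is read once from the shared FS by the source node and written once to it by the target node. The function name and the 5 TB / 1 replica / 10 node figures are hypothetical, not measurements from any real cluster, and the FS's own rebalancing in point c) is ignored.

```python
def recovery_io_tb(primary_tb, replicas, nodes):
    """Estimate shared-FS IO (in TB) caused by re-replicating one lost node.

    Assumes evenly spread shards and replicas >= 1 (hypothetical model).
    """
    stored_tb = primary_tb * (1 + replicas)   # primaries plus all replica copies
    per_node_tb = stored_tb / nodes           # data held by the node that was lost
    return {
        "per_node_tb": per_node_tb,
        "fs_read_tb": per_node_tb,            # surviving copies read from the FS
        "fs_write_tb": per_node_tb,           # new copies written back to the FS
        "fs_total_tb": 2 * per_node_tb,
    }


# Hypothetical figures: 5 TB of primary data, 1 replica, 10 nodes.
est = recovery_io_tb(primary_tb=5.0, replicas=1, nodes=10)
print(est)
```

With local disks, the read and the write would each hit a different machine's own storage. On a shared FS both cross the network to the same storage layer, in addition to the node-to-node shard copy itself.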

Christian_Dahlqvist · February 2, 2016, 6:49am

You may want to consider or test shadow replica indices if you encounter problems with your storage. It is, however, worth noting that this feature is marked as experimental.
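As a minimal sketch of what that could look like, the Python below only builds the index-creation settings body. It does not talk to a cluster. The setting names `index.data_path` and `index.shadow_replicas` are the ones I recall from the shadow replica documentation of that era, so please check them against the docs for your version. The helper name, the `/gpfs/es/my_index` path and the shard counts are hypothetical. The nodes would also need their shared data path configured, which is not shown here.

```python
import json


def shadow_replica_index_body(data_path, shards=5, replicas=1):
    """Build a create-index settings body for a shadow replica index.

    data_path is a directory on the shared file system (hypothetical here).
    """
    return {
        "index": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
            "data_path": data_path,      # shard data lives on the shared FS
            "shadow_replicas": True,     # replicas read the primary's files
        }
    }


# Hypothetical path on the GPFS mount; adjust to your environment.
body = shadow_replica_index_body("/gpfs/es/my_index")
print(json.dumps(body, indent=2))
```

The idea is that replicas reuse the primary's files on the shared FS instead of each writing their own copy, which is why it may reduce the IO load in a setup like this one.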
