Elasticsearch support for Distributed FS like Rook-ceph

chaitra_hegde · September 13, 2019, 10:25am

Hi,
Does the Elasticsearch architecture support a distributed file system such as rook-ceph for storage?

DavidTurner · September 13, 2019, 12:02pm

No, that's not something that is supported or tested.

chaitra_hegde · September 19, 2019, 12:23pm

Hi,
Is there any particular reason why distributed storage is not supported?
Is there any plan to support it?

DavidTurner · September 19, 2019, 3:30pm

It seems somewhat unnecessary since Elasticsearch already has distributed features built in. For instance, it natively scales out across multiple nodes, replicates its data, and automatically recovers from partial failures.
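As a rough illustration of the built-in replication mentioned above, the sketch below builds (but does not send) a request to the index settings API that asks for one replica copy of each shard. The host `http://localhost:9200` and the index name `my-index` are placeholders assumed for the example, and `build_replica_request` is a hypothetical helper, not part of any Elasticsearch client.

```python
import json
import urllib.request


def build_replica_request(base_url: str, index: str, replicas: int) -> urllib.request.Request:
    """Build a PUT <index>/_settings request that sets the replica count.

    base_url and index are placeholders chosen for this sketch.
    """
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_replica_request("http://localhost:9200", "my-index", 1)
    print(req.get_method(), req.full_url)
    # To actually apply the setting against a running cluster:
    # urllib.request.urlopen(req)
```

With one replica, each shard has a second copy on a different node, so losing a single node's local disk should not lose data.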

Remote storage also tends to have much higher latency than local disks, and this can have a big effect on performance.

Elasticsearch presents a particularly stressful workload to the filesystem and turns out to be rather good at hitting filesystem bugs and corner cases that other tests might have missed. This is true even of very mature and well-established filesystems. Distributed filesystems have seen much less production use and therefore seem a much riskier choice. See for instance this post with some links to recent glusterfs bugs:

Daniel_Penning · September 19, 2019, 3:56pm

If you instead intend to use Ceph object storage or CephFS, which are both built for shared access from many machines in parallel, you will get bad performance and also run a high risk of data corruption.

But with a traditional file system such as ext4, XFS or NTFS running on ceph-rbd, Elasticsearch should run without problems. This and similar setups are used by many virtual machine hosters. The important part is that access to the Ceph block device is exclusive to one machine and that it uses a file system that Elasticsearch has been tested with. We have had multiple Elasticsearch clusters running on a setup like that for several years now without problems.

chaitra_hegde · September 24, 2019, 11:52am

Hi,
I am using Elasticsearch in a Kubernetes environment with rook-ceph block storage, and I frequently run into data corruption.
The error log looks like this:

org.elasticsearch.bootstrap.StartupException: ElasticsearchException[java.io.IOException: failed to read [id:0, file:/data/data/nodes/0/_state/node-0.st]]; nested: IOException[failed to read [id:0, file:/data/data/nodes/0/_state/node-0.st]]; nested: CorruptStateException[org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=-637534208 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(SimpleFSIndexInput(path="/data/data/nodes/0/_state/node-0.st")))]; nested: CorruptIndexException[codec footer mismatch (file truncated?): actual footer=-637534208 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(SimpleFSIndexInput(path="/data/data/nodes/0/_state/node-0.st")))];
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:140) ~[elasticsearch-6.5.4.jar:6.5.4]
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:127) ~[elasticsearch-6.5.4.jar:6.5.4]
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.5.4.jar:6.5.4]
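
For readers trying to interpret the "codec footer mismatch" message, here is a minimal sketch of the kind of check involved. It assumes the usual Lucene footer layout (a 16-byte trailer: a 4-byte big-endian magic equal to the bitwise complement of the codec magic `0x3FD76C17`, a 4-byte algorithm id of 0, and an 8-byte CRC32 of everything before the checksum). That layout is an assumption of this sketch and should be verified against the Lucene version in use. The function names are hypothetical. The complemented magic does come out as -1071082520, the "expected footer" value in the log above.

```python
import struct
import zlib

CODEC_MAGIC = 0x3FD76C17        # assumed Lucene header magic
FOOTER_MAGIC = ~CODEC_MAGIC     # as a signed 32-bit int: -1071082520
FOOTER_LENGTH = 16              # magic (4) + algorithm id (4) + checksum (8)


def add_footer(payload: bytes) -> bytes:
    """Append a footer in the assumed layout (only used to build test data)."""
    body = payload + struct.pack(">ii", FOOTER_MAGIC, 0)
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return body + struct.pack(">q", crc)


def check_footer(data: bytes) -> str:
    """Return "ok" or a short description of what looks wrong."""
    if len(data) < FOOTER_LENGTH:
        return "file too short to contain a footer"
    magic, algorithm_id, stored_crc = struct.unpack(">iiq", data[-FOOTER_LENGTH:])
    if magic != FOOTER_MAGIC:
        # A truncated or partially written file usually fails here,
        # because the last 16 bytes are no longer the real footer.
        return (f"codec footer mismatch: actual footer={magic} "
                f"vs expected footer={FOOTER_MAGIC}")
    if algorithm_id != 0:
        return f"unknown checksum algorithm id {algorithm_id}"
    actual_crc = zlib.crc32(data[:-8]) & 0xFFFFFFFF
    if actual_crc != stored_crc:
        return "checksum mismatch"
    return "ok"


if __name__ == "__main__":
    good = add_footer(b"node state")
    print(check_footer(good))        # intact file
    print(check_footer(good[:-3]))   # file that lost its last 3 bytes
```

In other words, the message says that the last bytes of `node-0.st` are not what a completely written file would end with, which is consistent with the "file truncated?" hint in the exception.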

DavidTurner · September 24, 2019, 12:13pm

Right, @chaitra_hegde, that's what I mean. This isn't a supported or tested configuration and this sort of problem isn't really surprising to me. The solution is not to use a distributed filesystem.

