Different shards land in same path.data directory

shadyabhi · April 13, 2016, 4:47pm

As I understand from the docs, all data for a single shard goes to the same path.data directory.

So, let's suppose my path.data looks like ["/disk1", "/disk2"] and I want maximum indexing throughput. To achieve that, what I would ideally do is assign 2 shards per node and expect ES to create these two shards on different path.data directories. However, I'm observing that both shards are created on "/disk1".

Why is that? Should this change? Is this an optimization that we can do in future versions?

Thanks in advance!

anhlqn · April 13, 2016, 6:30pm

Do the two paths have the same amount of free disk space? If not, then ES allocates shards to the one with more free space first.

How big is each shard? If it's too small, ES may ignore it.
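
To illustrate the free-space point, here is a minimal Python sketch of a "most free space wins" rule. It is only a simplified model of the behavior described above, not Elasticsearch's actual allocation code; the function names `pick_path` and `place_shards`, the disk sizes, and the idea of subtracting each shard's size as it is placed are all assumptions made for the example.

```python
def pick_path(free_bytes):
    """Return the data path with the most free space (ties go to the first listed)."""
    return max(free_bytes, key=free_bytes.get)


def place_shards(free_bytes, shard_sizes):
    """Place each shard wholly on one path, reducing that path's free space as we go."""
    remaining = dict(free_bytes)  # copy so the caller's numbers are not modified
    placement = {}
    for shard, size in shard_sizes.items():
        path = pick_path(remaining)
        placement[shard] = path
        remaining[path] -= size
    return placement


# Hypothetical numbers: /disk1 has 500 GB free, /disk2 has 400 GB free.
free = {"/disk1": 500 * 10**9, "/disk2": 400 * 10**9}

# Two 1 GB shards: /disk1 still has more free space after the first one.
small = place_shards(free, {"shard0": 10**9, "shard1": 10**9})

# Two 200 GB shards: the first one tips the balance toward /disk2.
large = place_shards(free, {"shard0": 200 * 10**9, "shard1": 200 * 10**9})

print(small)
print(large)
```

Under this model, small shards barely change the free-space ranking, so both end up on the same path, which would match what the original post observed. Only shards large enough to flip the ranking get spread across disks.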

warkolm · April 13, 2016, 9:59pm

It doesn't do it to that level. It just makes sure a shard, irrespective of the index it belongs to, is wholly on a path.

It's an interesting idea though. Why not raise a feature request for it on GitHub?
