Shadow replicas indices

1222kkk · August 11, 2016, 7:12am

"If you would like to use a shared filesystem, you can use the shadow replicas settings to choose where on disk the data for an index should be kept, as well as how Elasticsearch should replay operations on all the replica shards of an index."

Hi,
if I want to store one index in a specific data path rather than the default path, are shadow replicas the only solution?

tanguy · August 11, 2016, 7:35am

Hi,

Shadow replicas allow this kind of configuration, but beware that this is an expert feature. Shadow replica shards look like normal replica shards, but under the hood there are some operations they just don't do, like replicating a document index operation; instead they rely on the shared filesystem to sync shard files.

Why would you like to store an index on a specific path? Note that you can allocate specific indices on nodes, and then have nodes configured to use different data paths locally.

1222kkk · August 11, 2016, 7:47am

Hi,
1. In fact, I want to store them on different disks. Important indices would be stored on the SSD, while the others would be stored on the HDD. So I want to specify a path on the SSD when the index is created.
2. What if I set path.shared_data in the configuration yml but don't set "shadow_replicas": true in the index settings?

Maybe like this:

    {
        "index" : {
            "number_of_shards" : 1,
            "number_of_replicas" : 4,
            "data_path": "/opt/data/my_index"
        }
    }

By setting this, can I avoid using shadow replicas and still use a specific path?

Sorry, but I really don't know how to allocate specific indices to nodes.

tanguy · August 11, 2016, 8:25am

Elasticsearch does not allow you to configure different data paths for indices. Only shadow replicas allow this, but it means that you must use a shared filesystem, with all your nodes accessing the same SSD disk.

I think you should allocate your important indices to the nodes that have SSDs and allocate your less important indices to nodes that have spinning disks for example.

You can do this using the Shard Allocation Filtering feature: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/shard-allocation-filtering.html
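For the question above about how to allocate specific indices to nodes, here is a minimal sketch of what shard allocation filtering could look like. It assumes each node is tagged with a custom attribute in its elasticsearch.yml; the attribute name `disk_type`, its values, and the index names are made up for illustration and do not come from this thread. In the 2.x line the attribute is written as `node.disk_type: ssd` (later versions use `node.attr.disk_type`). The script only builds and prints the settings bodies; it does not talk to a cluster.

```python
import json

# Assumption: each node's elasticsearch.yml carries a custom attribute, e.g.
#   node.disk_type: ssd   (on nodes with SSDs)
#   node.disk_type: hdd   (on nodes with spinning disks)
# "disk_type" is a hypothetical attribute name chosen for this sketch.


def allocation_settings(disk_type):
    """Build an index settings body that requires nodes tagged with disk_type."""
    return {"index.routing.allocation.require.disk_type": disk_type}


# Hypothetical index names mapped to the disk class they should live on.
plan = {"important_index": "ssd", "archive_index": "hdd"}

for index_name, disk_type in plan.items():
    body = allocation_settings(disk_type)
    # Each body would be sent to the cluster as: PUT /<index_name>/_settings
    print("PUT /%s/_settings" % index_name)
    print(json.dumps(body, indent=2))
```

With this approach the index data still lives under each node's own `path.data`; what changes is which nodes are allowed to hold the shards, so the SSD nodes would need `path.data` pointing at their SSD.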

1222kkk · August 11, 2016, 8:49am

Thank you, I got it. I just want to reach a compromise.
In fact, I have tried setting this on 3 nodes.
node0:

    path.data: /opt/elastic/..
    path.shared_data: /disk1/..

node1:

    path.data: /opt/elastic/..
    path.shared_data: /disk2/..

node2:

    path.data: /opt/elastic/..
    path.shared_data: /disk1/...

Then I create a new index

    POST /test
    {
        "index" : {
            "number_of_shards" : 1,
            "number_of_replicas" : 4,
            "data_path": "/1"
        }
    }

I did not set "shadow_replicas": true, so it defaults to false. I think the shared filesystem path can be used even when shadow replicas are not.

Then I found that the data of the index is allocated under the "path.shared_data" of these 3 nodes, like a separate data storage system alongside the default path.
I just wonder whether the replicas of the index "test" will behave differently from those of other indices stored under "path.data".

"Shard Allocation Filtering" may not help me customize the data path on a single node; it only controls where the shards go.

tanguy · August 11, 2016, 10:08am

OK, sorry, it looks like I'm wrong - one can set a custom data_path for indices without using shadow replicas.

According to https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-shadow-replicas.html:

index.data_path (string)

Path to use for the index’s data. Note that by default Elasticsearch will
append the node ordinal by default to the path to ensure multiple instances
of Elasticsearch on the same machine do not share a data directory.

So in your case the "test" index will use a custom data path under the shared data path; each node will use the same shared filesystem but will append the node ordinal to the path.

Yes
