Please provide suggestions on this


(banupriya) #1
How will data be distributed when we have a single index (1 shard and 1 replica) stored on a 2-node cluster? The primary will be stored on one node and the replica on the other. What will happen when disk space reaches its maximum capacity and we add a new node to avoid failure in this scenario? Will new data for the index be loaded onto the newly added node, or can it not be loaded? If it can be loaded onto the new node, will it not impact the relevance score?

Can we move one index to another node if we have 2 indices in the above scenario?
During the initial data load, if we have 10 indices, is there any way to control which specific node a primary index is assigned to among multiple nodes?


(Mark Walkom) #2

Please format your question a little better, you have a single massive line that people need to scroll through just to see what you are asking.


(banupriya) #3

How will data be distributed when we have a single index (1 shard and 1 replica) stored on a 2-node cluster?
The primary will be stored on one node and the replica on the other.
What will happen when disk space reaches its maximum capacity and we add a new node to avoid failure in this scenario? Will new data for the index be loaded onto the newly added node, or can it not be loaded? If it can be loaded onto the new node, will it not impact the relevance score?


(Mark Walkom) #4

Yes.

Adding another node after a node gets to 100% disk use is no good. Do it before.


(banupriya) #5

Will it not impact the relevance score?


(Mark Walkom) #6

Scoring is calculated against docs in the same shard. So no.


(banupriya) #7

Can we move one index to another node if we have 2 indices in the above scenario?


(Mark Walkom) #8

Maybe. You'd really need to test, because ES generally doesn't like full disks.


(banupriya) #9

During the initial data load, if we have 10 indices, is there any way to control which specific node a primary index is assigned to among multiple nodes?


(Mark Walkom) #10

Nope. It shouldn't matter.


(banupriya) #11

Then how do we handle it when the maximum disk space is reached?


(Mark Walkom) #12

Prevention is better than cure in this case.
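
As a rough illustration of what prevention can look like: recent Elasticsearch versions have disk-based allocation watermarks (`cluster.routing.allocation.disk.watermark.low`, `.high`, and `.flood_stage`), which default to 85%, 90%, and 95% disk usage. Below is a minimal Python sketch that classifies a node's disk usage against those defaults. The function name and the sample numbers are hypothetical, and it assumes your cluster has not overridden the thresholds.

```python
# Default disk watermarks in recent Elasticsearch versions (percent of disk used).
# Assumption: the cluster settings have not been changed from these defaults.
WATERMARKS = [
    ("flood_stage", 95.0),
    ("high", 90.0),
    ("low", 85.0),
]


def disk_status(used_bytes, total_bytes):
    """Return the name of the highest watermark crossed, or "ok" if none."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    used_pct = 100.0 * used_bytes / total_bytes
    # Thresholds are ordered from highest to lowest, so the first hit wins.
    for name, threshold in WATERMARKS:
        if used_pct >= threshold:
            return name
    return "ok"


if __name__ == "__main__":
    gib = 1024 ** 3
    for used in (400, 430, 460, 480):
        print(used, "GiB of 500 GiB used ->", disk_status(used * gib, 500 * gib))
```

Once a node passes the low watermark, Elasticsearch stops allocating new shards to it; past the high watermark it tries to relocate shards away; at flood stage, indices with a shard on that node get a read-only block. Alerting well before the low watermark leaves time to add capacity.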


(banupriya) #13

Hey Mark, I need a more detailed description of this scenario. I have 1 shard, 1 replica, and 2 nodes. If I am indexing millions of documents, what happens when the disk reaches its maximum capacity? Can we increase the node size or the number of shards, and which of the two is recommended?


(Christian Dahlqvist) #14

If you have a single index with just 1 shard and 1 replica, there is little you can do to redistribute data if the disk gets full, as shards cannot be split. You can, however, provision nodes with more disk space and move the shard over there in order to scale up, but you will not be able to scale horizontally. If you had more than 1 shard, you could add nodes to the cluster and scale out horizontally, as Elasticsearch would relocate some of the shards from the full node to the new nodes and free up space.
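
To make the "move the shard over" step concrete, here is a minimal sketch in Python that builds the request body for the cluster reroute API (`POST /_cluster/reroute`) with a `move` command. It only constructs and prints the JSON and sends nothing. The helper name and the node names `small-node` and `big-node` are hypothetical; `testindex` is just a placeholder index name.

```python
import json


def move_shard_command(index, shard, from_node, to_node):
    """Build a reroute body that relocates one shard copy between two nodes."""
    return {
        "commands": [
            {
                "move": {
                    "index": index,
                    "shard": shard,
                    "from_node": from_node,
                    "to_node": to_node,
                }
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical names: shard 0 of "testindex" moves to a node with more disk.
    body = move_shard_command("testindex", 0, "small-node", "big-node")
    print("POST /_cluster/reroute")
    print(json.dumps(body, indent=2))
```

The move only succeeds if the target node passes the allocation deciders, so the larger node has to have joined the cluster and be below its disk watermarks first.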


(banupriya) #15

Thanks, Christian, for your suggestion. I do have another query: during the initial data load, if we have 10 indices, is there any way to control which specific node a primary index is assigned to among multiple nodes?


(Christian Dahlqvist) #16

I assume you are referring to primary shards as there are no primary indices, and if this is the case the answer is no, as this is not possible. Primaries and replicas however do the same amount of work, so you should not need to be concerned with this.


(banupriya) #17

If I have 5 nodes, the 5th is named "testnode", and the index is named "testindex", is it possible to assign "testindex" to "testnode"?


(Christian Dahlqvist) #18

Yes, that should be possible through shard allocation filtering.
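
A minimal sketch of one way to do this, assuming index-level shard allocation filtering via the `index.routing.allocation.require._name` setting. The Python below only builds the path and body for the update index settings request and sends nothing. The helper name is hypothetical; `testindex` and `testnode` come from the question above.

```python
import json


def pin_index_to_node(index, node_name):
    """Return (path, body) for a settings update that requires a named node."""
    path = "/{}/_settings".format(index)
    # "_name" matches on the node name; other built-in attributes exist as well.
    body = {"index.routing.allocation.require._name": node_name}
    return path, body


if __name__ == "__main__":
    path, body = pin_index_to_node("testindex", "testnode")
    print("PUT", path)
    print(json.dumps(body, indent=2))
```

Note that a primary and its replica are never placed on the same node, so pinning an index with replicas to a single node would leave the replica shards unassigned.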


(banupriya) #19

Thanks for your suggestion, Christian :slight_smile:


(banupriya) #20

If we have 2 shards on one node, will the data be split between the two shards?