How is data distributed when a single index (1 shard and 1 replica) is stored on a 2-node cluster? As I understand it, the primary is stored on one node and the replica on the other. What happens when the disk reaches its maximum capacity and we add a new node to avoid failure? Will new data for the index be written to the newly added node, or is that not possible? If it can be written to the new node, will that affect the relevance score?
If we have 2 indices in the scenario above, can we move one index to another node?
During the initial data load, if we have 10 indices, is there any way to control which specific node a primary index is assigned to?
Hey Mark, I need a more detailed description of this scenario. I have 1 shard, 1 replica and 2 nodes. If I am indexing millions of documents and the disk reaches its maximum capacity, should we increase the node size or the shard size? Which is recommended?
If you have a single index with just 1 shard and 1 replica, there is little you can do to redistribute data if the disk gets full, as a shard cannot be split across nodes. You can, however, provision nodes with more disk space and move the shard over there in order to scale up, but you will not be able to scale horizontally. If you had more than 1 shard, you could add nodes to the cluster and scale out horizontally, as Elasticsearch would relocate some of the shards from the full node to the new one and free up space.
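As a rough illustration of moving a shard to a node with more disk space, the sketch below builds the request body for a `move` command of the cluster reroute API (`POST /_cluster/reroute`). The index name `my-index` and the node names `node-1` and `node-3` are placeholders I made up, and the helper function is hypothetical. It only prints the JSON and does not contact a cluster.

```python
import json


def build_move_command(index, shard, from_node, to_node):
    """Build a body for POST /_cluster/reroute that moves one shard copy.

    All argument values are supplied by the caller; nothing is looked up
    from a real cluster.
    """
    return {
        "commands": [
            {
                "move": {
                    "index": index,
                    "shard": shard,
                    "from_node": from_node,
                    "to_node": to_node,
                }
            }
        ]
    }


# Hypothetical names: shard 0 of "my-index" goes from the full node
# "node-1" to the newly added, larger node "node-3".
body = build_move_command("my-index", 0, "node-1", "node-3")
print(json.dumps(body, indent=2))
```

Note that Elasticsearch will not place a shard copy on a node that already holds another copy of the same shard, so the target should be a node that holds neither the primary nor the replica.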
Thanks, Christian, for your suggestion. I have another question: during the initial data load, if we have 10 indices, is there any way to control which specific node a primary index is assigned to?
I assume you are referring to primary shards, as there are no primary indices. If so, the answer is no: this is not possible. Primaries and replicas do the same amount of work, however, so you should not need to be concerned with this.
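For what it is worth, while primaries cannot be pinned, index-level shard allocation filtering can restrict which nodes hold an index's shards, primaries and replicas alike. The sketch below builds a settings body for `PUT /<index>/_settings` using the `index.routing.allocation.include._name` setting. The node names and the helper function are hypothetical, and it only prints the JSON rather than calling a cluster.

```python
import json


def build_allocation_filter(node_names):
    """Build a settings body that limits an index's shards to the named nodes.

    The filter applies to every shard copy of the index; it does not
    distinguish primaries from replicas.
    """
    return {"index.routing.allocation.include._name": ",".join(node_names)}


# Hypothetical node names. At least two nodes are listed so that both the
# primary and its replica have somewhere to go.
settings = build_allocation_filter(["node-1", "node-2"])
print(json.dumps(settings, indent=2))
```

If only one node is listed for an index with 1 replica, the replica would have nowhere to be allocated, since a primary and its replica are never placed on the same node.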