I read in one of the posts that splitting up the index into more shards,
generally improves indexing performance. Is this because all shards of the
index are read at the same time simultaneously when doing a search?
#2, So if I have 4 nodes (node1, 2 3 and 4) and 1 index with 4 shards, 0
replication, if I do a search against node1 with data that resides on node
4, how will this work? or does the shard on a box, only contain the data it
has locally.
Sorry if this doesn't make sense. New to elasticsearch.
If you have 4 nodes, and 4 shards no replicas, then you will have a shard
allocated per node. When you execute a search, it will be broadcast to all
shards, execute on each one ("locally"), and then aggregate the results.
Several shards improve indexing performance if you have enough machines to
have the shards allocated on them. This is simply because indexing
operation, different than search, just goes to one shard, so the indexing
process will end up split across more nodes.
Note, with routing, you can actually control even for search on which shards
a search will execute on.
I read in one of the posts that splitting up the index into more shards,
generally improves indexing performance. Is this because all shards of the
index are read at the same time simultaneously when doing a search?
#2, So if I have 4 nodes (node1, 2 3 and 4) and 1 index with 4 shards, 0
replication, if I do a search against node1 with data that resides on node
4, how will this work? or does the shard on a box, only contain the data it
has locally.
Sorry if this doesn't make sense. New to elasticsearch.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.