Hi, Ill try again, reformulating my question a bit since I didnt get a
reply on my last try
We have java application with embedded elasticsearch in a cluster of 14
nodes. All the data resides in a central database, and they are indexed in
elasticsearch for querying. A full reindex can be done at any time.
The system are very query-heavy, the amount of writes are small. The number
of documents will not be higher than, say, 300.000.
The size of each document varies greatly, from just a couple of ids, to
extracted text from e.g word-documents of several pages.
I want to make sure that in case of a total breakdown, it should be
sufficient that one or two nodes are available for the system to work.
Write consistency should not be a problem since the master copy of the data
is in the database, and it seems that ES is capable of resolving
conflicting data by using the newest version (which should be all right in
our case)
My first though is to use 1 shard, and 13 replicas. This will naturally
ensure that all nodes have access to all data. This could also be
accomplished by having 2 shards / 13 replicas, so this yield that to ensure
that all data is available, the number of replicas should be the number of
nodes - 1, not depending on the number of shards (which could be anything).
If the requirement of number of nodes are reduced to "2 nodes should be up
at any time", then a shards / replica distribution of "x/number of nodes -
2" should be sufficient.
So, for the questions:
- Are my thoughts about the write consistency not being a problem correct,
implying that ES will be able to resolve conflicts by using the newest
version? - Are my thoughts about the availability and number of replicas correct?
- Given the nature of the application and the number of documents, are
there any recommendation regarding the number of shards? Is it ok to run
with 1 shard / 13 replicas?
best regards
mvh
Runar Myklebust
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.