Hi Otis,
here is my cluster/index growth strategy. I also do not like
oversharding. From my understanding, "oversharding" is when there are
many more shards in an index than the total number of CPU cores can
handle. Assuming the final size of an index or the final number of
nodes is known, a maximum number of shards can be used at index
creation from the very beginning, which easily leads to oversharding
if the number of nodes is not yet sufficient. The advantage is that
growing the cluster is easy - it comes down to just adding a node. The
disadvantage is that the full power of the cluster is only reached
after the last node has been added.
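A minimal sketch of that approach, assuming a local node and an
invented index name, using the create index API via Python's requests
library:

    import requests

    # "Overshard" up front: create the index once with the shard count
    # the final cluster is expected to need; growth is then just adding
    # nodes and letting shards relocate.
    requests.put("http://localhost:9200/bigindex", json={
        "settings": {"number_of_shards": 48, "number_of_replicas": 1}
    })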
In contrast, I tried to find a reasonable number of shards at
production start and accepted a reasonable number of future
migrations. When I grow the cluster, I try to double the capacity of
the whole cluster with each migration step, so future migrations
become less frequent (assuming linear growth of data, the time between
migrations doubles with each step). The advantage is that the cluster
almost always runs at full efficiency. The disadvantage is that
migrations must include index copies or re-indexing. (It reminds me of
resizing a hash table by copying all the entries.)
Development phase: single server (node), 1 to 5 shards/index for making
rough sizing/workload decisions
Disk space: around 1 TB per server, neglected here
Decision: 3 nodes (24 CPU cores per server), 1 index, 12 shards, 1 replica
Workload balance formula for production start, "more cores than
shards" (my rule of thumb): total of 3*24=72 CPU cores > total of
2*12=24 shards
Growth of factor x: 3*x nodes, (2*12)*x shards
Workload balance formula for factor 2: 6 nodes, total of 6*24=144 CPU
cores > total of 2*24=48 shards
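Put as a small Python helper (the function is only for illustration),
the same arithmetic looks like this:

    def shards_fit(nodes, cores_per_node, primary_shards, replicas):
        # rule of thumb: total CPU cores should exceed total shards
        # (primaries plus replica copies)
        total_cores = nodes * cores_per_node
        total_shards = primary_shards * (1 + replicas)
        return total_cores, total_shards, total_cores > total_shards

    print(shards_fit(3, 24, 12, 1))   # (72, 24, True)  - production start
    print(shards_fit(6, 24, 24, 1))   # (144, 48, True) - after growth by factor 2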
Based on performance metrics, it can be viable to assign more or fewer
shards per CPU core.
Under higher query load, a higher replica level can make sense.
Production start: 3 nodes, 1 index, 12 shards, 1 replica
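As a sketch (local node, invented index and alias names), the
production start could look like this, with an alias in place so that
a later migration only has to re-point the alias:

    import requests

    ES = "http://localhost:9200"

    # 1 index, 12 shards, 1 replica
    requests.put(ES + "/myindex_v1", json={
        "settings": {"number_of_shards": 12, "number_of_replicas": 1}
    })
    # alias that clients use, so a later migration is transparent to them
    requests.post(ES + "/_aliases", json={
        "actions": [{"add": {"index": "myindex_v1", "alias": "myindex"}}]
    })

    # if query load grows, the replica level can be raised on the live index
    requests.put(ES + "/myindex_v1/_settings", json={
        "index": {"number_of_replicas": 2}
    })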
Migration step for cluster growth (a rough sketch in code follows this
list):
- add new nodes to the cluster, existing shards will relocate
- create a new index with n shards (e.g. 2*24=48 or 64 or 72 shards,
depending on measured workload, but fewer than the total CPU cores)
- re-index (or copy the old index _source over to the new index)
- switch the index alias to the new index
- optional step: detach old/obsolete nodes with
cluster.routing.allocation.exclude._ip
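A rough sketch of these steps against the REST API (Python with
requests; index/alias names and node IPs are invented, and the exact
scroll/bulk calls differ a bit between Elasticsearch versions - newer
releases can also do the copy step server-side with the _reindex API):

    import json
    import requests

    ES = "http://localhost:9200"

    # create the new, larger index (e.g. 48 shards, 1 replica)
    requests.put(ES + "/myindex_v2", json={
        "settings": {"number_of_shards": 48, "number_of_replicas": 1}
    })

    # copy the _source of all documents from the old index via scroll + bulk
    resp = requests.post(ES + "/myindex_v1/_search?scroll=5m",
                         json={"size": 500, "query": {"match_all": {}}}).json()
    while resp["hits"]["hits"]:
        lines = []
        for hit in resp["hits"]["hits"]:
            lines.append(json.dumps({"index": {"_index": "myindex_v2", "_id": hit["_id"]}}))
            lines.append(json.dumps(hit["_source"]))
        requests.post(ES + "/_bulk", data="\n".join(lines) + "\n",
                      headers={"Content-Type": "application/x-ndjson"})
        resp = requests.post(ES + "/_search/scroll",
                             json={"scroll": "5m", "scroll_id": resp["_scroll_id"]}).json()

    # switch the alias atomically from the old index to the new one
    requests.post(ES + "/_aliases", json={
        "actions": [
            {"remove": {"index": "myindex_v1", "alias": "myindex"}},
            {"add":    {"index": "myindex_v2", "alias": "myindex"}}
        ]
    })

    # optionally drain the obsolete nodes before shutting them down
    requests.put(ES + "/_cluster/settings", json={
        "transient": {"cluster.routing.allocation.exclude._ip": "10.1.1.1,10.1.1.2"}
    })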
If fast index recovery is critical, there may be additional constraints.
For example, Lucene index size on disk per shard should not exceed x GB.
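With invented numbers, the shard count can then be derived from the
expected index size instead of from the core count:

    import math

    def shards_for_size(index_size_gb, max_shard_size_gb):
        # enough shards so that no shard exceeds the recovery-friendly size limit
        return math.ceil(index_size_gb / max_shard_size_gb)

    print(shards_for_size(600, 30))   # 20 primary shards for ~600 GB at <= 30 GB/shard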
Best regards,
Jörg
On 13.02.13 04:45, Otis Gospodnetic wrote:
Hello,
Are there any known and good alternatives to handling index (and
cluster) growth other than "Oversharding"?
I see a few problems with oversharding the index:
- You can't really guess well how your index/cluster will grow, so
you'll always be somewhat wrong even if your cluster doesn't really
grow much or at all.
- If your cluster keeps growing, then you'll have the ideal number of
shards only at one point in time, when the cluster is of just the right
size/fit for the number of shards, and you'll have the "wrong" number
of shards both before and after that point.
- While your cluster is small, this oversharded index means each node
will have a possibly high number of shards. If queries are such that
all shards are queried in parallel, and there are more shards on a node
than CPU cores, there'll be some CPU wait time involved.
I assume the ultimate solution would involve resharding the index
while adding more nodes to the cluster. Is this correct?
As far as I know, there are no plans to implement this any time soon.
Is this correct? I couldn't find any issues...
Finally, are there any viable alternatives to oversharding today?
Thanks,
Otis