Hello,
I have a question about sharding strategy and performance:
When creating an index, the default is 1 primary shard and 1 replica.
When creating an ILM policy, the default rollover is 30 days / 50 GB.
So when working with the defaults, I'll have indices of 100 GB (primary + replica).
So far so good.
BUT: I saw in an article (sorry, I lost the link) that smaller shards (but not too many of them) give better performance.
My questions are (sorry for sending them all at once):
Isn't a 50 GB shard too large? (I know it's the recommended maximum, but why is it the default?)
If I split it into 2 indices with 25 GB shards, should queries run faster (in the average case)?
If I already have an alias over several indices (for example logs-alias: logs-00001, logs-00002, etc.), can I use the split API to split all of them? How should I do it technically, so that the alias still serves queries?
Will adding a few replicas give the same performance improvement (for queries, not inserts) as splitting the indices into more primary shards?
This depends on the use case, and possibly also on how much data you ingest and how long you keep it. In my experience it is much more common to have performance problems due to too many small shards than due to shards that are too large.
Each shard is queried in a single thread, so the size of the shard affects the minimum latency that can be achieved. Querying 2 shards of 25GB is likely faster than querying one 50GB shard, as two threads can be used. It is also possible that querying 20 25GB shards is faster than querying 10 50GB shards. At some point, however, querying a larger number of shards starts to show worse performance. Exactly when this occurs depends on a lot of factors, so you need to test what is right for your use case.
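To make that trade-off concrete, here is a toy model in Python. It is only a sketch: the function name, the cost per GB, and the fixed per-shard overhead are all made-up assumptions, not measurements or anything Elasticsearch reports. It only shows the shape of the curve (more shards help until overhead and thread limits take over).

```python
import math

def estimated_latency(total_gb, num_shards, search_threads,
                      secs_per_gb=0.1, per_shard_overhead=0.05):
    """Toy latency estimate for one query (hypothetical model, made-up constants).

    Assumes each shard is scanned by one thread, shards run in "waves" of
    at most `search_threads` at a time, and every shard adds a fixed
    coordination cost.
    """
    shard_gb = total_gb / num_shards
    # Number of rounds needed if only `search_threads` shards run at once.
    waves = math.ceil(num_shards / search_threads)
    scan_time = waves * shard_gb * secs_per_gb
    overhead = num_shards * per_shard_overhead
    return scan_time + overhead

if __name__ == "__main__":
    # Compare a few shard counts for 50 GB of data and 4 search threads.
    for shards in (1, 2, 4, 50, 200):
        print(shards, "shards ->", round(estimated_latency(50, shards, 4), 2), "s")
```

With these invented constants, 2 shards beat 1 and a few hundred tiny shards are worse than either, but the real crossover point has to come from testing on your own data and hardware.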
Adding replicas is generally done to add resiliency or increase the number of queries per second the cluster can handle. I would not expect it to affect latency much.
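On the split API question, a rough sketch of the sequence of REST calls per index might look like the following. It only builds the requests as data and sends nothing. The helper name `split_plan` and the `-split` target suffix are hypothetical. It assumes the source indices are no longer being written to (the split API requires a write block on the source, so this would not suit the current write index), and that the target shard count is a multiple of the source shard count.

```python
def split_plan(alias, indices, target_shards):
    """Build (method, path, body) tuples for splitting each index behind an alias.

    Hypothetical helper: it only describes the requests, it does not call
    Elasticsearch. Target index names use a made-up "-split" suffix.
    """
    plan = []
    for index in indices:
        target = f"{index}-split"
        # 1. The split API requires the source index to be blocked for writes.
        plan.append(("PUT", f"/{index}/_settings",
                     {"settings": {"index.blocks.write": True}}))
        # 2. Split into a new index with more primary shards.
        plan.append(("POST", f"/{index}/_split/{target}",
                     {"settings": {"index.number_of_shards": target_shards}}))
        # 3. Swap the alias in one atomic _aliases call so queries keep working.
        #    (Wait for the target index to become healthy first; not shown.)
        plan.append(("POST", "/_aliases", {"actions": [
            {"add": {"index": target, "alias": alias}},
            {"remove": {"index": index, "alias": alias}},
        ]}))
    return plan

if __name__ == "__main__":
    for method, path, body in split_plan("logs-alias", ["logs-00001", "logs-00002"], 2):
        print(method, path, body)
```

Because the add and remove actions go in a single `_aliases` request, the alias never points at both copies or at neither. The old index can be deleted once you have checked the new one.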