We have a case where we index documents into time-based indices following a pattern like events-2016.10.13.
We have 10 data nodes and 30 shards with 2 replicas, and we use a routing key both to index and to query documents.
I noticed odd behavior where the same shard number is always allocated to the same node. For example, for today's index, shard number 1 is allocated to node number 1, and the same is true for yesterday's index, the one before it, and so on.
Is this normal behavior? Since we use a routing key, the same documents are always indexed to the same shard number (according to the docs, the calculation should always yield the same shard number for each routing key). Combined with the behavior above, this means that when we run a query with a certain routing key over the last 30 days, all the work is performed by a single node rather than spread across the entire cluster. The same goes for indexing and so forth.
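To illustrate what I mean by the calculation, here is a minimal Python sketch of the documented hash-then-modulo step (zlib.crc32 just stands in for the Murmur3 hash that ES actually uses internally; the client id is a placeholder):

```python
import zlib

def shard_for_routing(routing_key, num_primary_shards):
    # Illustration only: ES hashes the _routing value (with Murmur3) and takes
    # it modulo the number of primary shards; zlib.crc32 is a stand-in hash.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards

# The same client id maps to the same shard number in every 30-shard daily index.
for day in ("2016.10.11", "2016.10.12", "2016.10.13"):
    print("events-" + day, "-> shard", shard_for_routing("client-42", 30))
```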
I also noticed that if we restart the nodes, the shards are spread across the nodes in a different way.
Is there a way to tell Elasticsearch NOT to place shard number 1 always on node 1?
Then why not just delete the indices?
That's the whole reason we suggest them: it's easier to delete an index than to run a delete-by-query or similar.
Not sure I follow. I have time-based indices, as many others do. I also use routing to save some resources and get an extra performance boost by storing all documents for a specific client in a single shard of each index.
I do delete old indices, I keep the last 30 days of indices available and delete older indices.
I use daily indices AND, within each index, I route by the client id. Is there a conflict between the two approaches?
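To make the setup concrete, here is a rough sketch of how we combine the two with the Python elasticsearch client (the endpoint, index/type names and client id are placeholders, and the exact parameter names vary a bit between client versions):

```python
from datetime import date
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

index_name = "events-{:%Y.%m.%d}".format(date.today())  # e.g. events-2016.10.13

# Index with the client id as the routing key, so all of a client's documents
# for that day land in a single shard of the daily index.
es.index(
    index=index_name,
    doc_type="event",                 # placeholder type name
    routing="client-42",              # placeholder client id
    body={"client_id": "client-42", "message": "example event"},
)

# Query the last 30 daily indices with the same routing key; only one shard
# per index has to be searched.
es.search(
    index="events-*",
    routing="client-42",
    body={"query": {"term": {"client_id": "client-42"}}},
)
```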
EDIT:
My original question was about shard allocation across different indices. I can see a pattern in the way ES distributes the shards of each index over the nodes: regardless of the index, shards 1, 2 and 3 will always be on node 1, for example, while shards 4, 5 and 6 will always be on node 2. Is this the intended behavior? Is there a way to change it? I'm asking because each client will also always land on the same shard number, due to the way ES calculates the routing of a document when deciding which shard to place it in. So if both claims are true, one node ends up doing all the heavy lifting for a specific client rather than having different nodes working together when I query multiple indices (say, the last 30 days).
The shard balancer does not take shard numbers into account when balancing shards across nodes. The reason shards 1 to 3 ended up on node 1 and shards 4 to 6 on node 2 is simply the order in which shards were allocated to the nodes (which is somewhat deterministic). However, if you move shard 1 to node 2 and shard 4 to node 1, ES won't move them back.
I agree that this use case is currently not well supported by the balancer. Either you balance the shards manually or you have to choose another approach. Also note that this is only an issue if the query load is not spread evenly among clients. If you assume a constant query load across all clients, the load will still be spread evenly among the nodes.
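If you go the manual balancing route, the cluster reroute API is the way to do it. A sketch with the Python client (endpoint, index and node names are placeholders; take the real ones from _cat/shards and _cat/nodes):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# Swap shard 1 and shard 4 of one daily index between two nodes. As noted above,
# ES won't move them back once they have been relocated.
es.cluster.reroute(body={
    "commands": [
        {"move": {"index": "events-2016.10.13", "shard": 1,
                  "from_node": "node-1", "to_node": "node-2"}},
        {"move": {"index": "events-2016.10.13", "shard": 4,
                  "from_node": "node-2", "to_node": "node-1"}},
    ]
})
```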
Thanks for the quick reply @ywelsch, I get what you're saying about the spread of the load, and I agree. In my specific use case it does have some effect, since I have a small stream of clients loading up a dashboard that executes a bunch of queries over different time spans with their own routing key, so that might slow me down a bit.
Is there a way, on index creation, to tell ES where to place each shard? How would you recommend we address this issue in our specific use case? If I cannot tell ES at index creation time to place shard X on node Y, should I simply reallocate shards randomly across the nodes as soon as possible after shard creation (so that the relocation process has a smaller impact on overall cluster performance)?
There is no way to tell ES where to place each shard on index creation (you would have to disable shard allocation altogether and manually allocate all shards - a big hassle).
The best solution in this case is not to use routing but to rely on term filters to select the right client. Term filters are crazy fast and, due to caching, the speed difference compared to custom routing might be negligible in this case. Try it out.
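Roughly like this, as a sketch with the Python client (endpoint and field name are placeholders): the same search without a routing parameter, filtering on the client id instead:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# No routing parameter: every shard is queried, but the term filter on the
# client id is cheap and its result can be cached.
es.search(
    index="events-*",
    body={
        "query": {
            "bool": {
                "filter": [
                    {"term": {"client_id": "client-42"}}  # placeholder field/value
                ]
            }
        }
    },
)
```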
We've tested term filters, but they were no match for routing; routing performed much better. As I understand it, routing also saves a lot of time merging the filtered documents from different shards (is that accurate?). Since we have 30 shards per index, that, combined with relatively small shard sizes and the fact that results from only 1 shard need to be collected, makes the difference, I guess.
As a workaround, you could try creating indices with a number of shards that is not a multiple of the number of nodes; the shards might then spread more evenly.
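For example, with 10 data nodes you could use 31 primary shards instead of 30. A sketch with the Python client (endpoint and index name are placeholders):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# 31 is not a multiple of the 10 data nodes, so consecutive shard numbers are
# less likely to line up on the same nodes in every daily index.
es.indices.create(
    index="events-2016.10.14",
    body={
        "settings": {
            "number_of_shards": 31,
            "number_of_replicas": 2,
        }
    },
)
```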