Optimizing shard placement for indexing

Good day, everyone!

I'm currently working on a tool that would redistribute the indices in our cluster so that indexing load is balanced more evenly across nodes, inspired by this article (Optimal Shard Placement in a Petabyte Scale Elasticsearch Cluster - Meltwater Engineering Blog).

I have quite a few indices, and the write load varies a lot between them, so this approach seems like a good fit for my case.

The problem I'm seeing is that the two write-load markers I'm observing don't necessarily track each other. The two markers are the increase in thread_pool.write.completed_tasks and the increase in the sum of indexing.index_time_in_millis over all the shards allocated on the node in question.
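To make the comparison concrete, here is a minimal sketch of how the two markers described above could be computed per node from two snapshots taken some time apart. The snapshot layout (a dict keyed by node name with `completed_tasks` and `index_time_in_millis` counters) and the function name `write_load_deltas` are assumptions for illustration, not something Elasticsearch returns in this shape; the counters would have to be collected separately, with the index time already summed over the shards on each node.

```python
from typing import Dict

# Hypothetical snapshot shape: node name -> cumulative counters.
Snapshot = Dict[str, Dict[str, int]]


def write_load_deltas(before: Snapshot, after: Snapshot) -> Dict[str, Dict[str, float]]:
    """Return the per-node increase of both markers between two snapshots."""
    result: Dict[str, Dict[str, float]] = {}
    for node, now in after.items():
        prev = before.get(node)
        if prev is None:
            continue  # node was not present in the first snapshot
        tasks = now["completed_tasks"] - prev["completed_tasks"]
        millis = now["index_time_in_millis"] - prev["index_time_in_millis"]
        if tasks < 0 or millis < 0:
            # A counter went backwards, e.g. the node restarted or shards
            # moved off it between snapshots; the delta is not meaningful.
            continue
        result[node] = {
            "tasks": tasks,
            "index_millis": millis,
            # Average indexing time per completed write task; a large spread
            # of this ratio across nodes means the two markers disagree.
            "millis_per_task": millis / tasks if tasks else 0.0,
        }
    return result


# Made-up sample numbers, purely for illustration.
before = {
    "node-a": {"completed_tasks": 1000, "index_time_in_millis": 50000},
    "node-b": {"completed_tasks": 4000, "index_time_in_millis": 60000},
}
after = {
    "node-a": {"completed_tasks": 1200, "index_time_in_millis": 90000},
    "node-b": {"completed_tasks": 5000, "index_time_in_millis": 70000},
}

deltas = write_load_deltas(before, after)
for node, d in sorted(deltas.items()):
    print(node, d)
```

In this made-up sample, node-b completes five times as many write tasks as node-a but spends a quarter of the indexing time, which is the kind of disagreement between the two markers I'm seeing.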

My question is: which of these two makes the better optimization target for my use case?

My broader question is: is it possible to understand what kind of tasks the write threads are executing? I understand from the docs that those are indexing and bulk tasks, but I'd like to learn more details about the bulk case. Does a bulk request produce a task for a write thread on the node the request hits? Do the individual actions in a bulk request produce additional tasks, or does the whole bulk request produce just one task?

Thank you in advance.
