Good day, everyone!
I'm currently working on a tool that would redistribute indices across our cluster to balance indexing load more evenly, inspired by this article (Optimal Shard Placement in a Petabyte Scale Elasticsearch Cluster - Meltwater Engineering Blog).
I have quite a few indices with very uneven write load between them, so this approach seems to make sense for my case.
The problem I'm seeing is that the two write-load markers I'm observing don't necessarily track each other. The two markers are the increase in thread_pool.write.completed_tasks and the increase in the sum of indexing.index_time_in_millis over all the shards allocated on the node in question.
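To make the comparison concrete, here is a rough sketch of how the two markers could be derived from two node-stats snapshots taken some time apart. It assumes the snapshots are already parsed into dicts shaped like a `GET _nodes/stats` response, and the field paths (`thread_pool.write.completed` and the node-level `indices.indexing.index_time_in_millis`) are my assumption of where those counters live. The function name and the sample numbers are made up for illustration.

```python
from typing import Dict, Tuple


def write_load_deltas(before: dict, after: dict) -> Dict[str, Tuple[int, int]]:
    """Return {node_id: (completed_tasks_delta, index_time_ms_delta)}.

    Both arguments are assumed to look like a parsed node-stats response:
    {"nodes": {node_id: {"thread_pool": {...}, "indices": {...}}}}.
    """
    deltas = {}
    for node_id, new in after["nodes"].items():
        old = before["nodes"].get(node_id)
        if old is None:
            continue  # node was not present in the first snapshot
        tasks = (new["thread_pool"]["write"]["completed"]
                 - old["thread_pool"]["write"]["completed"])
        time_ms = (new["indices"]["indexing"]["index_time_in_millis"]
                   - old["indices"]["indexing"]["index_time_in_millis"])
        deltas[node_id] = (tasks, time_ms)
    return deltas


# Made-up snapshots for a single hypothetical node.
snap_before = {"nodes": {"node-a": {
    "thread_pool": {"write": {"completed": 1000}},
    "indices": {"indexing": {"index_time_in_millis": 200000}},
}}}
snap_after = {"nodes": {"node-a": {
    "thread_pool": {"write": {"completed": 1500}},
    "indices": {"indexing": {"index_time_in_millis": 320000}},
}}}

for node, (tasks, time_ms) in write_load_deltas(snap_before, snap_after).items():
    # Indexing time per completed write task shows how far the markers diverge.
    ratio = time_ms / tasks if tasks else float("nan")
    print(f"{node}: tasks={tasks} index_time_ms={time_ms} ms_per_task={ratio:.1f}")
```

If the ms-per-task ratio differs a lot between nodes, that would be one way to see the two markers drifting apart.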
My question is: which of the two makes the better objective function for my use case?
My broader question is: is it possible to understand what kind of tasks the write threads are executing? I understand from the docs that those are indexing and bulk tasks, but I'd like to learn more details about bulk. Would a bulk request produce a task for a write thread on the node the request hits? Would each action in a bulk request produce its own task, or would the whole bulk request produce just one task?
Thank you in advance.