Does adding data nodes increase write throughput?

The question is: all other things being equal, and assuming shards are evenly distributed, does adding data nodes increase write throughput?

My assumption is yes. I'd imagine the new data nodes could take on some of the burden of flushing data, merging segments, and servicing bulk requests, but maybe I'm wrong.

Additionally, what about adding ingest nodes?

If you add more nodes and the shards that are being written to end up spread across more of them, then yes.

Operations like those only happen on nodes that hold the data (i.e. the shards) or that receive the bulk request.
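
To make that concrete, here is a rough sketch in Python, not anything from Elasticsearch itself. It assumes every shard gets an equal share of the writes and that shards are spread as evenly as possible; `shards_per_node` and `busiest_node_share` are made-up names for illustration.

```python
from typing import List


def shards_per_node(num_shards: int, num_nodes: int) -> List[int]:
    """Spread shards as evenly as possible across nodes."""
    base, extra = divmod(num_shards, num_nodes)
    return [base + 1 if i < extra else base for i in range(num_nodes)]


def busiest_node_share(num_shards: int, num_nodes: int) -> float:
    """Fraction of the write load on the most loaded node.

    Assumption: every shard receives an equal share of the writes.
    """
    return max(shards_per_node(num_shards, num_nodes)) / num_shards


if __name__ == "__main__":
    # Six actively written shards on a growing cluster.
    for nodes in (3, 6, 12):
        print(nodes, shards_per_node(6, nodes), busiest_node_share(6, nodes))
```

Under those assumptions, going from 3 to 6 nodes halves the load on the busiest node, but going from 6 to 12 changes nothing for this index, because the extra nodes hold none of its shards.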

On ingest nodes: this kinda relates to my first comment.
Yes, they help if the node getting the request from the client doesn't hold data for the required shard.
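
A simplified sketch of what the receiving node does with a bulk request, again in Python. The node names and the `SHARD_TO_NODE` layout are hypothetical, and CRC32 is only a stand-in for the real routing hash, so treat this as an illustration of the idea rather than how Elasticsearch is implemented.

```python
import zlib
from collections import defaultdict

# Hypothetical layout: which node holds each primary shard.
SHARD_TO_NODE = {0: "node-a", 1: "node-b", 2: "node-c"}


def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard from the document ID.

    CRC32 is a stand-in, not the hash Elasticsearch actually uses.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def split_bulk(doc_ids, receiving_node):
    """Group a bulk request by target node.

    Returns the documents the receiving node can index locally and a
    dict of the documents it has to forward to other nodes.
    """
    by_node = defaultdict(list)
    for doc_id in doc_ids:
        node = SHARD_TO_NODE[shard_for(doc_id, len(SHARD_TO_NODE))]
        by_node[node].append(doc_id)
    local = by_node.pop(receiving_node, [])
    return local, dict(by_node)


if __name__ == "__main__":
    docs = [f"doc-{i}" for i in range(10)]
    print(split_bulk(docs, "node-a"))    # a data node receives the request
    print(split_bulk(docs, "ingest-1"))  # a node holding no shards receives it
```

In this sketch a node that holds no shards never has local work: it only splits the request and forwards everything, so the indexing itself still lands on the data nodes.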

Ingest-only nodes might be worth looking at if you have a lot of pipelines.

I may not be following, but from what I'm hearing:

  • more data nodes will help spread out the work in general, as long as shards are evenly distributed
  • ingest nodes will also help spread out load, but only for servicing requests and running pipelines

Did I get that right?

Yep.
