I have a cluster with 8 nodes. I want to index a lot of docs. For the best performance,
how many workers (to parallel bulk indexing) must run on each node?
It depends on a number of factors, e.g. document size, number of shards indexed into, mappings, hardware and bulk size. I would recommend running a benchmark with as realistic data and settings as possible.
If N is the best number, I must send N parallel bulk to one node or split them between all nodes?
It will be the optimum for the way you tested it. I would recommend sending bulk requests to all data nodes that do indexing though.
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.