Advice on Scaling writes


Impressed with the 15K/s with 1 Filebeat. Curious what your settings / config are.

For large ingest indices, keep Primary Shards <= Data Nodes, and make them equal to the number of data nodes when the index is large (within reason). For smaller indices, just leave the defaults. In this case, if you are running 2 nodes I would try 2 primaries; with 3 nodes, 3 primaries.
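A minimal sketch of that rule of thumb, in case it helps to see it spelled out. The function name `suggest_primary_shards` and its parameters are made up for illustration, and the "within reason" upper bound is not modeled here.

```python
def suggest_primary_shards(data_nodes: int, large_index: bool,
                           default_primaries: int = 1) -> int:
    """Rule of thumb from this thread: primaries <= data nodes,
    equal to the node count for large ingest indices."""
    if data_nodes < 1:
        raise ValueError("data_nodes must be at least 1")
    if not large_index:
        # Small index: leave the default (1 primary since Elasticsearch 7.0).
        return default_primaries
    # Large ingest index: one primary per data node so every node takes writes.
    return data_nodes


for nodes in (2, 3):
    primaries = suggest_primary_shards(nodes, large_index=True)
    print(f"{nodes} data nodes -> {primaries} primaries")
```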

Thanks. I will have very large indices in the long run, but was thinking of using rollover to avoid having to scale the shards.

No specific setting other than the ones you suggested earlier in this post. Likely the bulk_max_size and queue.mem helped.
Also, the content I have is composed of many small files (from 3 KB to 4 MB maximum), so that might help with parallelisation.
I'm currently using the CSV processor, but might shift to using the Winston logger for Node.js so I don't need a processor at all (which might increase perf a little as well).
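For anyone landing here later, this is roughly which Filebeat keys are meant by bulk_max_size and queue.mem. It is only a sketch: the numbers are placeholders, not the values used in this thread, and the helper names are hypothetical. The script prints dotted-key lines that can be pasted into filebeat.yml.

```python
# Placeholder values for illustration only; NOT the settings used in this thread.
tuning = {
    "queue.mem.events": 8192,                    # events the memory queue can hold
    "queue.mem.flush.min_events": 2048,          # events buffered before a flush
    "output.elasticsearch.bulk_max_size": 2048,  # max events per bulk request
}


def check_tuning(settings: dict) -> None:
    """Sanity check: the queue must be able to hold at least one full bulk batch."""
    if settings["queue.mem.events"] < settings["output.elasticsearch.bulk_max_size"]:
        raise ValueError("queue.mem.events is smaller than bulk_max_size")


def to_filebeat_yaml(settings: dict) -> str:
    """Render the settings as dotted-key YAML lines."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


check_tuning(tuning)
print(to_filebeat_yaml(tuning))
```

Start from the defaults and raise one value at a time while watching the indexing rate, since the right numbers depend on event size and cluster capacity.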


2 shards provide more parallelism than a single shard if you have 2 nodes; that is the only point. If you get enough throughput with 1 shard, that is fine.

You can use ILM.
Set Shard Size to 50GB
Set Primaries to 2 or 1
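A rough sketch of what that could look like as request bodies for `PUT _ilm/policy/<name>` and `PUT _index_template/<name>`. The names `logs-app-policy`, `logs-app` and the pattern `logs-app-*` are hypothetical. It also assumes a version where the rollover action supports `max_primary_shard_size` (7.13 or later); on older versions `max_size` is the closest option, but it measures the total size of all primaries in the index.

```python
import json


def build_ilm_policy(max_primary_shard_size: str = "50gb") -> dict:
    """ILM policy body: roll over once a primary shard reaches the given size."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {"max_primary_shard_size": max_primary_shard_size}
                    }
                }
            }
        }
    }


def build_index_template(pattern: str, policy_name: str,
                         rollover_alias: str, primaries: int = 2) -> dict:
    """Composable index template body that attaches the policy to new indices."""
    return {
        "index_patterns": [pattern],
        "template": {
            "settings": {
                "index.number_of_shards": primaries,
                "index.lifecycle.name": policy_name,
                "index.lifecycle.rollover_alias": rollover_alias,
            }
        },
    }


# Hypothetical names, used only to show the shape of the two bodies.
policy = build_ilm_policy()
template = build_index_template("logs-app-*", "logs-app-policy", "logs-app")
print(json.dumps(policy, indent=2))
print(json.dumps(template, indent=2))
```

With rollover in place, each new index starts with the same small number of primaries, so the shard count does not need to grow as the data grows.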