Data ingestion to cluster from multiple sources


I am starting to test my new 3-node ES cluster, and this is my first experience with Elasticsearch.

In production, the cluster will receive data in real time from 7 sources, all writing to the same index.

I expect about 1M entries per data source per day, so about 7M entries daily in total.
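For a rough sense of scale, the figures above can be turned into an average indexing rate. This is only a back-of-the-envelope sketch: it assumes the entries arrive evenly over 24 hours, which real-time sources rarely do, so peak rates may be considerably higher. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate(sources: int, entries_per_source_per_day: int) -> float:
    """Average entries per second, assuming a perfectly even arrival rate."""
    total_per_day = sources * entries_per_source_per_day
    return total_per_day / SECONDS_PER_DAY


# 7 sources, about 1M entries each per day (figures from the post above)
rate = average_rate(7, 1_000_000)
print(f"{rate:.1f} entries/s on average")  # prints "81.0 entries/s on average"
```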

What is the best approach to distribute the load?

I was thinking of pointing the data sources at different nodes, for example 3 sources to the master, 2 to node 2, and 2 to node 3.

The alternative would be to set up a load balancer in front of ES.

Any comments or suggestions are much appreciated.


May I suggest you look at the following resources about sizing:

In particular, look at the slides I linked to, starting from slide 49.

If you have 3 nodes in the cluster and they are hosted on the same hardware, I would recommend setting them all up with the default configuration (master/data) and distributing the load evenly across them.
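As a minimal sketch of what "evenly" could look like for 7 sources and 3 nodes, the snippet below assigns sources to nodes in round-robin order. The node URLs and source names are hypothetical placeholders, and this only illustrates a static per-source assignment. Giving each client the addresses of all three nodes, or putting a load balancer in front, would instead spread individual requests rather than whole sources.

```python
from collections import Counter
from itertools import cycle

# Hypothetical node addresses and source names, for illustration only.
NODES = ["http://node1:9200", "http://node2:9200", "http://node3:9200"]
SOURCES = [f"source-{i}" for i in range(1, 8)]  # source-1 .. source-7


def assign_round_robin(sources, nodes):
    """Map each source to a node, cycling through the nodes in order."""
    node_cycle = cycle(nodes)
    return {source: next(node_cycle) for source in sources}


assignment = assign_round_robin(SOURCES, NODES)
per_node = Counter(assignment.values())

for node in NODES:
    print(node, per_node[node])  # 3 sources on node1, 2 each on node2 and node3
```

With 7 sources and 3 nodes the split cannot be perfectly even, so one node ends up with 3 sources and the other two with 2 each. If the sources differ a lot in volume, balancing per request rather than per source would even this out.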

Thank you, I will go this way.

I will read these docs.

Thank you.
