I am planning to run the Elasticsearch Python client on each of my EC2 servers (around 50-60 servers), all sending data to a single ES cluster.
Each Python client will send one bulk JSON request per second to the cluster, so about 50-60 bulk index requests per second in total.
Each bulk request can contain up to ~500 documents, or roughly 3-4 MB of JSON. Assume a cluster of 20 m4.large nodes, or possibly more.
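
To put rough numbers on the aggregate load, here is a back-of-the-envelope calculation from the figures above. The variable names are only for illustration, and the per-node figures assume the load is spread evenly across all 20 nodes, which may not hold in practice.

```python
# Aggregate load estimate, using only the figures stated in the question.
CLIENTS_LOW, CLIENTS_HIGH = 50, 60   # EC2 servers, one Python client each
DOCS_PER_BULK = 500                  # upper bound on documents per bulk request
MB_LOW, MB_HIGH = 3, 4               # approximate bulk payload size in MB
NODES = 20                           # planned cluster size

# Each client sends one bulk request per second.
docs_per_sec = (CLIENTS_LOW * DOCS_PER_BULK, CLIENTS_HIGH * DOCS_PER_BULK)
mb_per_sec = (CLIENTS_LOW * MB_LOW, CLIENTS_HIGH * MB_HIGH)

# Assumption: requests are distributed evenly over the nodes.
mb_per_node = (mb_per_sec[0] / NODES, mb_per_sec[1] / NODES)

print("docs/s (cluster):", docs_per_sec)
print("MB/s (cluster):", mb_per_sec)
print("MB/s per node:", mb_per_node)
```

So the cluster would need to absorb up to about 30,000 documents and 150-240 MB of bulk payload per second.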
My questions are:
- How will the ES cluster load balance the requests coming from the different Python clients?
- With requests arriving this frequently from so many Python clients, how will that impact my system?
- Calling elasticsearch.index from the Python client vs. using curl against the endpoint: which is better?
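
For concreteness, this is roughly what each client would do once per second. It is only a minimal sketch: it assumes the official `elasticsearch` Python package and its `helpers.bulk` function, and the index name `metrics` and the function names are hypothetical placeholders, not part of my actual setup.

```python
def build_actions(docs, index_name="metrics"):
    """Wrap plain dicts as bulk actions ("metrics" is a hypothetical index name)."""
    return [{"_index": index_name, "_source": doc} for doc in docs]


def make_client(hosts):
    """Create one client per process; hosts is a list of node URLs."""
    # Imported lazily so build_actions can be used without the library installed.
    from elasticsearch import Elasticsearch
    return Elasticsearch(hosts)


def send_bulk(client, docs, index_name="metrics"):
    """Send one batch (up to ~500 docs) as a single bulk request."""
    from elasticsearch import helpers
    # helpers.bulk returns a (success_count, errors) tuple.
    return helpers.bulk(client, build_actions(docs, index_name))
```

Each server would create the client once with `make_client([...])`, passing the URLs of the cluster nodes, and then call `send_bulk(client, batch)` every second with its current batch of documents.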