Hello,
I was working with the ES-Hadoop connector and I see that if your server is low on memory, writes keep getting dropped:
org.elasticsearch.hadoop.EsHadoopException: Could not write all entries (maybe ES was overloaded?). Bailing out...
As mentioned in "Pushback to hadoop from es on bulk load", there's no bi-directional communication between Hadoop and the connector: the connector cannot say "there's too much data, slow down."
Does anyone think it might be a good idea to use something like blocking queues here and add acks while writing (Kafka 101), so that the consumer (the thread on the ES side) can read at a slower pace?
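To make the idea concrete, here is a rough sketch of what I have in mind. It is plain Java and not connector code: the class name, the `run` method, the sentinel value and the 1 ms sleep standing in for a slow ES node are all made up for illustration. The producer blocks on a bounded queue when it is full, and a semaphore caps how many documents can be outstanding without an ack.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Semaphore;

// Hypothetical sketch, not part of ES-Hadoop: bounded queue plus acks.
public class BackpressureSketch {
    // Made-up sentinel telling the consumer that input is finished.
    private static final String END = "__END__";

    // Pushes `total` docs through the queue and returns how many were acked.
    public static int run(int total, int queueCapacity, int maxUnacked)
            throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(queueCapacity);
        Semaphore unacked = new Semaphore(maxUnacked);
        List<String> written = new ArrayList<>();

        // Consumer: stands in for the thread that writes to ES.
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String doc = queue.take();
                    if (doc.equals(END)) {
                        break;
                    }
                    Thread.sleep(1);      // simulate a slow write
                    written.add(doc);
                    unacked.release();    // ack: frees one slot for the producer
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Producer: stands in for the Hadoop task feeding the connector.
        for (int i = 0; i < total; i++) {
            unacked.acquire();            // waits while too many docs are unacked
            queue.put("doc-" + i);        // waits while the queue is full
        }
        queue.put(END);
        consumer.join();                  // after join, `written` is safe to read
        return written.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("acked: " + run(100, 10, 5));
    }
}
```

The point is that the producer slows down to the consumer's pace instead of failing. Whichever limit is smaller (queue capacity or unacked permits) is the one that actually throttles, so a real implementation would probably need only one of them.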
Otherwise we would have to tune the batch size, write speed and HTTP timeouts ourselves.
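For reference, these are the knobs I mean. The property names are the connector's documented settings as far as I know; the values are guesses for a small node, not recommendations, and the class and method names are made up. In a real job the entries would be copied into the Hadoop job configuration.

```java
import java.util.Properties;

// Hypothetical helper: collects the ES-Hadoop settings one would tune by hand.
public class EsHadoopTuningSketch {
    public static Properties conservativeSettings() {
        Properties p = new Properties();
        // Smaller bulk requests per task (illustrative values).
        p.setProperty("es.batch.size.entries", "500");
        p.setProperty("es.batch.size.bytes", "512kb");
        // Retry rejected bulk writes more often and wait longer between tries.
        p.setProperty("es.batch.write.retry.count", "6");
        p.setProperty("es.batch.write.retry.wait", "30s");
        // Give a busy node more time to answer.
        p.setProperty("es.http.timeout", "2m");
        return p;
    }

    public static void main(String[] args) {
        conservativeSettings().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```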
I'm open to building or contributing this if it's a good idea.