Pushback to Hadoop


#1

When we load data from Hadoop into Elasticsearch, we keep seeing errors like this in the tasks:
org.elasticsearch.hadoop.EsHadoopException: Could not write all entries [99/347072] (maybe ES was overloaded?). Bailing out...

Since our Hadoop cluster can read/write data at an enormous rate, I am not surprised that our (much smaller) Elasticsearch cluster cannot keep up. Fair enough. So this question is not about optimizing Elasticsearch for faster indexing.

My question is: why can't Elasticsearch apply some kind of pushback to slow the Hadoop job down to a rate that Elasticsearch can handle? It seems Elasticsearch will happily keep accepting data at a rate it simply cannot sustain...
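In the meantime, one workaround may be to tune the connector's write throttling on the Hadoop side. A hedged sketch of es-hadoop settings that, as far as I understand, control batch size and retry behavior when Elasticsearch rejects bulk requests (values below are illustrative, not recommendations):

```properties
# Shrink each bulk request so a busy cluster has less to absorb at once
es.batch.size.entries = 500
es.batch.size.bytes = 1mb

# Retry rejected bulk documents instead of bailing out immediately,
# waiting between attempts to give Elasticsearch time to recover
es.batch.write.retry.count = 10
es.batch.write.retry.wait = 30s
```

This doesn't give true end-to-end backpressure (the Hadoop tasks still generate data at full speed), but the retry wait effectively stalls each task when Elasticsearch starts rejecting writes, which may be close enough in practice.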


#2

Oh, I just realized there is a forum for Hadoop-related stuff. Moving this over to that one...

