I am trying to load data into Elasticsearch; here is the code I use:
// successCnt, failedCnt, phaser, reader, buff, counter, batchNum and params are fields defined elsewhere
public static void loadJSONFileAsync(RestClient restClient, String index, String type, String jsonFileName) throws IOException {
    String postData = String.format("/%s/_bulk", index);
    ResponseListener rl = new ResponseListener() {
        @Override
        public void onSuccess(Response response) {
            successCnt++;
            phaser.arriveAndDeregister();
        }

        @Override
        public void onFailure(Exception exception) {
            if (exception != null) {
                System.out.println(exception.getMessage());
            }
            failedCnt++;
            phaser.arriveAndDeregister();
        }
    };

    while ((line = reader.readLine()) != null) {
        // reading the file here
        buff = readFromFile();
        if (counter == BATCH_SIZE) {
            entity = new NStringEntity(buff.toString(), ContentType.APPLICATION_JSON);
            try {
                // register BEFORE submitting: if the response arrives before register(),
                // the callback's arriveAndDeregister() races the registration
                phaser.register();
                restClient.performRequestAsync("POST", postData, params, entity, rl);
            } catch (Exception e) {
                e.printStackTrace();
                System.exit(-1);
            }
            batchNum++;
            buff = new StringBuffer();
        }
    }
    phaser.arriveAndAwaitAdvance(); // wait for all in-flight batches
    reader.close();
}
What I do here is build a bulk request and use a Phaser to track the concurrent async requests, so that we wait until all of them have finished.
There are several problems here. The first is that the Elasticsearch server instance eventually seems to choke on the data (I think it happens when I have 13-14 concurrent requests in flight, each carrying 10k records).
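One way to keep the server from being overwhelmed is to cap how many requests may be in flight at once, e.g. with a Semaphore alongside the Phaser. A minimal, stdlib-only sketch of the idea (the executor and sleep stand in for performRequestAsync and the server round-trip; MAX_IN_FLIGHT is an assumed tuning knob, not anything from the real code):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.concurrent.Semaphore;

public class ThrottleSketch {
    // assumed tuning knob: how many bulk requests may be pending at once
    static final int MAX_IN_FLIGHT = 4;
    static final Semaphore inFlight = new Semaphore(MAX_IN_FLIGHT);

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        Phaser phaser = new Phaser(1); // party 0 is the main thread
        for (int batch = 0; batch < 20; batch++) {
            inFlight.acquire();        // blocks once MAX_IN_FLIGHT requests are pending
            phaser.register();         // register BEFORE submitting the request
            pool.submit(() -> {
                try {
                    Thread.sleep(10);  // stands in for the async bulk call + server work
                } catch (InterruptedException ignored) {
                } finally {
                    // in the real code this would live in onSuccess/onFailure
                    inFlight.release();
                    phaser.arriveAndDeregister();
                }
            });
        }
        phaser.arriveAndAwaitAdvance(); // wait for every batch to complete
        pool.shutdown();
        System.out.println("done, permits back to " + inFlight.availablePermits());
    }
}
```

The reader thread then self-throttles: it stalls on acquire() whenever the server is already working on MAX_IN_FLIGHT batches, instead of piling up more requests.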
The second problem is when this is called:
public void onFailure(Exception exception) {
sometimes the exception is null. Is that supposed to happen?
And another question: how do you deal with situations like this? I can reduce the number of parallel requests and it seems to work OK, but I guess that is not guaranteed, and if the server is under higher load it can still fail. I don't want to keep the data around on my side in order to do retries; is there a way to retry automatically when a request fails?
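As far as I know, the low-level RestClient will not re-send a failed request for you, so retrying means holding on to the batch payload until it is acknowledged (the higher-level BulkProcessor in the Elasticsearch Java client does offer a backoff policy for rejected bulk requests, which may be the easier route). A sketch of what a retry wrapper could look like, with a hypothetical fakeBulkAsync standing in for restClient.performRequestAsync (here it deliberately fails the first two attempts):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RetrySketch {
    // counts attempts; the fake client below fails the first two
    static final AtomicInteger attempts = new AtomicInteger();

    interface Listener { void onSuccess(); void onFailure(Exception e); }

    // hypothetical stand-in for restClient.performRequestAsync(...)
    static void fakeBulkAsync(String batch, Listener l) {
        if (attempts.incrementAndGet() < 3) {
            l.onFailure(new RuntimeException("es_rejected_execution_exception"));
        } else {
            l.onSuccess();
        }
    }

    // keep the batch payload only until it succeeds, re-submitting on failure
    static void sendWithRetry(String batch, int retriesLeft) {
        fakeBulkAsync(batch, new Listener() {
            public void onSuccess() {
                System.out.println("indexed after " + attempts.get() + " attempts");
            }
            public void onFailure(Exception e) {
                if (retriesLeft > 0) {
                    try { Thread.sleep(100); } catch (InterruptedException ignored) {} // crude backoff
                    sendWithRetry(batch, retriesLeft - 1);
                } else {
                    System.out.println("giving up: " + e.getMessage());
                }
            }
        });
    }

    public static void main(String[] args) {
        sendWithRetry("{\"index\":{}}\n{\"field\":1}\n", 5);
    }
}
```

The point is that only the not-yet-acknowledged batch needs to stay in memory, not the whole file, so the storage cost of retrying is bounded by the number of in-flight batches.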
I could also use synchronous requests; it looks like there is not that much difference in total time.