Using Elasticsearch 1.6.0.
Hi, I'm running a bulk indexing process with the following Java/Vert.x code. I tried to bulk-index 192,000 documents, but only 20,000 got indexed. This used to work fine; I have indexed over 1.3 billion documents so far.
I'm looking at the bulk thread pool in Marvel:
Bulk Thread Pool Count: 30 per node
Bulk Thread Pool Reject: 0 per node
Bulk Thread Pool Ops/sec: 3 per node
Bulk Thread Pool Largest Count: 30 per node
Bulk Thread Pool Queue Size: 0 per node
None of the logs report any throttling.
I have 20TB of disk storage, of which 13TB is used (including replicas). That is about 65% full, so I'm well below the disk watermark.
What else can I check?
Index settings:
{
  "index" : {
    "refresh_interval" : "30s",
    "translog" : {
      "flush_threshold_size" : "1000mb"
    },
    "number_of_shards" : "8",
    "creation_date" : "1435262728426",
    "analysis" : {
      "analyzer" : {
        "default" : {
          "filter" : [
            "icu_folding"
          ],
          "type" : "custom",
          "tokenizer" : "keyword"
        }
      }
    },
    "number_of_replicas" : "1",
    "version" : {
      "created" : "1060099"
    },
    "uuid" : "KtNOhb4qS6eFjM8BUxc9HA"
  }
}
Java Bulk Code:
JsonObject body = message.body();
// final so the response listener further down can read documents.size()
final JsonArray documents = body.getArray("documents");
BulkRequestBuilder bulkRequest = client.prepareBulk();
final Context ctx = getVertx().currentContext();
for (int i = 0; i < documents.size(); i++) {
    final JsonObject obj = documents.get(i);
    // Wrap the raw document in the envelope the helper methods expect
    final JsonObject indexable = new JsonObject()
        .putString("action", "index")
        .putString("_index", obj.getString("index"))
        .putString("_type", obj.getString("type"))
        .putString("_id", obj.getString("id"))
        .putString("_route", obj.getString("routing"))
        .putObject("_source", obj);
    final String index = getRequiredIndex(indexable, message);
    if (index == null) {
        return;
    }
    // type is optional
    String type = indexable.getString(CONST_TYPE);
    JsonObject source = indexable.getObject(CONST_SOURCE);
    if (source == null) {
        sendError(message, CONST_SOURCE + " is required");
        return;
    }
    // id is optional
    String id = indexable.getString(CONST_ID);
    // routing is optional too, so guard against null as well as empty
    String route = indexable.getString(CONST_ROUTE);
    IndexRequestBuilder builder = client.prepareIndex(index, type, id).setSource(source.encode());
    if (route != null && !route.isEmpty()) {
        builder.setRouting(route);
    }
    bulkRequest.add(builder);
}
bulkRequest.execute(new ActionListener<BulkResponse>() {
    @Override
    public void onResponse(BulkResponse resp) {
        // Reports time taken, documents submitted, items returned and the failure flag
        message.reply(new JsonObject().putString("status",
            "Took: " + resp.getTookInMillis()
            + ", Indexed:" + documents.size() + "," + resp.getItems().length
            + ", Failed: " + resp.hasFailures()));
    }
    @Override
    public void onFailure(final Throwable t) {
        // Hop back onto the Vert.x context before replying with the error
        ctx.runOnContext(new Handler<Void>() {
            @Override
            public void handle(Void event) {
                sendError(message,
                    "Index error: " + t.getMessage(),
                    new RuntimeException(t));
            }
        });
    }
});
Basically, I collect a batch of Vert.x JsonObjects into an array and then send them to Elasticsearch in a single bulk request.
bulkRequest.execute() does not return any error, resp.hasFailures() is always false, and documents.size() always matches resp.getItems().length.
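One possibility that may be worth ruling out, as an assumption rather than a confirmed cause: the code above sets an explicit _id on every document, and an index operation whose index, type and id match an existing document replaces that document instead of adding a new one, while still being reported as a success in the bulk response. That would let the item count match the batch size even though the index grows by fewer documents. Below is a minimal sketch, using only the Java standard library, for counting repeated keys in a batch before it is sent. The class and method names are hypothetical, and each document is reduced to an {index, type, id} triple for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Elasticsearch or Vert.x.
public class DuplicateKeyCheck {

    /** Builds the key that identifies a document: index, type and id. */
    static String key(String index, String type, String id) {
        return index + "/" + type + "/" + id;
    }

    /**
     * Returns every key that occurs more than once in the batch, with its count.
     * Each element of batch is {index, type, id}. Entries with a null id are
     * skipped, because Elasticsearch generates a fresh id for those.
     */
    static Map<String, Integer> duplicates(List<String[]> batch) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String[] doc : batch) {
            if (doc[2] == null) {
                continue;
            }
            String k = key(doc[0], doc[1], doc[2]);
            Integer seen = counts.get(k);
            counts.put(k, seen == null ? 1 : seen + 1);
        }
        Map<String, Integer> dupes = new HashMap<String, Integer>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > 1) {
                dupes.put(e.getKey(), e.getValue());
            }
        }
        return dupes;
    }

    public static void main(String[] args) {
        List<String[]> batch = new ArrayList<String[]>();
        batch.add(new String[] {"logs", "event", "1"});
        batch.add(new String[] {"logs", "event", "2"});
        batch.add(new String[] {"logs", "event", "1"}); // same id as the first entry
        System.out.println(duplicates(batch)); // prints {logs/event/1=2}
    }
}
```

This only catches repeats inside one batch. Ids that collide with documents already in the index would not show up here; for those, the version reported on each bulk response item is a possible signal, since a freshly created document starts at version 1.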