Hi there. I'm getting my feet wet with ES and doing a small POC. I created
an index with the default shard/replica values (I did not touch anything).
The only thing I changed was the analyzer:
"albums" : {
  "settings" : {
    "index.analysis.analyzer.text_en.filter.2" : "porterStem",
    "index.analysis.analyzer.text_en.filter.1" : "lowercase",
    "index.analysis.analyzer.text_en.tokenizer" : "standard",
    "index.analysis.analyzer.text_en.type" : "custom",
    "index.analysis.analyzer.text_en.filter.0" : "standard",
    "index.number_of_shards" : "5",
    "index.number_of_replicas" : "1",
    "index.version.created" : "190899"
  }
}
I'm trying to index a small number of documents using a BulkRequest, but
it takes forever (almost 1 minute). When it finishes, most of the requests
have failed: only 376 docs out of 1000 got indexed, and I see a lot
of UnavailableShardsException[[albums][2] [2] shardIt, [0] active : Timeout
waiting for [1m], request:
org.elasticsearch.action.bulk.BulkShardRequest@db766c1].
I would imagine this is related to the shard config. I'm only running
a single instance for now.
The code for the bulk load:
while (rs.next() && count < max) {
    count++;
    XContentBuilder content = createContent(rs);
    bulkRequest.add(client.prepareIndex("albums", "album").setSource(content));
    if (count % 1000 == 0) {
        System.out.println("commit");
        BulkResponse bulkResponse = bulkRequest.execute().actionGet();
        if (bulkResponse.hasFailures()) {
            for (BulkItemResponse r : bulkResponse.items()) {
                System.out.println(r.getFailureMessage());
            }
            System.out.println("ERROR");
        }
    }
}
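For reference, the batching pattern this loop is aiming for can be sketched
without the ES client at all. This is only a minimal sketch: BatchFlushSketch,
flushInBatches and the sink callback are hypothetical names for illustration,
not part of the Elasticsearch API. It sends a batch every batchSize items,
starts a fresh batch after each send, and sends whatever is left over at the
end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not part of the Elasticsearch client.
class BatchFlushSketch {

    // Hands `items` to `sink` in chunks of at most `batchSize` elements
    // and returns how many chunks were sent.
    static <T> int flushInBatches(Iterable<T> items, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int flushes = 0;
        List<T> batch = new ArrayList<>(batchSize);
        for (T item : items) {
            batch.add(item);
            if (batch.size() == batchSize) {
                sink.accept(batch);
                flushes++;
                // Start a fresh batch so earlier items are not sent twice.
                batch = new ArrayList<>(batchSize);
            }
        }
        // Send the trailing partial batch, if any.
        if (!batch.isEmpty()) {
            sink.accept(batch);
            flushes++;
        }
        return flushes;
    }
}
```

In the real loader, the sink would presumably be the place where the bulk
request is built, executed and checked for failures.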
How can I make it faster for a single node POC?
Regards