We are using ES 2.1. I am not sure whether this is related to ES or not. I initially build a large list, like this:
for(Object object: objects) {
Result result = compute(object);
list.add(result);
}
Then I iterate over the list again, get the details of each item, and index it. Right now, getting the details for each result requires a service call, and indexing starts only after the response comes back. So indexing takes a lot of time overall.
My question is: if I run this in parallel with multiple threads, would that be a good solution?
Does that affect Elasticsearch indexing time? I know this is more of a Java question than an ES one. Looking forward to your input.
Start by switching to the bulk API. Do you have a full understanding of what's taking time? Is it the indexing requests or obtaining the information to index? Could the latter be done more efficiently, e.g. by not looking up items one by one?
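To illustrate the "not one by one" idea, here is a minimal Java sketch that groups items into batches and fetches details once per batch. It assumes the detail service can accept several items per call, which the thread does not confirm. `BatchedLookup`, `partition`, `fetchDetails` and the `fetchBatch` function are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class BatchedLookup {
    /** Splits items into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    /**
     * Fetches details one batch at a time instead of one item at a time.
     * fetchBatch is a hypothetical stand-in for a service call that
     * accepts several items and returns their details.
     */
    static <T, D> List<D> fetchDetails(List<T> items, int batchSize,
                                       Function<List<T>, List<D>> fetchBatch) {
        List<D> details = new ArrayList<>();
        for (List<T> batch : partition(items, batchSize)) {
            details.addAll(fetchBatch.apply(batch));
        }
        return details;
    }
}
```

With 10,000 items and a batch size of 500, this makes 20 service calls instead of 10,000. Whether that helps depends on the service actually supporting multi-item lookups.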
I am already using the bulk API.
I think most of the time is spent obtaining the information to index. My question: if I use multiple threads to index it, will that affect Elasticsearch?
I'm not sure exactly what you're asking, but issuing concurrent bulk indexing requests should be fine as long as the concurrency doesn't become too great (at which point ES will start rejecting requests when its thread pools are exhausted).
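One way to keep the concurrency bounded on the client side is a fixed-size thread pool, so that no more than a chosen number of bulk requests are in flight at once. The sketch below is only an illustration under that assumption. `BoundedBulkIndexer`, `indexAll` and `sendBulk` are hypothetical names, and `sendBulk` stands in for whatever code issues a single bulk request with your client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class BoundedBulkIndexer {
    /**
     * Sends every batch through sendBulk using at most maxConcurrent
     * worker threads, then waits for all of them to finish.
     * sendBulk is a hypothetical callback that issues one bulk request.
     */
    static <T> void indexAll(List<List<T>> batches, int maxConcurrent,
                             Consumer<List<T>> sendBulk) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<T> batch : batches) {
                // Batches queue up inside the pool; only maxConcurrent run at once.
                pending.add(pool.submit(() -> sendBulk.accept(batch)));
            }
            for (Future<?> f : pending) {
                // get() throws ExecutionException if the callback failed.
                f.get();
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

A small `maxConcurrent` (a handful of threads) is a reasonable starting point. If ES starts rejecting requests, lowering this number or the batch size is the first thing to try.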
Well, yes. My question is about how ES handles multiple parallel threads for indexing. Is there any documentation? How can I control or debug that?