Implementing ES index creation and search in parallel on the same node

Bhargavi_Sri · April 28, 2017, 5:54am

Hi guys. I am really trying to understand Elasticsearch. My code launches 35 Python scripts in parallel, and all of them try to create indices on the ES cluster simultaneously. But the index creation is happening serially, so I am unable to get the parallelism I need to make my code run faster.

Is this because ES does not allow multiple scripts to index at once?

Christian_Dahlqvist · April 28, 2017, 6:34am

Indexing can be done in parallel by multiple threads, but creating an index has to update the cluster state and therefore needs to be serialised through the master node.
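
To illustrate the distinction above, here is a minimal Python sketch of the pattern: the index is created once, outside the parallel section, and only the document indexing is spread across threads. The functions `create_index_once` and `index_batch` and the index name `records` are hypothetical in-memory stand-ins, not Elasticsearch client calls; in real code they would be replaced by the corresponding client requests.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

_lock = Lock()

def create_index_once(created, name):
    """Hypothetical stand-in for one index-creation request (a cluster state update)."""
    if name not in created:
        created.add(name)

def index_batch(store, name, batch):
    """Hypothetical stand-in for one bulk indexing request against an existing index."""
    with _lock:  # protects the shared in-memory list used by this sketch
        store.extend((name, doc) for doc in batch)
    return len(batch)

def run(docs, index_name="records", batch_size=1000, workers=4):
    created, store = set(), []
    # Step 1: create the index a single time, before any parallel work starts.
    create_index_once(created, index_name)
    # Step 2: split the documents into batches and index the batches concurrently.
    batches = [docs[i:i + batch_size] for i in range(0, len(docs), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(lambda b: index_batch(store, index_name, b), batches))
    return created, store, sum(counts)
```

The design choice here is simply to keep the serialised step (index creation) out of the worker threads, so that only work that can run concurrently is submitted to the pool.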

Bhargavi_Sri · April 28, 2017, 7:12am

But in my code, index creation and searching are linked together. So is there nothing I can do in this case?

Christian_Dahlqvist · April 28, 2017, 7:36am

Creating indices is usually not done very frequently, so the fact that the cluster state needs to be updated is generally not a problem. What exactly is it you are trying to achieve?

Bhargavi_Sri · April 28, 2017, 7:44am

I do not need to store the data on the node. So I am creating indices of 10k records each, performing the search on each one, and then deleting it before I create the next index. Since the match query is not able to give me all the results of the search query, I am creating the index on 10k records and retrieving all the results before I move on to create another index.

For example, if the data size is 20k, I am creating 2 indices through a for-loop, deleting the first one after its search is done.
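
The loop described above can be sketched roughly as follows. This is only an illustration of the create, search, delete cycle, not the actual script: `create`, `search`, and `delete` are hypothetical callables supplied by the caller, and the index names `batch-0`, `batch-1`, ... are assumptions made for the example.

```python
def chunks(records, size=10_000):
    """Yield consecutive slices of `records` with at most `size` items each."""
    for start in range(0, len(records), size):
        yield records[start:start + size]

def process_in_batches(records, create, search, delete, size=10_000):
    """Create one temporary index per chunk, search it, then delete it."""
    results = []
    for n, batch in enumerate(chunks(records, size)):
        name = f"batch-{n}"  # hypothetical naming scheme
        create(name, batch)
        try:
            results.extend(search(name))
        finally:
            # Delete the temporary index even if the search raised an error.
            delete(name)
    return results
```

With 20k records and the default chunk size, the loop runs twice, and each temporary index is deleted before the next one is created.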

system · May 26, 2017, 7:58am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
