Implementing ES index creation and search in parallel on the same node

Hi guys. I am trying to understand Elasticsearch. My code launches 35 parallel Python scripts that all try to create indices on the ES cluster simultaneously. However, the index creation happens serially, and I have been unable to get any parallelism to make my code run faster.

Is this because ES does not allow multiple scripts to index at once?

Indexing can be done in parallel by multiple threads, but creating an index has to update the cluster state and therefore needs to be serialised through the master node.
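To illustrate the point above, here is a minimal sketch of the pattern it suggests: create one index up front (a single serialised step), then let several threads index batches of documents into it at the same time. The names `index_batch`, `index_in_parallel` and the index name `records` are hypothetical, and `index_batch` is only a stand-in for whatever bulk indexing call your client provides; it is not part of any Elasticsearch API.

```python
from concurrent.futures import ThreadPoolExecutor


def index_batch(index_name, docs):
    """Hypothetical stand-in for a bulk indexing call.

    Real code would send `docs` to the existing index `index_name`
    here. This placeholder only reports how many documents it got.
    """
    return len(docs)


def index_in_parallel(index_name, batches, workers=4, send=index_batch):
    """Send every batch to one existing index from a pool of threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker thread handles one batch at a time.
        counts = pool.map(lambda docs: send(index_name, docs), batches)
        return sum(counts)


# Four small batches of five documents each, all going to the same index.
batches = [[{"id": i} for i in range(start, start + 5)]
           for start in range(0, 20, 5)]
total = index_in_parallel("records", batches)
print(total)  # 20
```

The index is assumed to exist before the pool starts, so the threads never trigger a cluster state update themselves; only the document writes run concurrently.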

But in my code, index creation and searching are linked together. So is there nothing I can do in this case?

Creating indices is usually not done very frequently, so the fact that the cluster state needs to be updated is generally not a problem. What exactly is it you are trying to achieve?

I do not need to store the data on the node. So I create an index of 10k records, run the search on it, and then delete it before creating the next index. Since the match query does not give me all the results of the search, I index 10k records at a time and retrieve all the results before moving on to create another index.

For example, if the data size is 20k, I create 2 indices in a for-loop, deleting the first one after its search is done.
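The batching described here can be sketched as follows. This only shows the splitting arithmetic (20k records in chunks of 10k gives 2 batches); the function name `split_into_batches` is hypothetical, and the create, search and delete steps for each batch are left out because the original code was not posted.

```python
def split_into_batches(records, batch_size=10_000):
    """Yield consecutive slices of `records`, each at most `batch_size` long."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


# 20k placeholder records, standing in for the real data.
records = list(range(20_000))
batches = list(split_into_batches(records))
print(len(batches))               # 2
print([len(b) for b in batches])  # [10000, 10000]
```

In the workflow above, each batch would get its own temporary index, so the number of index creations (and cluster state updates) grows with the data size.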
