How to run multiple indexing threads on a single machine?

I have written an Elasticsearch plugin (it analyzes the UserAgent string) that works the way I have in mind when I put it in a pipeline (see "Elastic Search | Yauaa - Yet Another UserAgent Analyzer").

Now, on my laptop, I want to load a rather large dataset (~100M records) into Elasticsearch via this plugin and check the results in Kibana.
I have put together some scripts (yauaa/devtools/analysis at main · nielsbasjes/yauaa · GitHub) that start Elasticsearch with the plugin installed, using Docker on my Ubuntu machine.
I then define the pipeline and load the data.

Functionally this works.

The problem is that I have been unable to make this pipeline run in multiple threads, which would reduce the time I have to wait. At the moment it only uses ~2-3 CPU cores, while my laptop has 12 (6 cores with hyperthreading).
I am running 8 bulk updates at a time, with ~100,000 records in each batch.
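
For reference, the loading pattern described above (8 bulk requests in flight, ~100,000 records each, routed through the pipeline) can be sketched roughly as below. This is only an illustration, not the actual script from the repository: the URL, the index name `useragents`, and the pipeline name `yauaa-pipeline` are hypothetical placeholders, and the records are assumed to already be in memory as a list of dicts. The `pipeline` query parameter on `_bulk` is what sends each document through the ingest pipeline.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

ES_URL = "http://localhost:9200"  # assumption: local single-node Docker setup
INDEX = "useragents"              # hypothetical index name
PIPELINE = "yauaa-pipeline"       # hypothetical pipeline name


def chunked(records, size):
    """Yield successive lists of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def to_bulk_body(records):
    """Build the NDJSON body for _bulk: an action line plus a source line per record."""
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps(record))
    # The bulk API requires the body to end with a newline.
    return ("\n".join(lines) + "\n").encode("utf-8")


def send_bulk(records):
    """POST one batch through the ingest pipeline; return the response's `errors` flag."""
    request = urllib.request.Request(
        f"{ES_URL}/{INDEX}/_bulk?pipeline={PIPELINE}",
        data=to_bulk_body(records),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["errors"]


def load(records, batch_size=100_000, workers=8):
    """Keep up to `workers` bulk requests in flight at the same time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send_bulk, chunked(records, batch_size)))
```

Changing `batch_size` and `workers` makes it easy to experiment with more, smaller requests in flight; whether that changes the CPU usage on the Elasticsearch side is exactly the open question here.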

What config setting can I change to get ES to actually use multiple threads for this pipeline so that it uses ~10 CPU cores?

The only way I can think of to influence this would be to run multiple nodes on the host.
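
If the multi-node route is tried, the loader would also have to spread its bulk requests over those nodes, since (as far as I understand) a node with the ingest role runs the pipeline for the requests it receives. A minimal sketch of round-robin assignment, assuming three local nodes on hypothetical ports 9200-9202:

```python
from itertools import cycle

# Hypothetical node addresses; adjust to however the extra nodes are exposed.
NODE_URLS = [
    "http://localhost:9200",
    "http://localhost:9201",
    "http://localhost:9202",
]


def assign_nodes(batch_count, node_urls):
    """Return the node URL each batch would be sent to, in round-robin order."""
    nodes = cycle(node_urls)
    return [next(nodes) for _ in range(batch_count)]
```

For example, `assign_nodes(4, NODE_URLS)` gives the first, second, and third node, then wraps around to the first again.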

Maybe someone else can comment though?

This clarifies a lot.
I'm going to try a different approach now.
Thanks!
