Scaling ES indexing CPU usage

shaharmor · January 24, 2016, 9:40pm

Hey,

Is it possible to add ES servers that take on the CPU cost of indexing, and then just send the indexed data back to the server that actually holds the shard?

Or if I try that, will it just forward the request to the other server?

warkolm · January 24, 2016, 9:46pm

Data is sent to the node that has the appropriate shard.
The only way around this is to use a hot/cold (hot/warm) architecture, as described here: https://www.elastic.co/blog/hot-warm-architecture
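
For readers following along, here is a rough sketch of why an extra node cannot take over the indexing work for a shard it does not hold. It assumes the documented default routing rule, roughly shard = hash(_routing) % number_of_primary_shards. The crc32 hash is only a stand-in (Elasticsearch uses Murmur3), and the node names and `SHARD_TO_NODE` mapping are hypothetical.

```python
import zlib

# Hypothetical allocation: which node holds each primary shard.
SHARD_TO_NODE = {0: "node-a", 1: "node-b", 2: "node-c"}


def pick_shard(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value (by default the document id) to a shard number.

    crc32 is a stand-in for illustration; it will not reproduce the
    shard numbers a real cluster computes.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


def route(doc_id: str, shard_to_node: dict) -> tuple:
    """Return (shard, node) that ends up indexing the document."""
    shard = pick_shard(doc_id, len(shard_to_node))
    return shard, shard_to_node[shard]


if __name__ == "__main__":
    for doc_id in ("doc-1", "doc-2", "doc-3"):
        shard, node = route(doc_id, SHARD_TO_NODE)
        print(f"{doc_id} -> shard {shard} on {node}")
```

Whichever node receives the request, the document is forwarded to the node that `route` picks, so the indexing CPU is spent there.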

shaharmor · January 24, 2016, 9:58pm

Yeah, that's what we're doing now, but it's not fast enough.

warkolm · January 24, 2016, 10:12pm

Then add more resources.

klahnakoski · January 25, 2016, 12:13am

In my limited experience, the node with the primary shard does the CPU-intensive indexing work for that shard; the node with the most primary shards has the most CPU usage while indexing. Getting the primary shards to distribute evenly among all your machines is a challenge. I do use the hot/cold architecture: specifically, my master nodes have no data. I only wish I had not had to learn it the hard way.

warkolm · January 25, 2016, 12:17am

Depends, do you have replicas? Because a replica does the same amount of work as a primary.
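
To make the replica point concrete, here is a minimal sketch that counts how many shard copies each node has to index into. It assumes, as stated above, that a replica costs about the same as a primary. The `ALLOCATION` list and node names are made up for illustration; on a real cluster the layout could be read from the `_cat/shards` API.

```python
from collections import Counter

# Hypothetical layout: (shard number, role, node). Two shards, one replica each.
ALLOCATION = [
    (0, "primary", "node-a"),
    (1, "primary", "node-a"),
    (0, "replica", "node-b"),
    (1, "replica", "node-b"),
]


def indexing_load(allocation, count_replicas=True):
    """Count the shard copies each node indexes into.

    With count_replicas=False only primaries are counted, which is the
    view you get if you assume primaries do all the work.
    """
    load = Counter()
    for _shard, role, node in allocation:
        if role == "primary" or count_replicas:
            load[node] += 1
    return load


if __name__ == "__main__":
    print("primaries only:", dict(indexing_load(ALLOCATION, count_replicas=False)))
    print("with replicas: ", dict(indexing_load(ALLOCATION)))
```

Counting primaries only, node-a looks like it carries everything; once replicas are counted, both nodes carry the same number of shard copies.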
