Scaling ES indexing CPU usage


(Shahar Mor) #1

Hey,

Is it possible to start extra ES servers that take on the CPU work of indexing, and then just send the indexed data back to the server that actually holds the shard?

Or if I try that, will it just forward the request to the other server?


(Mark Walkom) #2

Data is sent to the node that holds the appropriate shard, and that node does the indexing.
The only way around this is to use a hot/warm architecture, like so: https://www.elastic.co/blog/hot-warm-architecture


(Shahar Mor) #3

Yeah, that's what we're doing now, but it's not fast enough.


(Mark Walkom) #4

Then add more resources.


(Kyle Lahnakoski) #5

In my limited experience, the node holding the primary shard does the CPU-intensive indexing work for that shard, so the node with the most primary shards has the highest CPU usage while indexing. Getting the primary shards to distribute evenly among all your machines is a challenge. I do use the hot/warm architecture; specifically, my master nodes hold no data. I only wish I had not had to learn that the hard way.
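
To see how evenly primaries are spread, one option is to count them per node from the text output of `GET _cat/shards`. The following is only a minimal sketch: it assumes the default column order (index, shard, prirep, state, docs, store, ip, node), and the `SAMPLE` text, node names, and the `shards_per_node` helper are hypothetical, made up for illustration.

```python
from collections import Counter

# Hypothetical sample, assuming the default column layout of GET _cat/shards:
# index shard prirep state docs store ip node
SAMPLE = """\
logs 0 p STARTED 1200 5mb 10.0.0.1 node-a
logs 0 r STARTED 1200 5mb 10.0.0.2 node-b
logs 1 p STARTED 1100 4mb 10.0.0.1 node-a
logs 1 r STARTED 1100 4mb 10.0.0.3 node-c
logs 2 p STARTED 1300 6mb 10.0.0.2 node-b
logs 2 r UNASSIGNED
"""


def shards_per_node(cat_shards_text, prirep="p"):
    """Count started shards of one kind ("p" or "r") on each node."""
    counts = Counter()
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Skip blank lines and shards that are not started; unassigned
        # shards have no docs/store/ip/node columns at all.
        if len(fields) < 8 or fields[3] != "STARTED":
            continue
        if fields[2] == prirep:
            # Join the tail in case a node name contains spaces.
            counts[" ".join(fields[7:])] += 1
    return counts


if __name__ == "__main__":
    for node, count in shards_per_node(SAMPLE).most_common():
        print(node, count)
```

A node that sits well above the others in this count is a likely candidate for the indexing CPU hotspot described above. Passing `prirep="r"` counts replica copies instead.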


(Mark Walkom) #6

That depends: do you have replicas? A replica does the same amount of indexing work as a primary.

