Higher CPU vs Higher Memory. Which helps in what cases?

mattwelke · June 14, 2021, 3:58pm

@Aurel_Drejta Thanks for following up here. I subscribed via email so I could lurk. I'm not surprised that disabling caching improved the quality of your benchmarks. One thing I think you should watch out for though is the file system cache. My understanding is that Elasticsearch passively takes advantage of the operating system file cache. So even if you're disabling any active caching that Elasticsearch is doing with the request JSON as the key, it probably won't disable the file system caching, using (I presume) each individual file system path as key. Someone from Elastic could probably step in to confirm whether this is still affecting your benchmark. To be honest, even if it is still affecting it, I don't know how one would account for that. As far as I know, file system caching at the OS level can't be disabled.

Aurel_Drejta · June 14, 2021, 7:00pm

Well, disabling caching didn't really improve anything in my case. It just meant I got real results when testing the queries in different configurations, instead of results that depended on caching.

If caching is not disabled and I run the same query twice, the second run returns almost immediately (200-300 ms).

That's why caching led me to the wrong assumption that highlighting was the culprit, when in fact it wasn't.
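For anyone repeating this kind of benchmark, here is a minimal sketch of how the URLs could be built. The thread doesn't say which cache was disabled, so this assumes it was the shard request cache, which Elasticsearch lets you switch off per request with the `request_cache` query parameter and flush with the `_cache/clear` endpoint. The host `http://localhost:9200` and the index name `my-index` are placeholders, and the helper names are made up for illustration.

```python
from urllib.parse import urlencode


def search_url(base_url: str, index: str, use_request_cache: bool) -> str:
    """Build a _search URL that sets the shard request cache flag explicitly."""
    params = {"request_cache": "true" if use_request_cache else "false"}
    return f"{base_url.rstrip('/')}/{index}/_search?{urlencode(params)}"


def clear_cache_url(base_url: str, index: str) -> str:
    """Build the URL to POST to in order to clear the caches of one index."""
    return f"{base_url.rstrip('/')}/{index}/_cache/clear"


# Placeholder host and index name, not taken from the thread.
print(search_url("http://localhost:9200", "my-index", use_request_cache=False))
print(clear_cache_url("http://localhost:9200", "my-index"))
```

Note that this does nothing about the operating system's file system cache mentioned earlier in the thread, so repeated runs can still be faster than a truly cold one.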

Aurel_Drejta · June 16, 2021, 8:39am

Here is another useful tip when facing these kinds of problems.

For each shard in an index, Elasticsearch uses one thread to search that shard.
That means that if I have 4 vCPUs in a node, I can only search 4 shards in parallel.
See here

So if you have a monster index with 5 primary shards on a node with many more vCPUs (16-32), a single search isn't actually using those vCPUs, since no more than 5 threads will be used to search the shards of that index.

Increasing the number of primary shards will make better use of those vCPUs, since Elasticsearch uses one thread per shard and the shards are searched in parallel.

So that is how more vCPUs help.
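The reasoning above can be written down as a tiny back-of-the-envelope model. This is only a sketch of the "one thread per shard" rule of thumb for a single query on a single node; it ignores replicas, concurrent queries, and how Elasticsearch actually sizes its search thread pool. The function names are made up for illustration.

```python
def concurrent_shard_searches(primary_shards: int, vcpus: int) -> int:
    """Shards of one index that a single query can search at the same time."""
    return min(primary_shards, vcpus)


def idle_vcpus(primary_shards: int, vcpus: int) -> int:
    """vCPUs left unused by a single query under the same simplified model."""
    return max(vcpus - primary_shards, 0)


# The 5-shard index on a 16-vCPU node from the post above.
print(concurrent_shard_searches(5, 16), idle_vcpus(5, 16))
# The 4-vCPU node: more shards than cores, so the cores are the limit.
print(concurrent_shard_searches(8, 4), idle_vcpus(8, 4))
```

Under this model, adding vCPUs only helps a single query once the index has at least as many shards as cores.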

A faster CPU helps because the search within each shard completes faster.

So if a 2.0 GHz CPU takes 800 ms to search a shard, a 3.0 GHz CPU will be much faster at searching that shard (~200-300 ms).

More RAM seems to help because Elasticsearch takes up some RAM for every index and every shard that is created (how much, I don't really know).

This guide on shard size helped me to better manage CPU and RAM requirements.

Aurel_Drejta · June 16, 2021, 12:33pm

As for highlighting, there are cases where it actually will slow down searches.

Two things seem to help with this:

The first is setting "index_options": "offsets" on the field you are going to highlight, or "term_vector": "with_positions_offsets" (both of these make your index consume more space).

And the second is using the fast vector highlighter.
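As a rough sketch of what those two settings look like together, the snippet below builds an index mapping and a search body as plain Python dicts and prints them as JSON. The field name `content` and the search text are placeholders, not taken from the thread. The fast vector highlighter (`"type": "fvh"`) needs the field mapped with `"term_vector": "with_positions_offsets"`; `"index_options": "offsets"` is the alternative that other highlighters can use.

```python
import json

# Hypothetical field name, used only for illustration.
FIELD = "content"

# Body for index creation: store term vectors so the fast vector highlighter can use them.
index_body = {
    "mappings": {
        "properties": {
            FIELD: {
                "type": "text",
                "term_vector": "with_positions_offsets",
            }
        }
    }
}

# Body for a _search request that asks for the fast vector highlighter on that field.
search_body = {
    "query": {"match": {FIELD: "example search text"}},
    "highlight": {"fields": {FIELD: {"type": "fvh"}}},
}

print(json.dumps(index_body, indent=2))
print(json.dumps(search_body, indent=2))
```

Changing `term_vector` on an existing field generally means reindexing, so it is worth trying on a copy of the index first.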

system · July 14, 2021, 12:34pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
