Cold data node search performance

We are wondering regarding our search performance in our cold data nodes.
We have a use case of searching a week which is built of 7 indexes (index per day) each index has 38 shards 22GB each. The query timeout after 60 seconds. During these 60 seconds, the cold nodes reached a load of 13-15.

I have a few questions regarding best practices:

  1. Usually, the best practice is to allocate 50% of the memory to java heap. Is that also correct for cold data nodes that are not indexing any data?
  2. Will it be better to use machines that are memory optimize meaning fewer cores more RAM ratio??
  3. Is it better to use many small machines of a few large ones? e.g., 8 machines 8 cores 32GB mem or 32 machines with 2 cores 8GB (or 16GB according to the answer to the previous question)
  4. What is the max ratio between memory to the number of indexes/shards or data on disk
  5. Any other tips regarding cold search performance will be appreciated :slight_smile: .

Cluster info:
elasticsearch 6.2.3 cluster consists of:

  • 3 master nodes (4 cores / 14gb memory / 7gb heap)
  • 3 client nodes (same setup)
  • 40 hot data nodes (8 cores / 64gb memory / 30.5gb heap and 1.4tb local ssd disks)
  • 8 cold data nodes (8 cores / 32gb memory / 16gb heap and 5tb spinning disks).

Cluster contains ~14,000 primary shards (25,000 total active shards) spread across ~8,900 indexes
Currunlty in use
28TB in hot storage(SDD)
25TB in cold storage(HDD)

Thanks for your help!

What was limiting the performance of the cold nodes during this query? CPU? I/O? GC?

@Christian_Dahlqvist cold on peak:

  • CPU - 78% - 8 cores
  • RAM - 61%
  • Load avg - 13.4
  • 1sec - disk write

What do you mean with this? Did you see any increase in iowait while the query was running?

@Christian_Dahlqvist When a large search query is running, we see that the io usage time is around 1s


Not sure how to interpret that, but it looks like you may be limited by disk performance. Can you run iostat -x while the query is running?


Hi @Christian_Dahlqvist any thoughts\sugestions on the flowing:


The amount of data you can hold on a node is often driven by heap usage and how well you can minimize overhead, so for cold nodes I would go for as large heap as possible (~31GB). If you have more than 64GB on the node, the additional page cache can help improve performance, but it is hard to quantify this as it likely depends on your query patterns.

This depends on what is limiting performance. Often it can be the performance of your storage, so it is hard to tell what effect different node types would have.

No, I don't think so.

For cold nodes you typically want to minimize shard and index overhead, which means that you generally want quite large shards that you have run force merge on. Exactly how much you can hold on a node depends on how much heap overhead you have per shard and how much heap you need to have free for querying.

  • Optimise mappings to reduce field data usage
  • Ensure you have as large shards as possible that have been force merged. Larger shards have less overhead per data volume, but can result in slower queries, so benchmarking this and adjusting the size based on query latency requirements is important.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.