Requesting feedback on benchmark results: Hot vs Frozen(Cache) vs Frozen(No-Cache)

dipayans · September 23, 2025, 1:34pm

Hi everyone,

I recently ran a set of benchmark tests to evaluate performance across Hot tier and Frozen tier searchable snapshots (with and without cache). I’d really appreciate feedback from the community to understand if the results I’m seeing are in line with expectations, or if I might have missed something in my setup.

Lab Setup:

Cluster: 2 data nodes
Node Specs: 8 vCPU, 32 GB RAM, 10 TB SSD each
Network: 100 GbE
Tiers tested:
1. Hot tier (indices on SSD)
2. Frozen tier with cache (SSD-based shared cache)
3. Frozen tier without cache (direct reads from object storage)
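
The post does not say how the frozen indices were mounted, so the following is only a minimal sketch of what a mount request for a partially mounted (shared cache) index could look like, using Elasticsearch's searchable snapshot mount API (`POST /_snapshot/<repository>/<snapshot>/_mount`). The repository and snapshot names are hypothetical, and the request is only built here, not sent to a cluster.

```python
import json


def build_mount_request(repository, snapshot, index, storage="shared_cache"):
    """Build the pieces of a searchable snapshot mount request.

    storage="shared_cache" asks for a partially mounted index (frozen tier);
    storage="full_copy" asks for a fully mounted index.
    """
    if storage not in ("full_copy", "shared_cache"):
        raise ValueError(f"unsupported storage option: {storage}")
    return {
        "method": "POST",
        "path": f"/_snapshot/{repository}/{snapshot}/_mount",
        "params": {"storage": storage, "wait_for_completion": "true"},
        "body": {"index": index},
    }


if __name__ == "__main__":
    # "my_repository" and "nyc_taxis_snapshot" are hypothetical names,
    # not taken from the benchmark setup described in this post.
    request = build_mount_request("my_repository", "nyc_taxis_snapshot", "nyc_taxis")
    print(request["method"], request["path"], request["params"])
    print(json.dumps(request["body"]))
```

Sharing the exact mount request and cache settings alongside the results would make it easier for others to compare their numbers.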

Benchmark Workload:

Dataset: nyc_taxis track (via Rally)
Queries: mix of search and aggregations (default search, range queries, histograms, etc.)

Results (summary):

| Metric Category | Sub-task | Hot Throughput (ops/s) | Frozen No Cache Throughput (ops/s) | Frozen Cache Throughput (ops/s) | Hot Median Latency (ms) | Frozen No Cache Median Latency (ms) | Frozen Cache Median Latency (ms) | Hot 99th Percentile Latency (ms) | Frozen No Cache 99th Percentile Latency (ms) | Frozen Cache 99th Percentile Latency (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Default Query | default | 3.03 | 3.03 | 3.03 | 6.08 | 7.65 | 7.09 | 33 | 35.9 | 34.98 |
| Default (Larger Result Size) | default_1k | 3.03 | 3.03 | 3.03 | 9.58 | 9.76 | 9.71 | 12.17 | 14.5 | 13.17 |
| Range Query | range | 16.8 | 1.18 | 2.29 | 56.4 | 407.7 | 412.8 | 80.7 | 472 | 523.9 |
| Aggregation – Histogram | date_histogram_agg | 81.39 | 194.26 | 209.17 | 11.86 | 4.57 | 4.22 | 15.7 | 7.78 | 6.33 |
| Aggregation – Auto-histogram | autohisto_agg | 6.13 | 0.45 | 1.66 | 160.85 | 598.3 | 601.98 | 183.07 | 692 | 646.25 |
| Aggregation – Distance / Complex | distance_amount_agg | 0.32 | 0.06 | 0.07 | 3109 | 14900 | 14984 | 3250 | 15200 | 15295 |
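
To make the gap easier to read, the median latencies from the table can be turned into slowdown factors relative to Hot. This is a small sketch that only reuses the numbers reported above; the function and variable names are illustrative.

```python
# Median latency in ms, copied from the results table:
# (hot, frozen_no_cache, frozen_cache)
MEDIAN_LATENCY_MS = {
    "default": (6.08, 7.65, 7.09),
    "default_1k": (9.58, 9.76, 9.71),
    "range": (56.4, 407.7, 412.8),
    "date_histogram_agg": (11.86, 4.57, 4.22),
    "autohisto_agg": (160.85, 598.3, 601.98),
    "distance_amount_agg": (3109, 14900, 14984),
}


def slowdown_vs_hot(latencies):
    """Return {task: (no_cache_ratio, cache_ratio)} with ratio = frozen / hot.

    A ratio above 1 means the frozen variant is slower than Hot;
    a ratio below 1 means it is faster.
    """
    return {
        task: (round(no_cache / hot, 2), round(cache / hot, 2))
        for task, (hot, no_cache, cache) in latencies.items()
    }


if __name__ == "__main__":
    for task, (no_cache, cache) in slowdown_vs_hot(MEDIAN_LATENCY_MS).items():
        print(f"{task:22s} no-cache x{no_cache:<6} cache x{cache}")
```

A ratio well above 1 for both frozen variants, with little difference between them, would point at overhead that the cache does not remove.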

My observations:

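Hot tier delivers the best throughput and lowest latency on the range, auto-histogram, and distance aggregation tasks; the two default queries show identical throughput (3.03 ops/s) and similar latency on all three tiers.
Frozen(Cache) avoids object storage reads (everything seems served from local SSD cache), but still performs closer to Frozen(No-Cache) than to Hot: for the range query, median latency is 412.8 ms with cache and 407.7 ms without, versus 56.4 ms on Hot.
This suggests that even with the SSD cache, there’s noticeable overhead in the Frozen/searchable snapshots layer.
The date histogram aggregation was an exception: both Frozen variants outperformed Hot (209.17 ops/s with cache and 194.26 ops/s without, versus 81.39 ops/s on Hot).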
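
As a sanity check on these observations, the throughput columns can be classified per sub-task. The sketch below only uses the numbers from the results table. Identical throughput on all three tiers could mean the task was rate-limited by the workload rather than by the tier, but that is an assumption and has not been verified against the track configuration.

```python
# Throughput in ops/s, copied from the results table:
# (hot, frozen_no_cache, frozen_cache)
THROUGHPUT_OPS = {
    "default": (3.03, 3.03, 3.03),
    "default_1k": (3.03, 3.03, 3.03),
    "range": (16.8, 1.18, 2.29),
    "date_histogram_agg": (81.39, 194.26, 209.17),
    "autohisto_agg": (6.13, 0.45, 1.66),
    "distance_amount_agg": (0.32, 0.06, 0.07),
}


def classify(throughput):
    """Label each sub-task by how Frozen(Cache) throughput compares to Hot."""
    labels = {}
    for task, (hot, no_cache, cache) in throughput.items():
        if hot == no_cache == cache:
            # Same rate everywhere: possibly throttled (assumption, see above).
            labels[task] = "identical"
        elif cache > hot:
            labels[task] = "frozen_cache_faster"
        else:
            labels[task] = "hot_faster"
    return labels


if __name__ == "__main__":
    for task, label in classify(THROUGHPUT_OPS).items():
        print(f"{task:22s} {label}")
```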

My questions to the community:

Based on my hardware (8 vCPU, 32 GB RAM, SSD, 100 GbE network), do these results look reasonable?
Is this the typical performance gap others have observed between Hot and Frozen(Cache)?
Are there tuning parameters or best practices that could help Frozen(Cache) performance approach Hot-tier levels, especially since both run on SSD?

Looking forward to your feedback and validation from others who’ve benchmarked searchable snapshots.

Thanks!
