How does a hot/warm architecture enhance query performance?

In our current architecture we do not have hot/warm nodes, so all open indices are distributed almost equally across all data nodes. All our queries specify multiple indices explicitly; we don't use wildcards.

In our new cluster we are planning to use a hot/warm architecture with dedicated hot nodes, where we will keep around 10 of the most recent indices. The rest of the indices would be on warm nodes.

Now the question:

How would a hot/warm architecture, or having only a limited number of indices on the hot nodes, improve

  1. Query performance
  2. Indexing performance
  3. Overall cluster health

Cluster stats
Data use case - significantly high reads/writes on approximately the 7 most recent indices, significantly fewer reads on older indices, and even fewer writes on indices older than those 7.

Read traffic - 65,000 requests per minute during peak
Write traffic - 1,500 requests per minute
Total indices - 90

Old cluster = 21 data nodes, 3 client+master nodes
Elasticsearch version 1.5
index size - 30 GB each
no. of shards - 4
replication factor - 1
Recent indices have an increased replication factor of 3 to speed up read searches.

New cluster being built = 3 master + 3 client + data [x hot + y warm] nodes
Elasticsearch version 6.2
node types - hot/warm
index size - 30 GB
no. of shards - 4

Are you using time-based indices? How long time period does each of these cover? How many different types of indices do you have?

What do your query patterns look like? Are you primarily querying recent data, or do queries typically address longer time periods? How are you querying the data?

They are based on the modulus of ids [the ids are auto-increment ids from a MySQL database],
for example DASHBOARD-[id%50000000], so each index holds at most 5 crore (50 million) documents. Ids are growing at a rate of around 60-70 lakh (6-7 million) per day, so we are creating a new index every 10 days or so.
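
As a rough illustration of the naming scheme above, here is a minimal Python sketch. It assumes the bracketed expression means integer division of the id by 50,000,000 (a literal modulus would scatter consecutive ids instead of filling one index at a time), and the function names are hypothetical. The prefix is lowercased because Elasticsearch index names must be lowercase.

```python
DOCS_PER_INDEX = 50_000_000  # 5 crore ids per index, taken from the post


def index_for_id(doc_id: int, prefix: str = "dashboard") -> str:
    """Return the index name for a MySQL auto-increment id.

    Assumption: ids are bucketed by integer division, so ids
    0..49,999,999 land in bucket 0, the next 50 million in bucket 1, etc.
    """
    return f"{prefix}-{doc_id // DOCS_PER_INDEX}"


def days_per_index(ids_per_day: int) -> float:
    """Days needed to fill one index at a given id growth rate."""
    return DOCS_PER_INDEX / ids_per_day
```

At 6 to 7 million ids per day this works out to roughly 7 to 8 days per index, which is in the same range as the "every 10 days or so" figure.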

The read trend is as follows: most queries hit the recent indices, and the count decays as the indices get older. The chart shows the index usage pattern for queries, with request count on the Y axis and index names on the X axis.

The write trend: 90% of writes go to the 10 most recent indices, with very few writes to earlier indices.

Read queries generally work sequentially, starting from the latest indices and working back towards the earlier ones, but the number of queries on earlier indices reduces with time.

I hope what I've written clarifies things.

If I understand correctly you are inserting mostly to the last 10 indices, which cover approximately 100 days (10 days each). Each index has 4 shards and you grow by about 3GB per day (30GB across the 10 days an index covers). Since you have 90 indices in the cluster, this covers approximately 900 days worth of data. Is that correct?
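
The arithmetic behind these figures can be checked with a short Python sketch. It assumes the 30 GB index size refers to primary data only and uses the base replication factor of 1 for every index, so the shard count is a lower bound (the recent indices carry 3 replicas).

```python
INDICES = 90
DAYS_PER_INDEX = 10
INDEX_SIZE_GB = 30      # assumed to be primary data only
PRIMARY_SHARDS = 4
REPLICAS = 1            # base replication factor from the cluster stats

days_covered = INDICES * DAYS_PER_INDEX                   # 900 days
growth_gb_per_day = INDEX_SIZE_GB / DAYS_PER_INDEX        # 3 GB per day
primary_data_gb = INDICES * INDEX_SIZE_GB                 # 2,700 GB of primaries
shard_size_gb = INDEX_SIZE_GB / PRIMARY_SHARDS            # 7.5 GB per shard
total_shards = INDICES * PRIMARY_SHARDS * (1 + REPLICAS)  # 720 shard copies
```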

Also, what has prompted the change in architecture? What problems in the current cluster are you trying to address with this change?

Also please use units that are widely recognised. I have no idea what a lakh or a crore is, so had to look it up.

> If I understand correctly you are inserting mostly to the last 10 indices, which cover approximately 100 days (10 days each). Each index has 4 shards and you grow by about 3GB per day (30GB across the 10 days an index covers). Since you have 90 indices in the cluster, this covers approximately 900 days worth of data. Is that correct?

Yes to all of the above points: 900 days' worth of data, and increasing.

> Also, what has prompted the change in architecture? What problems in the current cluster are you trying to address with this change?

The current Elasticsearch cluster is running version 1.5, and we are upgrading to a new cluster running version 6.2. We believe the version upgrade will have a manifold benefit for cluster health and performance. It should also help with resource optimisation, as we are running 24 instances in the old cluster.

By segregating writes onto the hot instances, we plan to keep reads on old indices, and the overhead of keeping old indices open, away from the hot instances. This rests on the assumption that each Lucene index [shard] takes up some resources just to stay open. It would dedicate the hot instances to new/recent indices.

We are trying to use the hot/warm architecture to optimise for the case of fewer reads on old indices. We are planning to use magnetic disks on cold, or even warm, instances, and provisioned SSD gp2 disks for the hot Elasticsearch instances.

Currently we have closed very old indices on the existing 1.5 cluster. We are planning to keep them open on the new 6.2 cluster.
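
A minimal sketch of how the "10 recent indices on hot, the rest on warm" plan could be expressed with shard allocation filtering. The attribute name `box_type` is only a common convention and an assumption here; it presumes each data node is started with `node.attr.box_type` set to `hot` or `warm` in elasticsearch.yml. The function name is hypothetical, and the sketch only builds settings bodies without sending anything to a cluster.

```python
from typing import Dict, List

HOT_INDEX_COUNT = 10  # "around 10 of recent indices" stay on hot nodes


def allocation_settings(indices_newest_first: List[str],
                        attr: str = "box_type") -> Dict[str, dict]:
    """Map each index name to a settings body that pins it to a tier.

    Each body is meant for PUT /<index>/_settings. The attribute name
    is an assumption and must match the node.attr.* value on the nodes.
    """
    plan = {}
    for position, name in enumerate(indices_newest_first):
        tier = "hot" if position < HOT_INDEX_COUNT else "warm"
        plan[name] = {f"index.routing.allocation.require.{attr}": tier}
    return plan
```

When a new index is created, re-running the plan moves the eleventh-newest index to the warm nodes; Elasticsearch relocates the shards once the setting changes.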

> Also please use units that are widely recognised. I have no idea what a lakh or a crore is, so had to look it up.

My bad for using this system. Will definitely keep it in mind going forward.

Given that your indexing rate is not very high, this is probably more of a Warm/Cold architecture than a Hot/Warm one. The principles are the same though as you have dedicated types of data nodes. As it is not really a typical hot/warm use case, it is difficult to know exactly what the impact will be, so you probably need to benchmark it. I can give some general guidelines though.

When you transition, make sure that you revisit your mappings and make use of doc-values to as great extent as possible as this will save heap, especially on the cold nodes. If you want to hold as much data as possible on the cold nodes, make sure that you have large shards. Consider using the shrink index API to reduce the shard count of your indices and force merging them before you move them to the cold tier.
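
The shrink, force merge, and move sequence could look roughly like the Python sketch below. It only builds the REST requests as (method, path, body) tuples and does not send them. The `-shrunk` target name, the `box_type` attribute, and the function name are assumptions for illustration, and the exact request bodies should be checked against the documentation for your Elasticsearch version.

```python
from typing import List, Optional, Tuple

Request = Tuple[str, str, Optional[dict]]


def cold_tier_requests(index: str, shrink_node: str,
                       source_shards: int = 4, target_shards: int = 1,
                       tier: str = "warm") -> List[Request]:
    """Build the requests to shrink, force merge, then relocate an index."""
    # The shrink API requires the target count to be a factor of the source.
    if target_shards <= 0 or source_shards % target_shards != 0:
        raise ValueError("target_shards must be a factor of source_shards")
    target = f"{index}-shrunk"  # hypothetical naming convention
    return [
        # 1. Block writes and gather a copy of every shard on one node.
        ("PUT", f"/{index}/_settings", {
            "index.blocks.write": True,
            "index.routing.allocation.require._name": shrink_node,
        }),
        # 2. Shrink into a new index with fewer, larger shards.
        ("POST", f"/{index}/_shrink/{target}", {
            "settings": {
                "index.number_of_shards": target_shards,
                "index.number_of_replicas": 1,
            }
        }),
        # 3. Merge each shard down to a single segment.
        ("POST", f"/{target}/_forcemerge?max_num_segments=1", None),
        # 4. Drop the single-node pin and move the result to the chosen tier.
        ("PUT", f"/{target}/_settings", {
            "index.routing.allocation.require._name": None,
            "index.routing.allocation.require.box_type": tier,
        }),
    ]
```

Force merging before the move follows the order suggested above, so that only the merged segments are copied to the slower nodes.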

No problem. Makes it easier to read the issue and causes less confusion.

> When you transition, make sure that you revisit your mappings and make use of doc-values to as great extent as possible as this will save heap, especially on the cold nodes.

Can you please point me to a source I can refer to for this? I'm not able to understand the phrase "make use of doc-values to as great extent as possible as this will save heap".

I have a question:
Will having fewer indices on the hot nodes improve query performance on the hot indices [those residing on the hot nodes]? If yes, what is the reason for that?

[My guess is fewer indices = fewer resources used = more resources available for the hot indices.]
Is my assumption correct?
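
For readers with the same question: doc values are an on-disk, column-oriented structure used for sorting and aggregations, read through the filesystem cache instead of being loaded onto the JVM heap as fielddata. In 1.x they had to be enabled per field, while from 2.0 onwards they are on by default for most field types except analyzed text. The sketch below shows a 6.x-style mapping and a hypothetical helper that flags fields that cannot use doc values; all field names are made up for illustration.

```python
from typing import Dict, List


def example_mapping() -> Dict:
    """A 6.x-style mapping body; every field name here is hypothetical."""
    return {
        "mappings": {
            "_doc": {
                "properties": {
                    "user_id": {"type": "keyword"},   # doc values on by default
                    "created_at": {"type": "date"},   # doc values on by default
                    "title": {"type": "text"},        # text fields have no doc values
                    "raw_payload": {"type": "keyword", "doc_values": False},
                }
            }
        }
    }


def fields_without_doc_values(mapping: Dict, doc_type: str = "_doc") -> List[str]:
    """List fields that would need heap-based fielddata to sort or aggregate."""
    props = mapping["mappings"][doc_type]["properties"]
    flagged = [
        name for name, spec in props.items()
        if spec.get("type") == "text" or spec.get("doc_values") is False
    ]
    return sorted(flagged)
```

Reviewing old 1.5 mappings this way before reindexing helps spot fields that were relying on heap-resident fielddata.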

It all depends on what is limiting your performance with the current architecture. Less data on a node means a larger share of it can be cached, but to what extent this benefits you will depend on your query patterns. I therefore cannot tell you how much this will improve performance. This is why I recommend you benchmark it with as realistic a load as possible to find out for sure.

Thanks a lot for helping out, @Christian_Dahlqvist. Will be benchmarking the cluster.

For those who are also looking into this topic: after reading a bit, I found the "Tune for search speed" page in the Elasticsearch Guide [8.11], which mentions that

> Usually, the setup that has fewer shards per node in total will perform better. The reason for that is that it gives a greater share of the available filesystem cache to each shard, and the filesystem cache is probably Elasticsearch’s number 1 performance factor.
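
A back-of-the-envelope way to see the effect described in that quote, as a Python sketch. It assumes the filesystem cache is roughly the node's RAM minus the JVM heap and that shards share it evenly, which is a simplification (the OS caches whatever is actually read). The function name and the node sizes in the usage note are hypothetical.

```python
def cache_per_shard_gb(node_ram_gb: float, heap_gb: float,
                       shards_on_node: int) -> float:
    """Approximate filesystem cache available per shard on one node."""
    if shards_on_node <= 0:
        raise ValueError("shards_on_node must be positive")
    if heap_gb >= node_ram_gb:
        raise ValueError("heap must be smaller than total RAM")
    return (node_ram_gb - heap_gb) / shards_on_node
```

For example, a node with 64 GB RAM and a 30 GB heap has about 34 GB of cache: spread over 68 shards that is 0.5 GB per shard, but over 17 shards it is 2 GB per shard.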

This can be a supporting argument for the practicality of a hot/warm architecture.