Query response time when search query hits across HOT and WARM phase in ILM

Hi Team,

We have to implement HOT, WARM and COLD phases with ILM as part of our requirements, where the HOT phase nodes are on SSDs and the WARM and COLD nodes are not.

Most searches will target recent documents and will usually hit the HOT phase.
I am hoping to get quick search response times for the HOT phase due to the underlying hardware.

But if a query spans both the HOT and WARM phases, will the response time be equivalent to that of a query fired only against the WARM phase, or will performance be slightly faster since the data from the HOT phase would return sooner?

Any help would be appreciated.

All shards need to return results before the response to the query is sent, so latency will be driven by the slowest shard.

Note that although the HOT nodes do have much higher IOPS, they generally also handle all indexing, which is IO intensive. As WARM nodes do not handle indexing, they can use most of their available IOPS to serve queries, so the difference in latency may not be as great as you might think, although this does depend on what type of storage you are using on the WARM nodes.
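The "slowest shard" point can be illustrated with a toy model (the shard counts and per-shard latencies below are made-up numbers for illustration, not measurements):

```python
# Toy model: a search waits for every shard it touches, so overall
# latency is the maximum of the per-shard latencies, not the average.

def query_latency_ms(shard_latencies_ms):
    """The coordinating node can only respond once all shards report."""
    return max(shard_latencies_ms)

# Hypothetical per-shard latencies: fast SSD-backed hot shards,
# slower spinning-disk warm shards.
hot_shards = [12, 15, 11]      # ms
warm_shards = [140, 180, 155]  # ms

print(query_latency_ms(hot_shards))                # hot-only query  -> 15
print(query_latency_ms(hot_shards + warm_shards))  # cross-tier query -> 180
```

In this model a cross-tier query is just as slow as a warm-only query; the fast hot shards finishing early does not help the overall response time.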

I believe it is technically possible to write to a warm node via index-level shard allocation filtering.

It is technically possible, but the whole point of a hot-warm architecture is generally to avoid that.

This is (at best) a confusing thing to post in this context.

Index shard allocation filtering has nothing to do with which index receives writes.


Expanding on the response

All shards need to return results before the response to the query is sent, so latency will be driven by the slowest shard.

a bit more. This covers the basics of how Elasticsearch executes a search.

One thing to note that isn't covered in the above document is that Elasticsearch does support asynchronous searches, and if you are concerned about getting something back in a timely manner and are fine with initially returning partial results, this might be a valid option.
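For reference, an async search can be submitted like this (the index pattern and timeout are illustrative). If the search has not finished within `wait_for_completion_timeout`, the response contains whatever partial results are available plus an `id` that can later be polled with `GET /_async_search/<id>`:

```
POST /my-index-*/_async_search?wait_for_completion_timeout=500ms
{
  "query": {
    "range": { "@timestamp": { "gte": "now-7d" } }
  }
}
```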

Thanks David. I believe a shard allocation filter may be leveraged to write directly to the warm tier (as long as the index is open). If I'm wrong (and I'm more than willing to learn and stand corrected), please offer a little more substance than subjective corrections. The community and I can learn from you about why my response is incorrect so we avoid solution mistakes. When I read in this post that "warm nodes do not handle indexing", I pointed out that it is technically possible.

Shard allocation filtering determines where shards are located, not whether they are writable. In a hot-warm architecture the aim is generally to have all the IO-intensive indexing load occur on the hot nodes, which typically have significantly faster disks and can handle large amounts of writes in addition to queries against the most recent data. Warm nodes tend to hold more data on slower disks, and the aim is to have as much as possible of their likely more limited IOPS serve reads. Performing indexing on warm nodes can generate a lot of IO and quickly affect query latencies.
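As a concrete sketch of that distinction (assuming the warm nodes are tagged with a custom node attribute such as `node.attr.data: warm`, and `my-index` is a made-up index name), the following moves an index's shards onto the warm nodes, but the index itself stays open and writable:

```
PUT /my-index/_settings
{
  "index.routing.allocation.require.data": "warm"
}
```

The filter only controls shard placement; any writes sent to this index will now land on the warm nodes' disks.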

@Sunile_Manjee My statement that warm nodes do not handle indexing should therefore probably instead read warm nodes should not handle indexing.

For those with curious minds, you may find this interesting. ChatGPT response:

@Sunile_Manjee What exactly does that add to the discussion? Please enlighten me.

As far as I can see, based on my experience, that response is both ambiguous and misleading. I would recommend you refer to the official documentation when learning how the Elastic Stack works rather than use ChatGPT. It may require a bit more reading and effort, but it is well worth it.

Shard allocation filtering is a mechanism to control where shards get allocated based on node attributes and index settings. As far as I know, it cannot take disk space usage into account as ChatGPT suggests, nor can it by itself determine whether indices can be written to or not.

Index lifecycle management is the feature used to control how indices are managed over time. Behind the scenes it can use shard allocation filtering to relocate indices from one set of nodes to another in a hot-warm-cold architecture at specified points in the lifecycle. It also supports changing settings and performing operations on indices at these transition points, e.g. shrinking or force merging. Some of these actions are common (but not necessary) to perform when a newly created index moves from the hot zone towards the cold, and they may result in the index becoming read-only. If no such actions are specified, the index will technically be writable all the way until it is deleted.
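As an illustration (the policy name, ages, sizes and node attribute values here are made up for the example), an ILM policy combining these pieces might look like:

```
PUT _ilm/policy/my-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "7d", "max_primary_shard_size": "50gb" }
        }
      },
      "warm": {
        "min_age": "30d",
        "actions": {
          "allocate": { "require": { "data": "warm" } },
          "readonly": {},
          "forcemerge": { "max_num_segments": 1 }
        }
      },
      "cold": {
        "min_age": "90d",
        "actions": {
          "allocate": { "require": { "data": "cold" } }
        }
      },
      "delete": {
        "min_age": "365d",
        "actions": { "delete": {} }
      }
    }
  }
}
```

Here the `allocate` action is what drives the shard allocation filtering behind the scenes, while `readonly` and `forcemerge` are the optional actions that make the index read-only once it leaves the hot phase; leave them out and the index stays writable.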

In order to get the most out of a cluster with a hot-warm-cold architecture, where the nodes in the different zones have specialised hardware profiles, it is, as I stated above, desirable to try to keep all indexing load on the more powerful hot nodes.


Whether data should be written to the warm tier or not is use case specific. I believe we all agree writing directly to warm tier is possible.

My only reason for chiming in was to call out the fact that it is possible to write to warm nodes by placing shards on them via a shard allocation filter. I'm researching/verifying on my side whether that statement is accurate.

There are a few blog posts around hot-warm architectures, and although some of them are quite old I believe they are still accurate as far as the description of node roles goes. If anything has changed since these were written, it would be useful to have a note on them pointing this out.

@Sunile_Manjee I would say that best practice is not to write to or update indices in the warm or cold tiers, and the blog posts seem to point in that direction as well. There are a lot of people reading these forums looking for guidance, so I believe it is important to be clear on what is considered best practice and what is not. Did you check internally whether the guidance in the blog posts is still accurate or whether best practices have changed in more recent versions?