Hi. I set up my cluster with only i3.2xlarge nodes and am considering changing to a hot-warm architecture.
With all i3.2xlarge nodes, I don't have any big issues searching documents.
But I'm worried about the search performance of d2 instances because they use spinning disks.
Is the search performance of a d2 instance the same as that of an i3 instance if the d2 instances aren't indexing?
As an example of my indices: each day there are 150M documents, 400GB (primary shards) for the APM transaction indices and 700M documents, 300GB (primary shards) for the APM span indices.
Each index has 20 primary shards.
My Elasticsearch nodes' CPU utilization is usually 20%~40%.
It would be really helpful if you could share your experience with hot-warm architecture.
In a hot-warm architecture it is often assumed that the most recent data is also the most frequently queried. Once data migrates over to warm nodes after a few days, there are fewer searches on it. This, combined with the fact that warm nodes do not handle any indexing, allows them to serve queries efficiently and with good latency (as they are considerably less busy as well), even if the storage does not have the same I/O performance as on the hot nodes.
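As a rough illustration of how an index is usually moved to warm nodes, the sketch below builds the settings body for index-level shard allocation filtering. It assumes the warm nodes were started with a custom node attribute (here `box_type: warm`, a common convention but not something stated in this thread); the index name is hypothetical as well. The script only prints the request and does not contact a cluster.

```python
import json


def warm_allocation_settings(attr_name="box_type", warm_value="warm"):
    """Build an index settings body that requires shards to sit on warm nodes.

    attr_name and warm_value are assumptions: they must match whatever
    custom attribute the warm nodes were actually configured with.
    """
    return {f"index.routing.allocation.require.{attr_name}": warm_value}


# Hypothetical daily index name, used for illustration only.
index_name = "apm-transaction-2019.06.01"

body = warm_allocation_settings()

# Print the request that would be sent to the index settings API.
print(f"PUT /{index_name}/_settings")
print(json.dumps(body, indent=2))
```

Once such a setting is applied, Elasticsearch relocates the index's shards to nodes carrying the matching attribute, so the move can be scheduled for indices older than a few days.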
d2 instances are commonly used as warm nodes, but whether d2 instances provide enough performance for your particular use case or not will depend on your data as well as your query patterns, so your best bet is to test.
Having 20 primary shards for 300-400GB of data sounds a bit excessive, so if you have a reasonably long retention period in your cluster, it might be a good idea to decrease this a bit so you get fewer and larger shards.
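To put numbers on this, here is a small sketch using the daily sizes from the question. The 40GB target shard size is only an assumption for illustration, not a recommendation from this thread; the right target depends on your data and queries.

```python
import math


def avg_shard_size_gb(primary_store_gb, primary_shards):
    """Average size of one primary shard in GB."""
    return primary_store_gb / primary_shards


def shards_for_target(primary_store_gb, target_shard_gb):
    """Smallest primary shard count that keeps shards at or below the target size."""
    return max(1, math.ceil(primary_store_gb / target_shard_gb))


# Daily primary store sizes (GB) taken from the question.
daily_indices = {"apm-transaction": 400, "apm-span": 300}

CURRENT_PRIMARY_SHARDS = 20
TARGET_SHARD_GB = 40  # assumed target, adjust after testing

for name, size_gb in daily_indices.items():
    current = avg_shard_size_gb(size_gb, CURRENT_PRIMARY_SHARDS)
    suggested = shards_for_target(size_gb, TARGET_SHARD_GB)
    print(f"{name}: {current:.0f}GB per shard now, "
          f"{suggested} primaries at ~{TARGET_SHARD_GB}GB each")
```

With 20 primaries the shards average about 20GB and 15GB, so under this assumed target roughly half as many primaries per daily index would be enough.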