We have a cluster that will contain a mix of hosts, some with spinning disks and some with SSDs.
We would like to index only to the hosts with spinning disks, and serve queries only from the hosts with SSDs.
How can we force primary shards to only be on the hosts with spinning disks, and replica shards to only be on hosts with SSDs?
I can see that it's possible to influence where an index's shards go, primary and replica alike, but I can't see a way to place shards differently depending on whether they are primaries or replicas.
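For reference, this is the kind of index-level filtering I mean. It's only a rough sketch: the node attribute name `disk_type` and its value `ssd` are made up for illustration, and the helper just builds the settings body rather than talking to a cluster. As far as I can tell, a setting like this applies to every copy of a shard, primary or replica.

```python
import json


def allocation_filter_settings(attribute: str, value: str) -> dict:
    """Build an index settings body that requires all shard copies of an
    index (primaries and replicas alike) to sit on nodes whose custom
    attribute has the given value."""
    # "disk_type" / "ssd" below are hypothetical names chosen for this example.
    return {f"index.routing.allocation.require.{attribute}": value}


if __name__ == "__main__":
    # This body would be sent to the index's _settings endpoint.
    print(json.dumps(allocation_filter_settings("disk_type", "ssd")))
```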
I realise that, even if it is possible to route shards on this basis, a failure would probably force a primary onto an SSD host and/or a replica onto a spinning disk host, so maybe this is not possible at all. But any ideas are welcome.
I guess we will have to rely on search preferences (e.g. _prefer_nodes:xxx) to direct queries to the SSD nodes, and use shard allocation awareness to distinguish the spinning disk nodes from the SSD nodes. That way the SSD nodes would always hold a complete copy of the index, albeit made up of both primary and replica shards.
Would that work? Are there any other options we can look at?
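To make the idea concrete, here is a minimal sketch of what I have in mind. It assumes the nodes are tagged with a custom attribute (I've called it `disk_type`, with values `spinning` and `ssd`; those names, the index name and the node IDs are all hypothetical), and that each index has at least one replica so a copy of every shard can land on each group. The helpers only build the request body and the search path; nothing is sent to a cluster.

```python
from urllib.parse import quote


def awareness_settings(attribute: str, values: list[str]) -> dict:
    """Build a cluster settings body that turns on shard allocation
    awareness for a custom node attribute, with forced awareness so the
    copies of a shard are spread across the listed attribute values."""
    return {
        "persistent": {
            "cluster.routing.allocation.awareness.attributes": attribute,
            f"cluster.routing.allocation.awareness.force.{attribute}.values": ",".join(values),
        }
    }


def search_path(index: str, node_ids: list[str]) -> str:
    """Build a search path whose preference asks for the query to run on
    the given nodes where possible (here, the SSD nodes)."""
    preference = "_prefer_nodes:" + ",".join(node_ids)
    return f"/{index}/_search?preference={quote(preference, safe=':,')}"


if __name__ == "__main__":
    # Hypothetical attribute values, index name and node IDs.
    print(awareness_settings("disk_type", ["spinning", "ssd"]))
    print(search_path("myindex", ["ssd-node-1", "ssd-node-2"]))
```

Note that this only keeps a full copy of the data on each group of hosts. It does nothing to keep the primaries on the spinning disk hosts, so indexing would still touch both groups.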