I have a large dataset of over 2 million documents, and it keeps growing. I need to update these documents regularly, and the updates are fairly heavy.
My cluster consists of two data nodes (let's call them A and B) and a number of non-data nodes, each running on the same machine as its corresponding web server. Requesting a web page from a web server therefore results in a request to ES on localhost, which is then forwarded to one of the data nodes. Every shard has one replica.
I am considering dedicating data node A to updates and data node B to queries coming from the non-data nodes. I imagine this would make queries run on slightly outdated data, but at least data node B would be free to process them.
Is this a good idea (or even possible)? If so, how can I instruct a non-data node (say, node C) to send its search requests to node B only?