After recently enabling zoning on a production cluster, we have started noticing slower response times for search / scroll requests.
Version : 2.4
Cluster : 18 nodes (12 data 8GB + 3 master 1GB + 3 coordinator 8GB)
Data Volume : ~500 million records - data (not logs)
Data Type : Time series - Daily Indices (Indices older than 2 months are merged into monthly indices)
Indices : 500 Indices / 1000 Shards / 2004 Segments / 620GB Size
Index Settings : 2 shards and one replica (now with zoning enabled we want the replica in the other zone).
Hot-Warm architecture : indexing and searching happen in one zone only; replicas are created in the other zone.
TransportClient is used for both Indexing and Searching
elasticsearch.yml : All settings are default settings.
Indexing throughput was 18K per second without zoning and is now 10K per second. Is this expected for cross-site replication? The sites are connected by a 10 Gbps link, and the initial replication was quite fast.
Search requests that used to take less than a second now take more than 3 seconds. Scrolling through an index shows a similar slowdown.
I have tried the query parameter preference=_only_nodes:zone:ZONEA, which shows a remarkable improvement from the REST client; however, on the bulk runs using the Transport Client it does not seem to have any effect.
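For reference, the REST request I tested can be sketched roughly as below. This is only a minimal sketch: the class name PreferenceUrl, the host, the index name and the scroll timeout are placeholders for illustration, not my actual values. It only builds the URL so that the preference string is encoded correctly; it does not send the request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds a search URL that carries the preference
// parameter and, optionally, a scroll timeout.
public class PreferenceUrl {

    public static String searchUrl(String baseUrl, String index,
                                   String preference, String scroll) {
        try {
            StringBuilder url = new StringBuilder(baseUrl)
                    .append('/').append(index).append("/_search")
                    .append("?preference=")
                    .append(URLEncoder.encode(preference, "UTF-8"));
            if (scroll != null) {
                // scroll is optional; it is only appended for scroll requests
                url.append("&scroll=").append(URLEncoder.encode(scroll, "UTF-8"));
            }
            return url.toString();
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder host and index name (assumptions, not my real cluster)
        System.out.println(searchUrl("http://localhost:9200",
                "my-index-2017.01.01", "_only_nodes:zone:ZONEA", "1m"));
    }
}
```

With the Transport Client I would expect the same preference string to be passed through SearchRequestBuilder.setPreference(...) on the initial search request, but that is the path where I do not see any effect.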
As Discuss is blocked from my org, I could not share the cluster stats. Are there any specific practices / recommendations for Hot-Warm replication using zoning with the preference query parameter? Which preference is most suitable for searching and scrolling, and should I also use it while indexing?