The prior discussions on distributing primary shards focused more on performance and rebalancing.
If the primary shards are not evenly distributed across all data nodes in the cluster, the nodes holding primary shards take more load for routed search queries with preference _primary or _primary_first.
The same applies to routed search queries with preference _replica or _replica_first: those queries only go to the replica shards of the cluster. Either way, I cannot enable those preferences without risking lopsided load on the cluster.
Is there a way to guarantee primary shards are evenly distributed?
The use case is very similar to the original use case for _primary and _primary_first. We have different index readers in various Storm bolts that search the index after it has been updated from web apps (all in different threads). The update-notification-search cycle is close to near real time. Quite a few times we have seen an index that was updated a few seconds earlier (sometimes more than tens of seconds earlier) return search results with the old view.
We are using version 1.7.5 with synchronous replication. The refresh interval is not configured, and I believe the default value is 1 second.
We are in the process of moving to 5.5. At the same time we are looking for ways to deal with this near-real-time requirement. Using _primary and _primary_first is one of the proposed solutions. However, the distribution of load is a concern. When I looked at the shard distribution in the cluster, I noticed the distribution of primaries and replicas was far from uniform. I have seen allocations where some nodes hold only primary shards and other nodes hold only replica shards.
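To put a number on how uneven the allocation is, the primary and replica counts per node can be tallied from the text output of the cat shards API. The following is only a rough sketch: it assumes the output was fetched with `_cat/shards?h=index,shard,prirep,state,node`, that node names contain no spaces, and that only STARTED shards matter. The function name and the sample rows are made up for illustration.

```python
from collections import Counter, defaultdict


def shard_roles_per_node(cat_shards_text):
    """Count started primary ('p') and replica ('r') shards per node.

    Assumes one row per shard copy with the columns:
    index shard prirep state node
    """
    counts = defaultdict(Counter)
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Unassigned rows have no node column; skip them and any
        # shard copy that is not in the STARTED state.
        if len(fields) < 5 or fields[3] != "STARTED":
            continue
        prirep, node = fields[2], fields[4]
        counts[node][prirep] += 1
    # Map each node to a (primaries, replicas) pair.
    return {node: (c["p"], c["r"]) for node, c in counts.items()}


# Hypothetical sample rows, not taken from a real cluster.
SAMPLE = """\
events 0 p STARTED node-a
events 0 r STARTED node-b
events 1 p STARTED node-a
events 1 r STARTED node-b
events 2 p STARTED node-a
events 2 r UNASSIGNED
"""

if __name__ == "__main__":
    for node, (primaries, replicas) in sorted(shard_roles_per_node(SAMPLE).items()):
        print(node, "primaries:", primaries, "replicas:", replicas)
```

A node whose pair is all primaries or all replicas, like both nodes in the sample, is the kind of allocation that would concentrate _primary or _replica traffic.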
I think what you are looking for is the wait_for option on the indexing request. That will guarantee that you will see the docs on the next search. Balancing based on the primary is considered a bug and should almost always be fixed with something else.
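For reference, wait_for here is a value of the `refresh` request parameter, available from Elasticsearch 5.0 onward (so not on 1.7.5). Below is a minimal sketch of how a bulk request carrying it could be assembled. The helper name, index name, and type name are hypothetical, and the sketch only builds the path and body without sending anything.

```python
import json
from urllib.parse import urlencode


def build_bulk_request(index, doc_type, docs, wait_for_refresh=True):
    """Return (path, body) for a _bulk call; docs maps document id -> source."""
    lines = []
    for doc_id, source in docs.items():
        # Each document is an action line followed by its source line.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    body = "\n".join(lines) + "\n"  # the bulk body must end with a newline
    # refresh=wait_for makes the request return only after a refresh has
    # made the changes visible to search.
    params = {"refresh": "wait_for"} if wait_for_refresh else {}
    path = "/_bulk" + ("?" + urlencode(params) if params else "")
    return path, body


if __name__ == "__main__":
    path, body = build_bulk_request("events", "event", {"1": {"status": "updated"}})
    print(path)
    print(body, end="")
```

The trade-off is that the request blocks until the next refresh, so with the default 1 second refresh interval each bulk call can take up to about a second longer to return.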
I have looked at wait_for. But given the nature of streaming traffic in a Storm topology, wait_for is not a good option.
The system is a Staged Event-Driven Architecture. The API layer is a facade. Indexing is done in a Storm topology. Topologies downstream of the indexing topology search the index once they receive the notification that the index has been updated. While indexing is done in batches (bulk index requests), using wait_for increases the latency of the indexing topology and introduces uncertainty into the reliability semantics of the reliable topology.