Different results because of replicas

Here you go:

  • This old article has some background on the DFS search mode. That feature is designed to overcome scoring bias across different shards. However, it won't solve your problem, which is scoring differences between replicas of the same shard.

  • For a detailed discussion of the pros and cons of segment-based replication (copying indexed docs rather than raw docs), see this thread.

  • As you've already discovered, the preference parameter can route each user to the same replica consistently, which gives some stability (though ongoing indexing will still change the order over time).

  • For a deep dive into the data structures in the indexes, see this talk.

  • Mike McCandless's animation of Lucene segment merging (Changing Bits: Visualizing Lucene's segment merges) is an interesting insight.
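
To make the preference point concrete, here is a minimal sketch of how a client might build a search URL that pins a user to a consistent replica. The `preference` query parameter is part of the search API; everything else (the `build_search_url` helper, the `user-` prefix, hashing a session id, the host and index names) is a hypothetical choice for illustration, not something from your setup.

```python
import hashlib
from urllib.parse import urlencode


def build_search_url(base_url: str, index: str, session_id: str) -> str:
    """Build a _search URL whose preference value is stable per session.

    Hypothetical helper: hashing the session id keeps the value opaque,
    and the "user-" prefix ensures it never starts with "_" (which is
    reserved for built-in preference values).
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).hexdigest()[:16]
    query = urlencode({"preference": f"user-{digest}"})
    return f"{base_url.rstrip('/')}/{index}/_search?{query}"


if __name__ == "__main__":
    # Assumed host and index name, for illustration only.
    print(build_search_url("http://localhost:9200", "products", "session-42"))
```

The same session id always yields the same preference string, so repeated searches from one user should land on the same shard copies while they stay available. It does not freeze the ordering: as noted above, ongoing indexing still changes results.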

All in all, distributed search on an actively changing dataset is challenging and requires a number of trade-offs.
