The documentation for ES version 6.8 (Terms Query | Elasticsearch Guide [6.8] | Elastic) alluded to a way to improve performance for terms lookups: run the lookup on a local node that holds a fully replicated copy of the index, as opposed to an index that is sharded and distributed across nodes:
"Also, consider using an index with a single shard and fully replicated across all nodes if the "reference" terms data is not large. The lookup terms filter will prefer to execute the get request on a local node if possible, reducing the need for networking."
This statement was removed in the latest documentation: Terms query | Elasticsearch Guide [7.16] | Elastic.
(PR here: [DOCS] Rewrite `terms` query by jrodewig · Pull Request #42889 · elastic/elasticsearch · GitHub)
Was this statement removed for a reason, or has the recommended configuration changed? If it has changed, what configuration is now required to support this setup? From my understanding of the current configuration and documentation, setting the index's number of primary shards to 1 will ensure all nodes have a fully replicated index to search against, so all terms lookups will be performed against a local copy of the entire index with no need for additional networking.
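To make the setup I have in mind concrete, here is a rough sketch of the request bodies, written as Python dicts. The index name `reference_terms`, the field names, and the document id are placeholders I made up. I have also added `index.auto_expand_replicas: "0-all"`, which is my own assumption about how to get a copy onto every node; it is not mentioned in the passage quoted above, and I don't know whether the lookup still prefers the local copy in 7.x.

```python
import json


def lookup_index_settings():
    """Body for creating the lookup index (e.g. PUT /reference_terms).

    One primary shard, with replicas auto-expanded to every data node.
    The auto_expand_replicas part is my assumption, not from the quoted docs.
    """
    return {
        "settings": {
            "index": {
                "number_of_shards": 1,
                "auto_expand_replicas": "0-all",
            }
        }
    }


def terms_lookup_query(field, lookup_index, doc_id, path):
    """Body for a search that uses a terms lookup (7.x style, no mapping type).

    The terms are read from `path` in document `doc_id` of `lookup_index`
    and matched against `field` in the index being searched.
    """
    return {
        "query": {
            "terms": {
                field: {
                    "index": lookup_index,
                    "id": doc_id,
                    "path": path,
                }
            }
        }
    }


if __name__ == "__main__":
    # Placeholder names, for illustration only.
    print(json.dumps(lookup_index_settings(), indent=2))
    print(json.dumps(
        terms_lookup_query("user_id", "reference_terms", "1", "allowed_ids"),
        indent=2,
    ))
```

The first body would go in the index-creation request for the lookup index, and the second in a `_search` request against the index being filtered.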
Have I understood this correctly, and are there any other configuration recommendations to support indexes containing large sets of terms lookups at scale?
Many thanks in advance!
Rich