I am trying to do a large number of lookups and searches (on the order of a few million) based on document IDs. Instead of sending each request to all shards, I would like to group the document IDs by the shard they belong to and then call lookup/search with preference=_shards:x,y.
I am having trouble mapping a document ID to its shard ID. I have looked at these files:
https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/cluster/routing/OperationRouting.java
https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/cluster/routing/Murmur3HashFunction.java
https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/cluster/metadata/IndexMetaData.java
The relevant method in OperationRouting.java appears to be:

    private static int calculateScaledShardId(IndexMetaData indexMetaData, String effectiveRouting, int partitionOffset) {
        final int hash = Murmur3HashFunction.hash(effectiveRouting) + partitionOffset;
        // we don't use IMD#getNumberOfShards since the index might have been shrunk such that we need to use the size
        // of original index to hash documents
        return Math.floorMod(hash, indexMetaData.getRoutingNumShards()) / indexMetaData.getRoutingFactor();
    }

I tried to reconstruct the shard ID locally based on this logic, but I do not get the correct shard ID. Is there an easier way to do this, or am I missing something?
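
For comparison, here is a minimal standalone sketch of the calculation in plain Java. It rests on several assumptions that are not stated above. It assumes no custom routing, so the effective routing is the document ID and the partition offset is 0. As far as can be told from Murmur3HashFunction.java, the string is hashed as the two bytes of each UTF-16 char (low byte first) with seed 0, not as UTF-8 bytes. The values for routingNumShards and numberOfShards are assumed to be read from the index metadata rather than guessed. The names ShardRoutingSketch, hashRouting and shardId are made up for illustration and are not part of Elasticsearch.

```java
public final class ShardRoutingSketch {

    /** MurmurHash3 x86_32, written out so the sketch has no dependencies. */
    public static int murmur3x86_32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h1 = seed;
        final int blockEnd = data.length & ~3; // number of bytes covered by full 4-byte blocks

        for (int i = 0; i < blockEnd; i += 4) {
            // Each block is read as a little-endian 32-bit int.
            int k1 = (data[i] & 0xff)
                    | ((data[i + 1] & 0xff) << 8)
                    | ((data[i + 2] & 0xff) << 16)
                    | ((data[i + 3] & 0xff) << 24);
            k1 *= c1;
            k1 = Integer.rotateLeft(k1, 15);
            k1 *= c2;
            h1 ^= k1;
            h1 = Integer.rotateLeft(h1, 13);
            h1 = h1 * 5 + 0xe6546b64;
        }

        // Mix in the 1 to 3 trailing bytes, if any.
        int k1 = 0;
        switch (data.length & 3) {
            case 3:
                k1 ^= (data[blockEnd + 2] & 0xff) << 16; // fall through
            case 2:
                k1 ^= (data[blockEnd + 1] & 0xff) << 8;  // fall through
            case 1:
                k1 ^= (data[blockEnd] & 0xff);
                k1 *= c1;
                k1 = Integer.rotateLeft(k1, 15);
                k1 *= c2;
                h1 ^= k1;
                break;
            default:
                break;
        }

        // Finalization (avalanche) step.
        h1 ^= data.length;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    /** Assumed equivalent of Murmur3HashFunction.hash(String): two bytes per char, low byte first. */
    public static int hashRouting(String routing) {
        byte[] bytes = new byte[routing.length() * 2];
        for (int i = 0; i < routing.length(); i++) {
            char c = routing.charAt(i);
            bytes[2 * i] = (byte) c;             // low byte
            bytes[2 * i + 1] = (byte) (c >>> 8); // high byte
        }
        return murmur3x86_32(bytes, 0);
    }

    /** Mirrors calculateScaledShardId with partitionOffset = 0 (assumption: no routing partitions). */
    public static int shardId(String effectiveRouting, int routingNumShards, int numberOfShards) {
        int routingFactor = routingNumShards / numberOfShards;
        return Math.floorMod(hashRouting(effectiveRouting), routingNumShards) / routingFactor;
    }

    public static void main(String[] args) {
        // Hypothetical values: take both shard counts from the real index metadata.
        System.out.println(shardId("my-doc-id", 5, 5));
    }
}
```

If the result of such a sketch still differs from what the cluster reports, the two things to compare first are the byte encoding fed into the hash and the routing shard count, which is not necessarily the same as the index's current number of shards.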