Custom routing of shard number


(ganeshbabu) #1

Hi All,

I am trying to use custom routing to index some documents to a particular shard. Below is the link I referred to:

https://www.elastic.co/guide/en/elasticsearch/guide/current/routing-value.html#routing-value

Sample doc:-

POST es_item/item/3214440?routing=50
{
  "ITEM_ID": 3214440,
  "ITEM_CODE": "9588049",
  "ITEM_DSCR": "HOME FURNISHINGS & DECOR",
  "ITEM_SPECIFICITY_REF_ID": 186,
  "ITEM_TYPE": "SGI",
  "RELATIONSHIP": [
    {
      "REL_TYP_REF_ID": -999,
      "REL_TYPE": "NO RELATIONSHIP",
      "REL_CTGRY": "NIL"
    }
  ],
  "ITEM_MISUSED_GTIN_FLG": "N",
  "CRT_DTTM": "2004-09-25 22:00:00",
  "UPD_DTTM": "2014-06-01 04:01:30",
  "DIST": [
    {
      "RGN_ID": 5,
      "RGN_NM": "AT",
      "DSTN_STRT_DT": "2011-03-03 21:33:51",
      "DSTN_END_DT": "9999-12-31 00:00:00",
      "ITEM_GLBL_CODE_ST_REF_ID": 721,
      "PRE_MOVEMENT_DT": null
    }
  ],
  "ICV": {
    "C37": 18136791
  },
  "XCD": []
}

I set a custom routing value of "50" while indexing, but when I search for the document I find it was indexed in shard 6.

Using this formula, I am not able to work out the shard number:

shard = hash(routing) % number_of_primary_shards

Number of primary shards: 18
routing=10

Can anyone help me find the shard number using the formula?
(or)
Is this the right way to do custom routing in Elasticsearch?

Please let us know your suggestions.

Thanks,
Ganeshbabu R


(Mark Harwood) #2

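Any discrepancy probably comes down to your choice of hashing algorithm; the formula only gives the right shard if you use the same hash function that Elasticsearch uses internally.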
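
For reference, here is a minimal Python sketch of how the formula could be reproduced by hand. It rests on assumptions not stated in this thread: that the cluster runs Elasticsearch 2.0 or later, that the routing hash is 32-bit MurmurHash3 (seed 0) over the routing string encoded as UTF-16 little-endian, and that the result is treated as a signed 32-bit integer before the modulo. The names `murmur3_x86_32` and `shard_for` are made up for illustration, and `routing_num_shards` only matters on versions that use `index.number_of_routing_shards`. Treat it as a way to see why a different hash gives a different shard, not as the authoritative implementation.

```python
def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """32-bit MurmurHash3 (x86 variant); returns an unsigned 32-bit value."""
    mask = 0xFFFFFFFF
    c1, c2 = 0xCC9E2D51, 0x1B873593

    def rotl(x: int, r: int) -> int:
        return ((x << r) | (x >> (32 - r))) & mask

    h = seed & mask
    n_blocks = len(data) // 4
    # Body: mix one little-endian 4-byte block at a time.
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & mask
        k = rotl(k, 15)
        k = (k * c2) & mask
        h ^= k
        h = rotl(h, 13)
        h = (h * 5 + 0xE6546B64) & mask

    # Tail: the remaining 1 to 3 bytes, if any.
    tail = data[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & mask
        k = rotl(k, 15)
        k = (k * c2) & mask
        h ^= k

    # Finalization (avalanche) step.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def shard_for(routing: str, num_primary_shards: int, routing_num_shards=None) -> int:
    """Estimate the shard for a routing value (assumptions listed above)."""
    if routing_num_shards is None:
        # Assumption: no separate routing-shard count, so use the primaries.
        routing_num_shards = num_primary_shards
    if routing_num_shards % num_primary_shards != 0:
        raise ValueError("routing_num_shards must be a multiple of num_primary_shards")

    h = murmur3_x86_32(routing.encode("utf-16-le"))
    if h >= 0x80000000:
        h -= 0x100000000  # reinterpret as a signed 32-bit int, as Java would

    routing_factor = routing_num_shards // num_primary_shards
    # Python's % already returns a non-negative result for a positive modulus.
    return (h % routing_num_shards) // routing_factor


if __name__ == "__main__":
    print(shard_for("50", 18))
```

To check the result against a real cluster rather than trusting the sketch, the search shards API (`GET es_item/_search_shards?routing=50`) reports which shard a given routing value resolves to.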

Why do you need to discover this? There may be a better way to achieve your end goal.


(ganeshbabu) #3

Hi @Mark_Harwood

Our ES cluster has 3 master nodes, 3 data nodes, and 1 client node.
The es_item index is allocated 18 shards and 2 replicas.

In the es_item index, some of the items have cross codes amounting to roughly 10 to 20 million docs. We are planning to use custom routing to index the documents. If several of these items land on the same shard, all of their cross codes will be added to the shard where the parent document is located, so the shards might become unbalanced and we might face performance issues (in searching and indexing, for example).

So we plan to take the top 10 items with the most cross codes and index each item to a different shard, using custom routing to avoid that imbalance. I tried it locally, but I couldn't understand the hash function in that formula.

Can you tell me how to find the shard number?

Please let us know your suggestions.

Thanks,
Ganeshbabu R


(Mark Walkom) #4

That's what we said over here: "Movement of document from one shard to another shard"

