I'm running into a routing problem that I can't figure out.
I have a server setup with a certain number of shards (512).
The data I store in Elasticsearch belongs to ~200 separate units, so I assign each unit a numeric id (unique, incremental, and always smaller than the number of shards) and store its data using that id as the routing value.
For example, unit "foo" has id 1, while unit "bar" has id 98. Both ids match their respective routing value.
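For context, this is roughly how the routing value ends up on each index request. It is only a minimal sketch: the index name `units`, the `_doc` path segment, the document ids, and the helper `index_url` are placeholders, not my real setup. The only real Elasticsearch feature it relies on is the `routing` query parameter.

```python
from urllib.parse import urlencode

# Hypothetical unit-to-id table; ids are unique, incremental and below 512.
UNIT_IDS = {"foo": 1, "bar": 98}


def index_url(unit, doc_id, index="units"):
    """Build the path of an index request routed by the unit's numeric id."""
    routing = UNIT_IDS[unit]
    # The unit id is passed verbatim as the `routing` query parameter.
    query = urlencode({"routing": routing})
    return f"/{index}/_doc/{doc_id}?{query}"


if __name__ == "__main__":
    print(index_url("foo", "a1"))
    print(index_url("bar", "b7"))
```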
While storing the data incrementally (unit by unit), everything worked fine for the first ~100 units (I don't know the exact number), but at some point ES started reusing shards, e.g. data addressed by routing 98 is stored in the same shard as data addressed by routing 1.
This internal reuse appears to be some sort of hashing, since it's consistent and deterministic: data with id 98 always goes to the shard holding data with id 1, never to the one holding id 2.
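To illustrate the behaviour I suspect, here is a minimal sketch. It is not Elasticsearch's actual code: the MD5-based `shard_for` is a stand-in hash I picked for the illustration, and the whole thing rests on my assumption that the shard is chosen as something like hash(routing) modulo the number of shards. Under that assumption, the mapping is deterministic, yet different routing values can end up on the same shard even with far fewer units than shards.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 512  # matches my setup
NUM_UNITS = 200   # roughly the number of units I have


def shard_for(routing, num_shards=NUM_SHARDS):
    """Stand-in for the suspected rule: hash(routing) % num_shards.

    MD5 is used only for illustration; it is NOT the hash ES uses.
    """
    digest = hashlib.md5(str(routing).encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_shards


def shared_shards(num_units=NUM_UNITS, num_shards=NUM_SHARDS):
    """Group routing values 1..num_units by shard; keep shards hit more than once."""
    by_shard = defaultdict(list)
    for routing in range(1, num_units + 1):
        by_shard[shard_for(routing, num_shards)].append(routing)
    return {shard: ids for shard, ids in by_shard.items() if len(ids) > 1}


if __name__ == "__main__":
    for shard, ids in sorted(shared_shards().items()):
        print(f"shard {shard}: routing values {ids}")
```

If this is what ES does internally, then using a routing value smaller than the number of shards does not by itself guarantee one unit per shard.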
Does anybody have any suggestions? I didn't find anything obvious in the documentation, and this is a serious problem for me, since I need complete compartmentalization of the data.