Hi, I have an index with 10 shards.
My application has 10 users.
I'd like to assign each shard to a specific user, avoiding possible collisions in the hash function that computes the shard number from the _routing parameter.
Hi Thomas, thanks for your reply.
I also thought of using routing(userId). The problem in this scenario is that the shard number is computed with a hash function, so I cannot be sure that the same shard is not assigned to two (or more) different users.
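To make the collision concern concrete, here is a minimal sketch of hash-based routing. It assumes a uniform hash and uses CRC32 as a stand-in; Elasticsearch uses its own Murmur3-based hash, so the actual shard numbers will differ. The names `shard_for` and `prob_all_distinct` are hypothetical helpers, not part of any Elasticsearch client.

```python
import zlib


def shard_for(routing: str, num_shards: int) -> int:
    """Map a routing value to a shard as hash(routing) % num_shards.

    CRC32 is only a stand-in hash for illustration; it is NOT the
    hash Elasticsearch uses internally.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def prob_all_distinct(num_users: int, num_shards: int) -> float:
    """Probability that every user lands on a different shard,
    assuming the hash spreads routing values uniformly."""
    if num_users > num_shards:
        return 0.0  # pigeonhole: at least two users must share a shard
    p = 1.0
    for i in range(num_users):
        p *= (num_shards - i) / num_shards
    return p


# Hypothetical user IDs used as routing values.
users = [f"user-{i}" for i in range(10)]
assignment = {u: shard_for(u, 10) for u in users}

print(assignment)
print("distinct shards used:", len(set(assignment.values())))
print("P(no collision), 10 users on 10 shards:", prob_all_distinct(10, 10))
```

Under the uniform-hash assumption, the chance that 10 users fall on 10 different shards is 10!/10^10, roughly 0.04%, so at least one shared shard is almost certain.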
The preference parameter in the Search API, given the value _shards:shardnumber, retrieves documents from shard shardnumber, but I cannot find anything similar in the Document API that would insert a document into shard shardnumber.
The biggest problem is the number of users and its future growth, and consequently the amount of resources that the separate-indexes strategy would require.
Having lots of small shards dedicated to specific users is unlikely to scale well, whether they are directly linked to an index or part of larger indices the way you describe. If you have a small number of users, I would recommend having separate indices per user. If you want this to scale to a large number of users, I would recommend you reconsider having dedicated shards per customer.
Unfortunately, I haven't found a way to send docs to a specific shard. You pointed out two concerns:

"The biggest problem is the number of users and its growth in the future, and consequently the amount of resources that the separate indexes strategy would require"

One of the solutions @Christian_Dahlqvist mentioned may work well for you and solve both problems. If you used a shared index, you could have an index capable of supporting a growing number of users, with each user's documents residing on a single shard. You could then use aliases to give the perception of a single index per user. You get scalability as well as lower resource usage, since Elasticsearch would not send requests out to all the shards in the cluster, just the one shard for your user. You can read the Definitive Guide section dedicated to shared indexes here.
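As a rough sketch of the alias approach, the snippet below builds the request body for the `_aliases` endpoint, giving each user an alias with a routing value and a term filter. The index name `shared_index`, the field name `user_id`, and the `user_<id>` alias naming are assumptions for illustration only; the code just prints the JSON and does not contact a cluster.

```python
import json


def user_alias_action(index: str, user_id: str, user_field: str = "user_id") -> dict:
    """Build one 'add' action for a per-user alias on a shared index.

    'routing' sends the user's requests to a single shard; 'filter'
    hides other users' documents that happen to share that shard.
    """
    return {
        "add": {
            "index": index,
            "alias": f"user_{user_id}",  # hypothetical naming scheme
            "routing": str(user_id),
            "filter": {"term": {user_field: user_id}},
        }
    }


def aliases_request_body(index: str, user_ids: list) -> dict:
    """Body for POST /_aliases covering every user in user_ids."""
    return {"actions": [user_alias_action(index, uid) for uid in user_ids]}


body = aliases_request_body("shared_index", ["1", "2", "3"])
print(json.dumps(body, indent=2))
```

The filter matters because routing only picks the shard. Two users whose routing values hash to the same shard would otherwise see each other's documents, which is why a shared index tolerates the collisions discussed earlier instead of trying to avoid them.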