I'm currently using Elasticsearch and storing all my users' data in one index.
The data each user sends is by nature different, which makes the number of fields in my index grow rapidly; it already exceeds 2000 fields. I'm thinking about switching to a multi-tenant approach where each user gets its own index, but I'm not sure what drawbacks I'd face with 10k indexes, or with 100k indexes if it keeps growing.
Does Elasticsearch support such a large number of indexes? Is the multi-tenant approach better than a single index even if it can reach 100k indexes?
Having an index per tenant works well when the number of tenants is reasonably low, as each index and shard comes with overhead in terms of heap usage and cluster state size. Having lots of very small indices and shards is very inefficient and does not scale well. For this type of scenario it is common to group users into shared indices and perhaps let very large users have their own indices.
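One common way to implement the "group tenants into shared indices" approach is a filtered alias per tenant on top of a shared index. Here is a minimal sketch, assuming the official `elasticsearch` Python client (8.x); the index name `shared-tenants`, the alias `tenant-acme`, and the `tenant_id` field are hypothetical names for illustration, not something from this thread:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# One shared index for many small tenants; every document carries a
# tenant_id field so it can be filtered per tenant.
es.indices.create(
    index="shared-tenants",
    mappings={
        "properties": {
            "tenant_id": {"type": "keyword"},
            "message": {"type": "text"},
        }
    },
)

# A filtered alias makes the shared index look like a dedicated one:
# searches through the alias only see that tenant's documents, and the
# routing value keeps each tenant's data on a single shard.
es.indices.put_alias(
    index="shared-tenants",
    name="tenant-acme",
    filter={"term": {"tenant_id": "acme"}},
    routing="acme",
)

# Queries go through the alias exactly as they would against a
# dedicated per-tenant index.
resp = es.search(index="tenant-acme", query={"match": {"message": "error"}})
print(resp["hits"]["total"])
```

With this layout you only pay the per-index and per-shard overhead once per group of tenants, and a tenant that outgrows the shared index can later be migrated to a dedicated index without changing the query side, since clients already talk to the alias.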
I appreciate your response, @Christian_Dahlqvist, as well as your comments in other threads on this topic (and others). They've been very helpful for both @BentoumiTech and me.
A follow-on question: are there a few good rules of thumb (or talks to listen to) for balancing indices, shards, heap usage, and cluster size? We ran into a fairly hairy problem this week which has forced us to rebuild a very large index from scratch, and we want to do it correctly this time (event analytics data).