We want to use Elasticsearch as a multi-tenant store, where each tenant has different requirements for document type/schema.
What is the best way to store the data with respect to cost and manageability?
1> Each tenant has a separate index with varying document types (may not be efficient?)
2> A set of tenants shares one index with varying document types,
but with Elasticsearch's removal of mapping types mentioned in link,
this seems to be possible only by having a custom type field, as mentioned in the link.
Please advise on the best way to separate tenants' data when each tenant has its own schema/document type requirements.
If you want to separate by customer, you will probably need to separate out documents that are not similar, which may mean multiple indices per customer.
If you want to group by document similarity, that would be OK; you just need to manage multi-tenancy with something like Security.
The best solution is whichever of those works for you; both have pros and cons.
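To make the Security option a bit more concrete, here is a minimal sketch of a role body that limits a tenant to its own documents in a shared index through a document-level security query. It assumes every document carries a `tenant_id` keyword field; that field name, the index name and the tenant ID are all made up for illustration. The code only builds the JSON body you would send to the create role API, it does not talk to a cluster.

```python
import json


def tenant_read_role(index_name: str, tenant_id: str) -> dict:
    """Build a role body that restricts reads to one tenant's documents.

    Assumption: documents in the shared index have a `tenant_id` keyword field.
    """
    return {
        "indices": [
            {
                "names": [index_name],
                "privileges": ["read"],
                # Document-level security: only documents matching this
                # query are visible to users holding the role.
                "query": {"term": {"tenant_id": tenant_id}},
            }
        ]
    }


# Hypothetical index and tenant names, for illustration only.
role = tenant_read_role("invoices-shared", "tenant-42")
print(json.dumps(role, indent=2))
```

One role per tenant keeps the filter out of application code, but the number of roles then grows with the number of tenants, so it is worth weighing against filtering in the application.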
There can be many, because initially we will have a lot of free customers. It's not possible right now to quantify how many, as this is a cloud service we are building.
As mappings have to be consistent per index, you will need to impose some control on the content and mappings if you want tenants to share indices. This is usually necessary, as having an index per tenant scales badly: lots of small indices will result in performance problems.
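One way to impose that control, sketched below under some assumptions, is to create the shared index with a strict mapping so that documents containing fields outside the agreed schema are rejected instead of silently extending the mapping. The `tenant_id` and `doc_type` fields are hypothetical names (the latter standing in for the custom type field mentioned in the question), and the example fields are invented. The function only builds the index creation body.

```python
def shared_index_body(allowed_fields: dict) -> dict:
    """Build an index creation body with a controlled, strict mapping.

    Assumption: `tenant_id` and `doc_type` are reserved keyword fields
    that every document in the shared index must use consistently.
    """
    properties = dict(allowed_fields)
    # Reserved fields are applied last so tenants cannot redefine them.
    properties.update(
        {
            "tenant_id": {"type": "keyword"},
            "doc_type": {"type": "keyword"},
        }
    )
    # "dynamic": "strict" rejects documents with unmapped fields.
    return {"mappings": {"dynamic": "strict", "properties": properties}}


# Hypothetical agreed schema for the shared index.
body = shared_index_body(
    {"title": {"type": "text"}, "created_at": {"type": "date"}}
)
```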
There are no easy solutions, but I have seen users place controls on the data and have small users share indices and let a smaller number of larger users have their own.
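As a rough illustration of that split, the sketch below sends small tenants to a shared index and gives large tenants their own, then wraps every search in a tenant filter. The size threshold, the index naming scheme and the `tenant_id` field are all assumptions made for the example, not recommendations.

```python
# Hypothetical cut-off above which a tenant gets a dedicated index.
LARGE_TENANT_DOCS = 5_000_000


def index_for_tenant(tenant_id: str, doc_count: int,
                     shared_index: str = "tenants-shared") -> str:
    """Pick the index a tenant's documents should live in."""
    if doc_count >= LARGE_TENANT_DOCS:
        # Index names must be lowercase.
        return f"tenant-{tenant_id.lower()}"
    return shared_index


def tenant_search_body(tenant_id: str, query: dict) -> dict:
    """Wrap a query so it only matches the given tenant's documents.

    Assumption: documents carry a `tenant_id` keyword field.
    """
    return {
        "query": {
            "bool": {
                "must": [query],
                "filter": [{"term": {"tenant_id": tenant_id}}],
            }
        }
    }


small = index_for_tenant("Acme", doc_count=1_200)
large = index_for_tenant("BigCorp", doc_count=80_000_000)
search = tenant_search_body("Acme", {"match": {"title": "invoice"}})
```

Keeping the tenant filter even for dedicated indices means a tenant can later be moved between a shared and a dedicated index without changing how queries are built.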
I have seen users try going with one index per tenant and then deploy this across a lot of small clusters. This reduces the size of the cluster state per cluster but also does not necessarily scale well.