I am planning to move my application to Elasticsearch, where I will be creating multiple indices, each of which may contain any number of documents of varying size.
Are there any issues with creating a large number of indices in Elasticsearch? If not, are there any performance concerns I should expect?
I have a dynamic-schema application in which users create the following:
A user creates a portal.
Under each portal, users can create any type of schema, for example blogs, products, or categories.
Each schema holds any number of documents.
The best solution seemed to be mapping each portal to an index and each schema to a type under it. But a schema may change over time, which would require reindexing, and that is expensive because all of its documents have to be moved to a new index.
Instead of mapping a portal to an index, can we make each schema of each portal its own index? My concern is that this creates a large number of indices per portal, so the total number of indices grows quickly as portals are added.
In that case I suppose you have to create one index per schema if absolutely nothing is common between users, which would mean that users are solving totally different use cases with your application.
That is going to be a lot of shards then. You will have to monitor that carefully IMO (open file descriptors).
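To get a feel for "a lot of shards", a rough back-of-the-envelope count can help. This is only a sketch: the portal and schema counts are made-up numbers, and the primaries-per-index and replica values are assumptions that depend on your Elasticsearch version and index settings. Each shard is a Lucene index that keeps its own files open, which is why the total matters.

```python
def estimate_shards(portals, schemas_per_portal, primaries_per_index=1, replicas=1):
    """Total shard copies if every (portal, schema) pair gets its own index.

    All arguments are illustrative assumptions, not measured values.
    """
    indices = portals * schemas_per_portal
    # Each index has `primaries_per_index` primary shards, and every primary
    # has `replicas` additional copies.
    return indices * primaries_per_index * (1 + replicas)


# Hypothetical workload: 100 portals with 20 schemas each.
one_primary = estimate_shards(100, 20)
five_primaries = estimate_shards(100, 20, primaries_per_index=5)
print(one_primary, five_primaries)
```

Setting `primaries_per_index` as low as the data volume allows is the most direct lever in this estimate.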
My recommendation: make absolutely sure that nothing can be shared between users.
Otherwise, put everything in the same index, then use filtered aliases to filter automatically by userid.
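As a minimal sketch of the filtered-alias idea, the snippet below only builds the JSON body for the `_aliases` API, it does not talk to a cluster. The index name `portals`, the field `portal_id`, and the `portal_` alias prefix are hypothetical names chosen for illustration, and the term filter assumes that field is indexed as an exact (not analyzed) value.

```python
import json


def filtered_alias_action(index, alias, field, value):
    """Build one `add` action that attaches a term filter to an alias."""
    return {
        "add": {
            "index": index,
            "alias": alias,
            "filter": {"term": {field: value}},
        }
    }


def alias_request_body(index, field, values, alias_prefix="portal_"):
    """Build the body for POST /_aliases, one filtered alias per value."""
    actions = [
        filtered_alias_action(index, f"{alias_prefix}{value}", field, value)
        for value in values
    ]
    return {"actions": actions}


# Hypothetical tenants "p1" and "p2" sharing a single index.
body = alias_request_body("portals", "portal_id", ["p1", "p2"])
print(json.dumps(body, indent=2))
```

Searching through the alias `portal_p1` would then only see documents whose `portal_id` is `p1`, while all documents still live in one index.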
It is not user-focused, it is portal-focused: a portal will contain multiple schemas, created by different participants of the portal.
A portal will have a large number of types with different mappings, so there is a high chance of field collisions between schemas (two or more schemas might have a common field name but different data types). Elasticsearch will not support this, because within an index a field name maps to a single type in the underlying Lucene index.
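The collision described above can be checked before any mapping is created. The following is a sketch in plain Python, with no Elasticsearch calls: the schema names, field names, and type names are made up, and prefixing fields with the schema name (using a hypothetical `__` separator) is just one possible workaround, not something Elasticsearch does for you.

```python
from collections import defaultdict


def find_field_conflicts(schemas):
    """Return fields that are declared with more than one type.

    `schemas` maps schema name -> {field name: type name}.
    The result maps field name -> {type name: [schema names]}.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for schema_name, fields in schemas.items():
        for field, field_type in fields.items():
            seen[field][field_type].append(schema_name)
    return {
        field: {t: sorted(names) for t, names in types.items()}
        for field, types in seen.items()
        if len(types) > 1
    }


def namespaced_fields(schema_name, fields, sep="__"):
    """Prefix every field with its schema name so names cannot collide."""
    return {f"{schema_name}{sep}{field}": t for field, t in fields.items()}


# Hypothetical schemas within one portal.
schemas = {
    "blogs": {"title": "text", "published": "date"},
    "products": {"title": "text", "published": "boolean", "price": "float"},
}
conflicts = find_field_conflicts(schemas)
print(conflicts)
print(namespaced_fields("blogs", schemas["blogs"]))
```

Here `title` is safe to share because both schemas agree on its type, while `published` would have to be renamed or moved to a separate index.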
I am looking for the best architecture for my case.