Design approach for many small but very different indices?

avacados · April 7, 2019, 2:10pm

With the removal of 'type' from mappings as of the 6.x release, my Dynamic Application Development Platform project requires many indices to be created. The documents in these indices have few similarities.

For example,

Form of Application-1
Field - A (String)
Field - B (int)
Field - C (Date)

Form of Application-2
Field - X (int)
Field - Y (int)
Field - Z (long)

There are as many as 50 applications per tenant, and it can scale up to 500 tenants. So the selected design approach could have 500 x 50 = 25,000 indices. However, each index/application might be very small (i.e. KBs to a couple of MBs at most).
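To make the scale concrete, here is a minimal sketch of the arithmetic. The tenant and application counts come from the figures above; the shard layout (one primary shard plus one replica per index) is an assumption for illustration only.

```python
# Figures from the post above.
tenants = 500
applications_per_tenant = 50

# Assumed shard layout, not stated in the post.
primaries_per_index = 1
replicas_per_primary = 1

# One index per application per tenant.
index_count = tenants * applications_per_tenant

# Every primary shard plus its replica copies counts as a shard in the cluster.
shard_count = index_count * primaries_per_index * (1 + replicas_per_primary)

print(f"indices: {index_count}")
print(f"shards:  {shard_count}")
```

Under these assumptions the index-per-model design ends up with twice as many shards as indices, even though each index holds only KBs to MBs of data.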

I read the forum, and the usual suggestion is to keep data dense in a minimal number of indices. But in my case there are many models with no overlapping fields. So the one option I see is an index per model (i.e. per form of application in my use case).

My question: is this a good design approach for this use case, or are there better alternatives?

Christian_Dahlqvist · April 7, 2019, 2:30pm

Different types of data can typically share an index as long as they do not have common fields with conflicting mappings. I would therefore recommend reducing the index and shard count even if this means the number of fields goes up. 25,000 indices with at least 50,000 shards is far beyond what is recommended for a cluster.
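As a rough sketch of the "no conflicting mappings" condition, the check below merges per-model field definitions into one shared set and fails if two models use the same field name with different types. The field names, the model names, and the helper `merge_field_types` are all hypothetical; the type names (`keyword`, `integer`, `date`, `long`) are Elasticsearch field types chosen here to approximate the two forms in the question.

```python
from typing import Dict

# Hypothetical per-model field definitions, loosely based on the two forms
# in the question (String -> keyword is an assumption).
APPLICATION_1 = {"field_a": "keyword", "field_b": "integer", "field_c": "date"}
APPLICATION_2 = {"field_x": "integer", "field_y": "integer", "field_z": "long"}


def merge_field_types(models: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Combine per-model fields into one mapping, rejecting type conflicts."""
    merged: Dict[str, str] = {}
    owner: Dict[str, str] = {}  # which model first defined each field
    for model_name, fields in models.items():
        for field, field_type in fields.items():
            if field in merged and merged[field] != field_type:
                raise ValueError(
                    f"field '{field}' is {merged[field]} in {owner[field]} "
                    f"but {field_type} in {model_name}"
                )
            merged.setdefault(field, field_type)
            owner.setdefault(field, model_name)
    return merged


shared = merge_field_types(
    {"application_1": APPLICATION_1, "application_2": APPLICATION_2}
)
print(sorted(shared))
```

Models whose fields never overlap, as described in the question, pass this check trivially; it only matters once two models reuse a field name.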

avacados · April 7, 2019, 7:36pm

Hm, in that case the model will not look clean, but it reduces the number of indices. However, sharing the fields of 50 models in a single index could mean approximately 500-700 unique fields. Is that still an OK design?

Previously, types within an index solved this problem, but now, with versions after 5.x, such modelling is challenging.

Looking for best-practice recommendations here.

Christian_Dahlqvist · April 7, 2019, 7:43pm

Elasticsearch has improved how sparse fields are handled in recent versions, so I think having that number of fields per index is preferable to having 50,000+ shards. If tenants have the same models, it may also make sense to have an index per model and have the tenants share this. You can add a field indicating the tenant and filter on this in your application.
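A minimal sketch of the tenant filter, built as a plain query body without any client library. The field name `tenant_id`, the helper `tenant_scoped_query`, and the example values are assumptions; the sketch also assumes `tenant_id` is mapped as `keyword` so that a `term` filter matches it exactly.

```python
from typing import Any, Dict


def tenant_scoped_query(tenant_id: str, query: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a query in a bool query that filters on the tenant field."""
    return {
        "query": {
            "bool": {
                # The caller's query still contributes to scoring.
                "must": [query],
                # Filter context: restricts results to one tenant, no scoring.
                "filter": [{"term": {"tenant_id": tenant_id}}],
            }
        }
    }


# Example: one tenant's documents where the hypothetical field_b is >= 10.
body = tenant_scoped_query("tenant-42", {"range": {"field_b": {"gte": 10}}})
print(body)
```

Routing every search through a wrapper like this in the application keeps one tenant from seeing another tenant's documents in the shared index.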

avacados · April 8, 2019, 6:16am

Since the uniqueness/scoping of fields in my case is per tenant and per application (i.e. model), I don't have models shared across tenants or applications. As you pointed out, sharing many unique fields in a single index is certainly better than having many indices, considering the performance/scale issues.

From another perspective, I was thinking of using a "Nested Object" field type per model. That way, the 50 models would fit into 50 nested object fields in a single index, and I'd have clear modelling within the index. Do you see any disadvantage to this approach in indexing, querying, or scaling? Thanks in advance!
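For illustration, the idea could look roughly like the sketch below, which builds only the `properties` part of a mapping with one `nested` field per model. The model names, field names, and the helper `nested_properties` are hypothetical.

```python
from typing import Dict

# Hypothetical models; type names are Elasticsearch field types.
MODELS = {
    "application_1": {"field_a": "keyword", "field_b": "integer", "field_c": "date"},
    "application_2": {"field_x": "integer", "field_y": "integer", "field_z": "long"},
}


def nested_properties(models: Dict[str, Dict[str, str]]) -> Dict[str, dict]:
    """Build mapping properties with one nested field per model."""
    return {
        model_name: {
            "type": "nested",
            "properties": {
                field: {"type": field_type} for field, field_type in fields.items()
            },
        }
        for model_name, fields in models.items()
    }


properties = nested_properties(MODELS)
print(properties["application_1"])
```

Each leaf field is still mapped under its parent (for example `application_1.field_a`), so the number of mapped fields does not shrink, and fields inside a `nested` object are searched through the `nested` query.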

Christian_Dahlqvist · April 8, 2019, 6:19am

What would be the benefit of this approach?
