Elasticsearch document modelling options

Sachin_Shaju · November 8, 2016, 11:55am

I would like to know the best practices for designing a data model for indices in Elasticsearch. We have a system that needs to pull data from cloud storage systems (e.g. Dropbox), social media (e.g. Twitter), articles from the web, etc.

The design issue I am facing is that each type of document has different fields and mappings.

Some of the options I've explored:

1. Use different types under a single index, since each source has a different document structure (e.g. Elasticsearch types facebook, twitter, dropbox, googledrive, etc.). This will tend to add a lot of types under one index.
2. Use dynamic mapping for the index to add fields whenever necessary, and use the same mapping for all documents. In this case, most of the fields will be null (e.g. for a cloud storage document, the social-media-specific fields will be null).
3. Use different indices for different data sources. In this case, there will be lots of indices.

I would like to know which of these options would be best for our use case. My main considerations are search and indexing performance, and scalability.

rmy · November 8, 2016, 1:02pm

I would go for separate indices, though I don't have deep knowledge of your use case. For example, do you need parent/child relationships between your data from different domains?

From what I know: since your data would have different fields, and thus different mappings, it would not be a good fit to squeeze them into the same index. What if a field in one type maps to geo_point, while in another type the same field maps to a plain string?
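To make the conflict concrete, here is a minimal sketch in plain Python (no Elasticsearch client involved) that checks whether per-source mappings could share one index. The source names, field names and field types below are made up for illustration; only the idea, that a field name shared inside one index needs one consistent type, comes from the point above.

```python
from collections import defaultdict

# Hypothetical per-source mappings: field name -> Elasticsearch field type.
# These fields are assumptions for illustration, not the poster's real schema.
SOURCE_MAPPINGS = {
    "twitter": {"text": "text", "location": "geo_point", "created_at": "date"},
    "dropbox": {"path": "keyword", "location": "keyword", "created_at": "date"},
}


def find_conflicts(source_mappings):
    """Return the fields that are declared with more than one type across sources."""
    types_by_field = defaultdict(set)
    for fields in source_mappings.values():
        for field, field_type in fields.items():
            types_by_field[field].add(field_type)
    # Keep only the fields whose declared types disagree.
    return {
        field: sorted(types)
        for field, types in types_by_field.items()
        if len(types) > 1
    }


conflicts = find_conflicts(SOURCE_MAPPINGS)
print(conflicts)  # {'location': ['geo_point', 'keyword']}
```

Any field that shows up in the result would have to be renamed or given a common type before the sources could live in the same index; with one index per source the question does not arise.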

Since you mention social media, I guess the amount of data will be large (if not for all types, then at least for some). It will be easier to handle them separately with regard to sharding, archiving, etc. when they are split into different indices.
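As a rough sketch of what handling them separately could look like, the snippet below builds one index name and one settings body per source. The shard counts, the source names and the monthly `source-YYYY.MM` naming scheme are all assumptions for illustration; `number_of_shards` and `number_of_replicas` are the standard index settings. Nothing is sent to a cluster here, the code only prepares the request bodies.

```python
from datetime import date

# Hypothetical primary shard counts: high-volume sources get more shards.
SHARDS_PER_SOURCE = {"twitter": 5, "dropbox": 1, "web_articles": 2}


def index_name(source, day):
    """Monthly index name such as 'twitter-2016.11' (naming scheme is an assumption)."""
    return f"{source}-{day:%Y.%m}"


def index_settings(source):
    """Settings body that could be used when creating the index for one source."""
    return {
        "settings": {
            "number_of_shards": SHARDS_PER_SOURCE[source],
            "number_of_replicas": 1,
        }
    }


# One index per source for the month of the original post.
plans = {
    index_name(source, date(2016, 11, 8)): index_settings(source)
    for source in SHARDS_PER_SOURCE
}
for name, body in plans.items():
    print(name, body["settings"]["number_of_shards"])
```

With a scheme like this, a busy source can be sharded more heavily than a quiet one, and an old month of a single source can be archived or deleted without touching the others.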

I also guess that the number of different data types is not limitless, so it seems unlikely that you will end up with an unmanageable number of indices.

Sachin_Shaju · November 22, 2016, 6:45am

@rmy: Thank you for your suggestion. I will try this out.

system · December 20, 2016, 6:46am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
