Elasticsearch document modelling options

I would like to know the best practices for designing a data model for indices in Elasticsearch. We have a system that needs to pull data from cloud storage systems (e.g. Dropbox), social media (e.g. Twitter), articles from the web, etc.

The design issue I am facing is that each type of document has different fields and a different mapping.

These are the options I've explored:

  1. Use different types under a single index, since each source has a different document structure (e.g. Elasticsearch types facebook, twitter, dropbox, googledrive, etc.). This will tend to add a lot of types under one index.

  2. Use dynamic mapping on the index to add fields whenever necessary, and use the same mapping for all docs. In this case most of the fields will be null (e.g. for a document from cloud storage, the social-media-specific fields will be null).

  3. Use different indices for different data sources. In this case there will be lots of indices.

I would like to know which of these options is best for our use case. My main considerations are search and indexing performance, and scalability.

I would go for separate indices, although I don't have deep knowledge of your use case. For example, do you need parent/child relationships between data from different domains? Those would require the related documents to live in the same index.

From what I know: since your data would have different fields, and thus different mappings, it is not a good fit to squeeze them into the same index. What if a field in one type maps to geo_point, while the same field in another type maps to a plain string? The same field name cannot carry both mappings within one index.
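To make that kind of clash concrete, here is a minimal sketch in Python that compares field types across sources before you decide what can share an index. The field names and types below are made up for illustration; they are not taken from your data.

```python
from collections import defaultdict

# Hypothetical per-source field types; names and types are illustrative only.
SOURCE_MAPPINGS = {
    "twitter": {"text": "text", "location": "geo_point", "created_at": "date"},
    "dropbox": {"path": "keyword", "location": "keyword", "created_at": "date"},
}


def find_mapping_conflicts(source_mappings):
    """Return {field: {type: [sources]}} for fields declared with more than one type."""
    seen = defaultdict(lambda: defaultdict(list))
    for source, fields in source_mappings.items():
        for field, field_type in fields.items():
            seen[field][field_type].append(source)
    # Keep only the fields that appear with at least two different types.
    return {
        field: {ftype: sorted(sources) for ftype, sources in types.items()}
        for field, types in seen.items()
        if len(types) > 1
    }


conflicts = find_mapping_conflicts(SOURCE_MAPPINGS)
print(conflicts)
```

Any field that shows up in the result (here `location`) would have to be renamed or remodelled before those sources could share one index, whereas `created_at` is safe because both sources agree on its type.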

Since you mention social media, I guess the amount of data will be large (if not for all types, then at least for some). It will be easier to handle them separately with regard to sharding, archiving, etc. when they are split into different indices.
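As a rough sketch of what "handled separately" could look like, the Python below builds one create-index request body per source, each with its own shard count and mapping. The shard counts and fields are placeholders I picked for illustration, not tuned recommendations, and the body layout assumes the typeless mapping format of Elasticsearch 7 and later (older versions expect a type name under `mappings`).

```python
import json

# Hypothetical sources; shard counts and fields are illustrative assumptions.
SOURCES = {
    "twitter": {
        "shards": 5,  # assumed high-volume source
        "properties": {"text": {"type": "text"}, "location": {"type": "geo_point"}},
    },
    "dropbox": {
        "shards": 1,  # assumed low-volume source
        "properties": {"path": {"type": "keyword"}, "location": {"type": "keyword"}},
    },
}


def build_index_requests(sources, replicas=1):
    """Build one create-index request body per source, keyed by index name."""
    requests = {}
    for name, spec in sources.items():
        requests[name] = {
            "settings": {
                "number_of_shards": spec["shards"],
                "number_of_replicas": replicas,
            },
            "mappings": {"properties": spec["properties"]},
        }
    return requests


index_requests = build_index_requests(SOURCES)
for index_name, body in index_requests.items():
    # Each body could be sent with: PUT /<index_name>
    print(f"PUT /{index_name}")
    print(json.dumps(body, indent=2))
```

Because every source gets its own settings, a busy index can be given more shards or archived on its own schedule without touching the others.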

I also guess that the number of different data types is not limitless, so ending up with an unmanageable number of indices seems unlikely.
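Having several indices also does not stop you from querying them together, since the search endpoint accepts a comma-separated list of index names or a wildcard pattern in the URL. A minimal Python sketch of building such a path follows; the helper name `search_path` is my own, not part of any client library.

```python
def search_path(indices):
    """Build the _search path for one or more index names or wildcard patterns."""
    if not indices:
        raise ValueError("at least one index name or pattern is required")
    # Several targets are joined with commas, e.g. /twitter,dropbox/_search
    return "/" + ",".join(indices) + "/_search"


print(search_path(["twitter"]))
print(search_path(["twitter", "dropbox"]))
```

The resulting path can be used with whichever HTTP client you already have, so one query can cover a single source or several of them.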

@rmy: Thank you for your suggestion. I will try this out.
