Storing documents with some arbitrary data

We need to store documents for our users. Each user will have a number of documents that all share the same core fields, and we then allow users to add their own arbitrary fields to their documents.

I've been reading this article: https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping.html. It says "Types are not as well suited for entirely different types of data." Our documents won't all be entirely different types, so does that mean it shouldn't be an issue?

Is there an accepted best practice when storing data that doesn't all have exactly the same format?


Yeah, dynamic: false in the mapping. That'll let you manually configure all
the fields that you want indexed, and any other fields are just kept in the
_source. They can be fetched but that is it: you can't search on them.
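
As a rough sketch of what that looks like, here is an index-creation body
built in Python. The field names (title, owner_id, created_at) are made up
for illustration, and the layout assumes a recent typeless mapping with the
text/keyword field types. Older releases nest the same settings under a type
name and use "string", so adjust for your version.

```python
import json

# Hypothetical core fields shared by every document; names are illustrative only.
CORE_FIELDS = {
    "title": {"type": "text"},
    "owner_id": {"type": "keyword"},
    "created_at": {"type": "date"},
}


def build_index_body(core_fields):
    """Return an index-creation body that indexes only the listed fields."""
    return {
        "mappings": {
            # Unmapped fields are not indexed and not added to the mapping,
            # but they still come back as part of _source.
            "dynamic": False,
            "properties": dict(core_fields),
        }
    }


body = build_index_body(CORE_FIELDS)
print(json.dumps(body, indent=2))
```

You would send that body when creating the index. A document carrying an
extra user field is still accepted, and the extra field is simply left out of
the index.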

Letting users create whatever fields they want is problematic for two
reasons:

  1. Type clashes. You can resolve that by naming fields after their type so
    they don't clash. Like phone_number_string. Sure, it is lame but you are
    working around inconsistent user data.
  2. Too many fields and its corollary, sparse fields. This is trouble. Each
    field has an overhead in the index and in the mapping. Adding a field is a
    cluster state change event too. Sparse fields don't store well in doc
    values either.
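
The naming workaround from point 1 could be sketched like this in Python,
done on the client before indexing. The suffixes (string, long, double,
boolean) and the function names are my own choice for the example, not
anything Elasticsearch requires, and it only handles scalar values.

```python
def type_suffix(value):
    """Pick a suffix for a scalar value; the suffix names are arbitrary."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported user field value: {value!r}")


def suffix_user_fields(user_fields):
    """Rename each user field to <name>_<type> so differing types never share a name."""
    return {
        f"{name}_{type_suffix(value)}": value
        for name, value in user_fields.items()
    }


renamed = suffix_user_fields({"phone_number": "555-0100", "age": 42})
print(renamed)
```

With this, one user's phone_number string and another user's numeric
phone_number end up as phone_number_string and phone_number_long, so the
mapping never sees conflicting types under one name. It does nothing for
point 2, since every distinct name is still a new field.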

Both of these things you can work around but they are certainly "advanced"
use cases.
