Drawbacks To Using Nested Documents?

I have an index containing millions of small documents (2-3 KB each on average), each representing a user in our system.

For each user I'd like to store actions performed by the user and the date that they performed the action. The only data I need for this is an event id (relating to an event in our data lake) and an event date. There could be thousands of these.

I'm looking at either storing this as two arrays (one for the ids and one for the dates) or as nested documents. At this time I don't believe I'll need to associate the id with the event date in Elasticsearch directly. I only need to check whether any event occurred within a date range, or whether a given id exists (but not whether a given id happened on a particular date). This means I don't need nested documents to associate these properties.
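To make the two options concrete, here is a minimal sketch of the mappings and queries as plain Python dicts (the JSON bodies you would send to Elasticsearch). The field names `event_ids`, `event_dates`, and `events` are hypothetical, picked only for illustration. One thing worth noting: a `term` or `range` query on a multi-valued field matches when any value in the array matches, so the checks described above should not need scripts with the two-array layout.

```python
# Hypothetical field names (event_ids, event_dates, events), for illustration only.

# Option 1: two flat multi-valued fields. Elasticsearch has no dedicated
# array type; any field can hold several values.
flat_mapping = {
    "properties": {
        "event_ids": {"type": "keyword"},
        "event_dates": {"type": "date"},
    }
}

# Option 2: one nested field holding {id, date} objects.
nested_mapping = {
    "properties": {
        "events": {
            "type": "nested",
            "properties": {
                "id": {"type": "keyword"},
                "date": {"type": "date"},
            },
        }
    }
}


def flat_id_query(event_id):
    """Users that have the given event id anywhere in event_ids."""
    return {"query": {"term": {"event_ids": event_id}}}


def flat_date_range_query(gte, lte):
    """Users with at least one event date inside [gte, lte]."""
    return {"query": {"range": {"event_dates": {"gte": gte, "lte": lte}}}}


def nested_date_range_query(gte, lte):
    """Same check against the nested layout; needs a nested query wrapper."""
    return {
        "query": {
            "nested": {
                "path": "events",
                "query": {"range": {"events.date": {"gte": gte, "lte": lte}}},
            }
        }
    }
```

With the flat layout the query bodies stay one level deep; the nested layout only pays off once a query has to tie a specific id to a specific date.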

However, I'm wondering if it would still be beneficial to use nested documents here for simplicity and less reliance on scripts in queries.

What are the drawbacks of having potentially hundreds or thousands of very small nested documents for each document, as opposed to two array properties?

Each nested object (at every nested level) gets indexed as a separate document behind the scenes, and every time you update the parent document, all of these need to be reindexed along with it, which can get slow for large nested structures. Nested documents are therefore best suited to use cases with frequent reads but rare updates.
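As a rough back-of-the-envelope sketch of that cost: assuming a single level of nesting and one hidden document per nested object, the number of underlying documents rewritten per update grows with the number of events. The figures below (1,000,000 users, 1,000 events each) are illustrative assumptions, not measurements.

```python
def docs_rewritten_per_update(events_per_user: int, nested: bool) -> int:
    """Underlying documents rewritten when one user document is updated.

    Assumes one level of nesting: the parent plus one hidden document
    per nested event. With flat arrays only the parent is rewritten.
    """
    return 1 + events_per_user if nested else 1


def total_docs_in_index(users: int, events_per_user: int, nested: bool) -> int:
    """Total underlying documents the index holds for all users."""
    return users * docs_rewritten_per_update(events_per_user, nested)


# Illustrative numbers only.
print(docs_rewritten_per_update(1_000, nested=True))
print(docs_rewritten_per_update(1_000, nested=False))
print(total_docs_in_index(1_000_000, 1_000, nested=True))
```

Under these assumptions, a single update to a user with 1,000 nested events rewrites 1,001 underlying documents instead of 1.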


We do frequent updates on the top-level user documents, so at the volume I'm talking about this would be very slow without a powerful cluster that could handle it. Thanks!
