How should I denormalize my data?

Hi all,

I'm new to Elasticsearch and having trouble coming up with a data model for the index (or maybe I should create multiple indices?).

My data is hierarchical: grandparent-parent-child-grandchild. As a first step, the top level could be dropped, leaving the hierarchy only 3 levels deep. Each level is 1-to-n, where n is very often 1 but not always; in rare cases it can be 50 or so.

I thought about denormalizing the first 2 levels and putting the 3rd level into arrays. The 3rd level holds the actual "measurement" data, which can be numeric or text, and users will certainly search on it. A problem is that each data point has multiple fields/metadata besides the actual value, such as entry date and comments, and these fields can be empty.
If 1 element in an array matches, it's fine to show the user the whole document with all its values.
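To make this concrete, here is a sketch of the document shape I have in mind, with level-3 points stored as parallel arrays (all field names and values are made up):

```python
# Hypothetical document shape: levels 1-2 denormalized into top-level
# fields; level-3 "measurement" points kept as parallel arrays, so that
# index i in each array describes the same data point.
doc = {
    "grandparent_id": "gp-1",                      # made-up identifiers
    "parent_id": "p-7",
    "values": [42, "high", 17],                    # numeric or text values
    "entry_dates": ["2021-01-02", None, "2021-01-05"],
    "comments": [None, "re-measured", None],       # metadata can be empty
}

# The arrays only stay meaningful if they keep the same length and order:
assert len(doc["values"]) == len(doc["entry_dates"]) == len(doc["comments"])

# Point at index 1 has no entry date but does have a comment:
assert doc["entry_dates"][1] is None
assert doc["comments"][1] == "re-measured"
```

This only works if the index position reliably ties a value to its metadata, which is exactly what my question about ordering and nulls below is about.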

Is element order in arrays guaranteed, i.e. is what I put in first also returned first? (The documentation for version 2 says yes, but the same page for 7.1 doesn't mention it at all.) And what happens to empty/null values? As far as I understand they get trimmed away, so [null, 1, 2] becomes [1, 2]. Is that correct? With my approach that would be a problem, because [1, null, 2] would also end up as [1, 2], and with 3 data points it's no longer possible to know which index held the null value.
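Here is a small illustration of why the trimmed nulls worry me, plus the workaround I'm considering: storing each point as an object carrying its own position, so nothing depends on array index (names are made up):

```python
# If nulls are simply dropped from a plain array, positional information
# is lost -- this is the behaviour I'm afraid of:
raw = [1, None, 2]
stored = [v for v in raw if v is not None]
assert stored == [1, 2]   # now indistinguishable from [None, 1, 2]

# Workaround sketch: one object per data point with an explicit position,
# so empty fields can simply be absent instead of null.
points = [
    {"pos": 0, "value": 1},
    {"pos": 1},               # value missing for this point
    {"pos": 2, "value": 2},
]
recovered = [p.get("value") for p in sorted(points, key=lambda p: p["pos"])]
assert recovered == [1, None, 2]   # the gap is preserved
```

Would that be the idiomatic way to handle this, or is there a mapping-level option I'm missing?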

What other design options are there? I could denormalize completely, but how do I then group/aggregate the results correctly over the 3 levels so that the user sees 1 result per parent?
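For the fully denormalized option, this is the regrouping I mean, sketched client-side (in Elasticsearch itself I assume this would be something like a terms aggregation on a parent id, but this sketch only shows the intended result; all names are made up):

```python
from collections import defaultdict

# Fully denormalized: one document per level-3 data point, each carrying
# its ancestors' ids.
hits = [
    {"parent_id": "p-1", "value": 10},
    {"parent_id": "p-2", "value": 7},
    {"parent_id": "p-1", "value": 12},
]

# Collapse the flat hits so the user sees one result per parent.
by_parent = defaultdict(list)
for h in hits:
    by_parent[h["parent_id"]].append(h["value"])

assert dict(by_parent) == {"p-1": [10, 12], "p-2": [7]}
```

My worry is whether doing this server-side stays practical once paging and relevance sorting come into play.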

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.