Hello all,
I need advice on a good design for my documents. The data I want to ingest comes from a big, recursively nested XML file (I could translate it into JSON, fwiw) that looks similar to this:
<Data>
  <Element type="a" value="abc">
    <Element type="b" value="foo">
      <Element type="c" value="bar"></Element>
    </Element>
  </Element>
  <Element type="a" value="def">
  </Element>
</Data>
- Apart from the global Data root tag, there are only Element tags.
- Every Element has exactly one parent, except for the Elements with type="a".
- The individual Elements can be nested to arbitrary depth, but the depth is bounded in all cases.
- The nesting depth is usually between 5 and 20, with a minimum of 4.
- There are no cycles.
I want to run aggregations and searches on the individual Elements, so every Element tag should become a distinct document. In addition, two types of queries are necessary:
- Find (recursively) all parents of a given Element
- Search and filter on all (recursive) children of a given Element
From what I understand, join is not possible here, because a single Element can be parent and child at the same time. Maybe nested is the correct way to go, but I'd like feedback before I reimplement what I have so far. My current approach adds a primary_key and a parent_key field to every Element, but this does not support recursive search.
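To make the current approach concrete, here is a minimal Python sketch of the flattening step and of a client-side parent lookup. The sequential integer keys, the function names flatten_elements and ancestors, and the use of xml.etree.ElementTree are assumptions for illustration only; the real ingest may differ.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Data>
  <Element type="a" value="abc">
    <Element type="b" value="foo">
      <Element type="c" value="bar"></Element>
    </Element>
  </Element>
  <Element type="a" value="def">
  </Element>
</Data>"""


def flatten_elements(xml_text):
    """Turn every Element tag into one flat document (a dict)."""
    root = ET.fromstring(xml_text)
    docs = []
    # Explicit stack instead of recursion; each entry is (node, parent_key).
    stack = [(child, None) for child in reversed(list(root))]
    while stack:
        node, parent_key = stack.pop()
        key = len(docs) + 1  # assumed key scheme: sequential integers
        docs.append({
            "primary_key": key,
            "parent_key": parent_key,  # None for top-level Elements
            "type": node.get("type"),
            "value": node.get("value"),
        })
        stack.extend((child, key) for child in reversed(list(node)))
    return docs


def ancestors(docs, primary_key):
    """Walk parent_key links upward: one lookup per level of nesting."""
    by_key = {d["primary_key"]: d for d in docs}
    chain = []
    parent = by_key[primary_key]["parent_key"]
    while parent is not None:
        chain.append(parent)
        parent = by_key[parent]["parent_key"]
    return chain


docs = flatten_elements(SAMPLE)
```

With only parent_key stored, resolving all parents of an Element takes one lookup per nesting level, as in the ancestors loop above, and collecting all recursive children has the same problem in the other direction. That is the part I would like the document design to handle for me.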