Get the entire sub-tree / tree structure for a document based on a field

I have a hierarchical structure (much like categories nested within categories) in my documents, like the following:

                   1(parent for root category is: -1)
                  /  \
                2     3
               /        \
             4           5
            /
           6

I have 2 separate fields in the mapping to express this relationship: parent and path as shown below:

document for "1"

{
  "id": "1",
  "parent": "-1",
  "path": "/1"
}

document for "2"

{
  "id": "2",
  "parent": "1",
  "path": "/1/2"  
}

document for "4"

{
  "id": "4",
  "parent": "2",
  "path": "/1/2/4"  
}

and so on.

Here's the whole request to create this structure in an index called "tree":

PUT tree

POST _bulk
{ "create" : { "_index" : "tree"}}
{ "id": "1", "parent": "-1", "path": "/1" }
{ "create" : { "_index" : "tree"}}
{ "id": "2", "parent": "1", "path": "/1/2" }
{ "create" : { "_index" : "tree"}}
{ "id": "3", "parent": "1", "path": "/1/3" }
{ "create" : { "_index" : "tree"}}
{ "id": "4", "parent": "2", "path": "/1/2/4" }
{ "create" : { "_index" : "tree"}}
{ "id": "6", "parent": "4", "path": "/1/2/4/6" }
{ "create" : { "_index" : "tree"}}
{ "id": "5", "parent": "3", "path": "/1/3/5" }

Now I want to query and print the entire tree (note: I do not know the depth in advance), similar to the top-navigation menu on e-commerce sites, e.g. Mens > Tops > Shirts > Collared shirt. The result should look like the following:

1
   2
      4
         6
   3
      5

So I tried a terms aggregation with a top_hits sub-aggregation, like the following:

GET tree/_search
{
  "aggs": {
    "sub-cat-tree": {
      "terms": {
        "field": "parent.keyword",
        "size": 30
      },
      "aggs": {
        "tops": {
          "top_hits": {
            "size": 20
          }
        }
      }
    }
  }
}

But this did not print the tree within the tree; instead, it grouped the documents by parent.
I think I need nested aggregations, but I'm not sure how to write them, since I do not know the depth in advance.
Any help would be appreciated. Thanks.

Welcome!

I'd look at the Path hierarchy tokenizer (Elasticsearch Guide [7.16]) for this use case.
Did you look at it yet?

Maybe that would help? Unsure though. :)
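
For illustration, here is a minimal sketch in plain Python (an emulation, not an Elasticsearch call) of the tokens the path_hierarchy tokenizer emits with its default "/" delimiter, and why matching a single prefix token selects a whole sub-tree. The helper names `path_hierarchy_tokens` and `subtree_ids` are hypothetical, the documents mirror the bulk request above, and it assumes the path field is analyzed with path_hierarchy at index time, which the thread does not show.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Emulate path_hierarchy output for a path that starts with the delimiter."""
    parts = [p for p in path.split(delimiter) if p]
    # Each token is the path truncated at one more level: /1, /1/2, /1/2/4, ...
    return [delimiter + delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


# Same documents as in the _bulk request of the question.
DOCS = [
    {"id": "1", "parent": "-1", "path": "/1"},
    {"id": "2", "parent": "1", "path": "/1/2"},
    {"id": "3", "parent": "1", "path": "/1/3"},
    {"id": "4", "parent": "2", "path": "/1/2/4"},
    {"id": "6", "parent": "4", "path": "/1/2/4/6"},
    {"id": "5", "parent": "3", "path": "/1/3/5"},
]


def subtree_ids(docs, root_path):
    """Return ids of docs whose emitted tokens contain root_path (a term-style match)."""
    return [d["id"] for d in docs if root_path in path_hierarchy_tokens(d["path"])]


print(path_hierarchy_tokens("/1/2/4"))  # ['/1', '/1/2', '/1/2/4']
print(subtree_ids(DOCS, "/1/2"))        # ['2', '4', '6']
```

This only shows which documents belong to a sub-tree; the result is still a flat list, so the nesting itself would have to be rebuilt elsewhere.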

Thank you for your reply.

I did look at the path hierarchy tokenizer. In fact, when I look at the _mapping of my index, it uses the path hierarchy tokenizer, but I'm not sure how to use it to fetch the entire tree.
I also tried using collapse, without success.
I think the closest I've come to a solution is the query I posted in my question, i.e. using top_hits, but that gives a result like the following:

1
    2
    3
2
    4
3
    5
4
    6

instead of the entire hierarchy (hierarchy within hierarchy). I can use this result, but my application will then have to traverse the result set and build the entire hierarchy itself. I was really hoping this was possible using the query itself in Elasticsearch.
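
For reference, the application-side traversal mentioned above could look roughly like the following sketch. It assumes all documents have already been fetched as a flat list of dicts with the id and parent fields from the question, and that the root's parent is "-1". The function name `render_tree` and the three-space indent are hypothetical choices, not anything Elasticsearch provides.

```python
from collections import defaultdict

# Flat documents, as they might come back from a match_all search on "tree".
DOCS = [
    {"id": "1", "parent": "-1", "path": "/1"},
    {"id": "2", "parent": "1", "path": "/1/2"},
    {"id": "3", "parent": "1", "path": "/1/3"},
    {"id": "4", "parent": "2", "path": "/1/2/4"},
    {"id": "6", "parent": "4", "path": "/1/2/4/6"},
    {"id": "5", "parent": "3", "path": "/1/3/5"},
]


def render_tree(docs, root_parent="-1", indent="   "):
    """Group docs by parent, then walk depth-first from the root(s)."""
    children = defaultdict(list)
    for doc in docs:
        children[doc["parent"]].append(doc["id"])

    lines = []

    def walk(node_id, depth):
        lines.append(indent * depth + node_id)
        # Ids are strings, so siblings are ordered lexicographically here.
        for child in sorted(children.get(node_id, [])):
            walk(child, depth + 1)

    for root in sorted(children.get(root_parent, [])):
        walk(root, 0)
    return lines


print("\n".join(render_tree(DOCS)))
```

The depth does not need to be known in advance because the recursion simply follows the parent links; for very deep trees an explicit stack may be preferable to recursion.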
