Confirm that I don't need `type: nested` on every node of my index mapping for the document tree?

I have a question about "type": "nested" in index mappings. Let's say I have documents where I want to aggregate on order.order_items.product.manufacturer.contact.name. Initially I made my mapping like this:

{
  "properties": {
    "order_items": {
      "type": "nested",
      "properties": {
        "quantity": {
          "type": "long"
        },
        "product": {
          "type": "nested",
          "properties": {
            "name": {
              "type": "keyword"
            },
            "manufacturer": {
              "type": "nested",
              "properties": {
                "name": {
                  "type": "keyword"
                },
                "contact": {
                  "type": "nested",
                  "properties": {
                    "name": {"type": "keyword"}
                  }
                }
              }
            }
          }
        }
      }
    }
    ...other fields...
  }
}

That is, I placed "type": "nested" on every node of the document tree. But through experimentation it seems this was unnecessary: I only needed to put "type": "nested" on the contact object. And if I only wanted to aggregate on order.order_items.product.name, I would only need to apply "type": "nested" to the product object. I just wanted to confirm my findings with someone who knows this well, to make sure I didn't misunderstand anything, and to hear about any other insights I should be aware of.

What does a sample document look like?

If I do a GET order/_search, one of the hits could be:

        {
          "order_items": [
            {
              "product": {
                "name": "smart phone",
                "manufacturer": {
                  "name": "Acme Limited",
                  "phone": "111-111-1111",
                  "contact": {
                    "name": "George",
                    "phone": "555-555-5555"
                  }
                },
                "quantity": 1200
              }
            }
          ],
          "customer": {
            "zip": 36571,
            "geo": {
              "zip": 36571,
              "point": "POINT(-87.52 34.46)"
            },
            "address": "1095 Industrial Pkwy",
            "city": "Saraland",
            "state": "AL",
            "customer_id": 219,
            "birth_year": 1967
          }
        }

As far as I can tell, the only field that needs to be nested is likely order_items, as this holds an array of subdocuments and you may want to filter on multiple fields within each order item.
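
To illustrate why an array of subdocuments is the usual candidate for nested: a plain object array is flattened into multi-valued fields at index time, so the pairing between values inside each array element is lost. The sketch below is a plain-Python simulation of that behavior, not Elasticsearch code. The helper names (flatten_objects, matches_flattened, matches_nested) and the simplified sample items are assumptions made for illustration only.

```python
def flatten_objects(items):
    """Simulate indexing a non-nested object array: one multi-valued field per key."""
    flat = {}
    for item in items:
        for key, value in item.items():
            flat.setdefault(key, []).append(value)
    return flat


def matches_flattened(items, **criteria):
    """Flattened view: each criterion may be satisfied by a different array element."""
    flat = flatten_objects(items)
    return all(value in flat.get(key, []) for key, value in criteria.items())


def matches_nested(items, **criteria):
    """Nested view: all criteria must be satisfied by the same array element."""
    return any(
        all(item.get(key) == value for key, value in criteria.items())
        for item in items
    )


# Hypothetical, simplified order items (illustrative data only).
order_items = [
    {"name": "book", "price": 10},
    {"name": "pencil", "price": 1},
]

print(flatten_objects(order_items))
# There is no book priced at 1, yet the flattened view still matches.
print(matches_flattened(order_items, name="book", price=1))
print(matches_nested(order_items, name="book", price=1))
```

In the flattened view, "book" and the price 1 come from different items but still match together, which is the cross-matching that nested is meant to prevent.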

Thanks! I actually did an experiment where I placed type: nested only on order_items. But type: nested didn't seem to have any downstream effect on children that are themselves subdocuments. In my experiments it seemed that type: nested has to be applied to the specific child you wish to aggregate on.

So that's why I just wanted to confirm with you and others whether I misunderstood something.

What does this mean? It would help if you provided the full mapping as well as the query for this scenario.

OK, I simplified my question to this. My goal is to eventually aggregate on the field order.order_items.product.name. SCENARIO 1 works completely fine:

SCENARIO 1

DELETE order
PUT order

POST order/_mapping
{
  "properties": {
    "order_items": {
      "properties": {
        "product_id": {
          "type": "long"
        },
        "product": {
          "type": "nested",
          "properties": {
            "name": {
              "type": "keyword"
            },
            "product_id": {
              "type": "long"
            },
            "price": {
              "type": "long"
            }
          }
        }
      }
    }
  }
}

POST order/_bulk
{"index":{}}
{"order_items":[{"product":{"name":"book","price":10}},{"product":{"name":"pencil","price":1}}]}
{"index":{}}
{"order_items":[{"product":{"name":"pen","price":5}},{"product":{"name":"eraser","price":1}}]}


GET order/_search
{
  "size": 0,
  "aggs": {
    "order": {
      "nested": {
        "path": "order_items.product"
      },
      "aggs": {
        "product_name" : {
          "terms": {
            "field": "order_items.product.name"
          },
          "aggs": {
            "avg_price": {
              "avg": {
                "field": "order_items.product.price"
              }
            }
          }
        }
      }
    }
  }
}

The final GET order/_search {...} shows results like this

{
  ...other fields...
  "aggregations": {
    "order": {
      "doc_count": 4,
      "product_name": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "book",
            "doc_count": 1,
            "avg_price": {
              "value": 10
            }
          },
          {
            "key": "eraser",
            "doc_count": 1,
            "avg_price": {
              "value": 1
            }
          },
          {
            "key": "pen",
            "doc_count": 1,
            "avg_price": {
              "value": 5
            }
          },
          {
            "key": "pencil",
            "doc_count": 1,
            "avg_price": {
              "value": 1
            }
          }
        ]
      }
    }
  }
}

Now SCENARIO 2 below doesn't yield any results:

SCENARIO 2:

DELETE order
PUT order

POST order/_mapping
{
  "properties": {
    "order_items": {
      "type": "nested",
      "properties": {
        "product_id": {
          "type": "long"
        },
        "product": {
          "properties": {
            "name": {
              "type": "keyword"
            },
            "product_id": {
              "type": "long"
            },
            "price": {
              "type": "long"
            }
          }
        }
      }
    }
  }
}

POST order/_bulk
{"index":{}}
{"order_items":[{"product":{"name":"book","price":10}},{"product":{"name":"pencil","price":1}}]}
{"index":{}}
{"order_items":[{"product":{"name":"pen","price":5}},{"product":{"name":"eraser","price":1}}]}


GET order/_search
{
  "size": 0,
  "aggs": {
    "order": {
      "nested": {
        "path": "order_items.product"
      },
      "aggs": {
        "product_name" : {
          "terms": {
            "field": "order_items.product.name"
          },
          "aggs": {
            "avg_price": {
              "avg": {
                "field": "order_items.product.price"
              }
            }
          }
        }
      }
    }
  }
}

The final GET order/_search {...} shows zero results like this

{
  ... other fields ...
  "aggregations": {
    "order": {
      "doc_count": 0,
      "product_name": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": []
      }
    }
  }
}

The difference between Scenario 1 and Scenario 2 is simply the placement of the type: nested attribute in the mapping. Placing type: nested at the order_items level does not have any downstream effect on the product level. That's why in Scenario 2, where type: nested sits at the order_items level, the aggregation in GET order/_search yields zero results.

Oh wait, I can also get results from the search query in Scenario 2 if I change the nested path from aggs.order.nested.path: order_items.product to aggs.order.nested.path: order_items:

GET order/_search
{
  "size": 0,
  "aggs": {
    "order": {
      "nested": {
        "path": "order_items"
      },
      "aggs": {
        "product_name" : {
          "terms": {
            "field": "order_items.product.name"
          },
          "aggs": {
            "avg_price": {
              "avg": {
                "field": "order_items.product.price"
              }
            }
          }
        }
      }
    }
  }
}

OK, so it doesn't matter where you place type: nested in the mapping, as long as you specify a corresponding nested.path in the search query? I'll open a second question about this...
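
What the two scenarios do show is that nested.path has to name a field that is itself mapped as nested. A minimal plain-Python sketch for checking that against a mapping follows. nested_paths is a hypothetical helper, not part of any Elasticsearch client, and the two mappings are the ones from Scenario 1 and Scenario 2, trimmed to the relevant fields.

```python
def nested_paths(mapping, prefix=""):
    """Return the dotted paths of all fields mapped with "type": "nested"."""
    paths = []
    for name, field in mapping.get("properties", {}).items():
        path = f"{prefix}{name}"
        if field.get("type") == "nested":
            paths.append(path)
        # Object and nested fields can both hold sub-fields, so always recurse.
        paths.extend(nested_paths(field, prefix=f"{path}."))
    return paths


product_fields = {"name": {"type": "keyword"}, "price": {"type": "long"}}

# Scenario 1: only order_items.product is nested.
scenario_1 = {
    "properties": {
        "order_items": {
            "properties": {
                "product": {"type": "nested", "properties": product_fields}
            }
        }
    }
}

# Scenario 2: only order_items is nested.
scenario_2 = {
    "properties": {
        "order_items": {
            "type": "nested",
            "properties": {"product": {"properties": product_fields}},
        }
    }
}

print(nested_paths(scenario_1))  # ['order_items.product']
print(nested_paths(scenario_2))  # ['order_items']
```

Under this reading, the path order_items.product only finds documents in Scenario 1, and the path order_items only finds documents in Scenario 2, which matches the doc_count values observed above.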
