Aggregate data per document

JohnJoe · May 31, 2023, 1:50pm

Hi All,

I am wondering if the following is possible. I want to be able aggregate nested data within a document and then filter by the aggregated data.

So if we have

PUT warehouse/
{
  "mappings": {
  "properties": {
    "inventory": {
      "type": "nested",
      "properties": {
        "equipment": {
          "type": "keyword"
        },
        "price": {
          "type": "float"
        },
        "shopId": {
          "type": "keyword"
        }
      }
    },
    "profile": {
      "properties": {
        "name": {
          "type": "keyword"
        }
      }
    }
  }
  }
}

and then put data

PUT warehouse/_doc/1
{
  "profile": {
    "name": "Place1"
  },
  "inventory": [
    {"equipment":"guitar", "price": 1000.00, "shopId":"1"},
    {"equipment":"guitar", "price": 200.00, "shopId":"2"},
    {"equipment":"guitar", "price": 1.0, "shopId":"4"}
  ]
}

etc

I need to do filter by shopIds, say shopId 1 and shopId 2. Then aggregate that data per document, so for document above the average guitar price for shopId1 and shopId2 is 150.

I then want to only return documents and values that meed a criteria, so I say shopId1 + shopId2 AND average price >130.

I am able to get some aggregation working but it is aggregation across all the documents returned, not per document.

Hoping someone can help me!

JohnJoe

JohnJoe · May 31, 2023, 2:10pm

The following gives me the average price across all documents that match the search

GET warehouse/_search
{
  "query": {
    "nested": {
      "path": "inventory",
      "query": {
        "bool": {
              "should": [
                {
                  "term": {
                    "inventory.shopId": "1"
                  }
                },
                {
                  "term": {
                    "inventory.shopId": "2"
                  }
                }
              ]
            }
      }
    }
  },
  "aggs": {
    "inventory": {
      "nested": {
        "path": "inventory"
      },
      "aggs": {
        "priceAgg": {
          "filter": {
            "bool": {
              "should": [
                {
                  "term": {
                    "inventory.shopId": "1"
                  }
                },
                {
                  "term": {
                    "inventory.shopId": "2"
                  }
                }
              ]
            }
          },
          "aggs": {
            "avg_price": {
              "avg": {
                "field": "inventory.price"
              }
            }
          }
        }
      }
    }
  }
}

Result:

"aggregations" : {
    "inventory" : {
      "doc_count" : 9,
      "priceAgg" : {
        "doc_count" : 3,
        "avg_price" : {
          "value" : 2000.0
        }
      }
    }

But what I need is the price for the criteria per document

eMitch · June 13, 2023, 7:18pm

Hey @JohnJoe and welcome to the community!

In a nested structure, each child entry is its own document. In this way, it makes doing nested aggregations a little interesting.

I worry about the use of nested in this fashion if these lists are going to grow/change. Have you considered changing the index mapping to be "item" centric instead of the current "profile" centric?

This would give you an individual document for each inventory item and significantly more flexible queries for what you're looking to accomplish.

While it may seem more voluminous, under the hood, the number of documents will be similar, yet easier to manage and it will scale better as you grow the index.

system · July 11, 2023, 7:18pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Aggregate across the document from Nested Documents Elasticsearch	1	316	March 26, 2020
How to aggregate on filtered nested documents Elasticsearch	3	679	July 5, 2017
How to get average in nested document Elasticsearch	1	350	May 18, 2020
How to filter nested objects in nested aggregation Elasticsearch	3	5355	March 9, 2022
Aggregation on nested type Elasticsearch	1	302	July 6, 2017

Aggregate data per document

Related topics