Can I add document data to an aggregated group?

I'm totally new to Elasticsearch, and I'm logging our product views so we can serve a "popular products" list to the client side.

Our logic is as follows:

  1. A user clicks a product
  2. The client requests the product data from the server
  3. While serving the product data to the client, the server indexes the product data and a timestamp into Elasticsearch
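For context, a log document indexed in step 3 looks roughly like this (a simplified example; the field names and values match the samples later in this post):

```json
# POST /product_log/_doc
{
  "product": {
    "ID": "623b26686e2eb9b0c91b3574",
    "price": 500,
    "sku": 4,
    "name": "good product"
  },
  "ts": "2022-04-29 13:40:00"
}
```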

To serve the "popular products" list, I use a terms aggregation to calculate doc counts.
The query is below:

# POST /product_log/_search
# Request

{
   "size": 0,
   "query": {
      "range": {
         "ts": {
            "gt": "2020-01-01 00:00:00",
            "lt": "2022-01-01 00:00:00"
         }
      }
   },
   "aggs": {
      "group_by_state": {
         "terms": {
            "field": "product.ID.keyword",
            "size": 20
         }
      }
   }
}

It returns data like the following:

# POST /product_log/_search
# Response

{
  "took": 20,
  "timed_out": false,
  ....
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 19773,
      "buckets": [
        {
          "key": "623b26686e2eb9b0c91b3574",
          "doc_count": 338
        },
        {
          "key": "625e64326ae269dfd72012cc",
          "doc_count": 331
        }
        ... 20 buckets in total
      ]
    }
  }
}

From this response we can see that product ID '623b26686e2eb9b0c91b3574' is the most popular product, '625e64326ae269dfd72012cc' is second, and so on.

But to serve the "product data list" for these popular products, we eventually have to query our database again, like this:

# pseudo code

# first query: Elasticsearch
response = es.product.getPopularProducts()

productIdList = getProductIdListFromBucket(response.aggregations.buckets)

# second query: database
popularProducts = db.product.find({ productId: { $in: productIdList } })

Because this logic requires round trips to two external servers (one for Elasticsearch, one for the database), it feels a bit inefficient.

So I'd like to find a way to include the "document data" in the response of Elasticsearch's aggregation query.

Something like this:

# POST /product_log/_search
# Response

{
  "took": 20,
  "timed_out": false,
  ....
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 19773,
      "buckets": [
        {
          "key": "623b26686e2eb9b0c91b3574",
          "doc_count": 338,
          "product_document": {
            "product": {
              "ID": "623b26686e2eb9b0c91b3574",
              "price": 500,
              "sku": 4,
              "name": "good product"
            },
            "timestamp": "2022-04-29 13:40:00"
          }
        },
        {
          "key": "625e64326ae269dfd72012cc",
          "doc_count": 331,
          "product_document": {
            "product": {
              "ID": "625e64326ae269dfd72012cc",
              "price": 400,
              "sku": 20,
              "name": "not bad product"
            },
            "timestamp": "2022-04-29 13:40:00"
          }
        }
        ... 20 buckets in total
      ]
    }
  }
}

Of course, since every one of these log documents contains the product detail data, each bucket really corresponds to a list of "product documents", all grouped by product.ID.
All documents within one bucket are essentially identical, so I only need the single latest product detail document.

I think MongoDB supports this kind of query with $$ROOT, like below.
It adds the last document of each group sharing the same productId to the aggregation group.

# pseudo query body I'd like to use
  {
     "size": 0,
     "query": {
        "range": {
           "ts": {
              "gt": "2020-01-01 00:00:00",
              "lt": "2022-01-01 00:00:00"
           }
        }
     },
     "aggs": {
        "group_by_state": {
           "terms": {
              "field": "product.ID.keyword",
              "size": 20
           },
           "product_document": { "$last": "$$ROOT" }
        }
     }
  }

Does Elasticsearch also support this kind of query?


I'm using

Please help me!!

Check out the top hits aggregation. See Top hits aggregation | Elasticsearch Guide [8.1] | Elastic
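A sketch of how that could look combined with the query from the question — a `top_hits` sub-aggregation with `"size": 1`, sorted by `ts` descending, returns the latest source document inside each bucket (the field names `ts` and `product.*` are taken from the question; adjust them to your mapping):

```json
# POST /product_log/_search
{
   "size": 0,
   "query": {
      "range": {
         "ts": {
            "gt": "2020-01-01 00:00:00",
            "lt": "2022-01-01 00:00:00"
         }
      }
   },
   "aggs": {
      "group_by_state": {
         "terms": {
            "field": "product.ID.keyword",
            "size": 20
         },
         "aggs": {
            "product_document": {
               "top_hits": {
                  "size": 1,
                  "sort": [
                     { "ts": { "order": "desc" } }
                  ]
               }
            }
         }
      }
   }
}
```

Each bucket in the response then carries the latest matching document under `product_document.hits.hits[0]._source`, so no second database query should be needed.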

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.