Aggregating an index with parent-child runs forever

Hi,
I've recently decided to try reindexing an existing denormalized index into a new index with a parent-child relation.
I have around 14M parent docs, and each parent has up to 400 children (around 270M docs in total).
This is a simplified version of my mapping:

{
  "mappings": {
    "_doc": {
      "properties": {
        "product_type": {
          "type": "keyword"
        },
        "relation_type": {
          "type": "join",
          "eager_global_ordinals": true,
          "relations": {
            "product_data": [
              "kpi",
              "customer"
            ]
          }
        },
        "rootdomain": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "rootdomain_sku": {
          "type": "keyword",
          "eager_global_ordinals": true
        },
        "sales_1d": {
          "type": "float"
        },
        "sku": {
          "type": "keyword",
          "eager_global_ordinals": true
        },
        "timestamp": {
          "type": "date",
          "format": "strict_date_optional_time_nanos"
        }
      }
    }
  }
}
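
For reference, this is how documents would be indexed against this mapping (a sketch; the index name product_index and the document IDs are made up). Note that child documents must be indexed with a routing value pointing at their parent so that both land on the same shard:

PUT product_index/_doc/p1
{
  "relation_type": "product_data",
  "product_type": "Rugs",
  "rootdomain": "some_domain",
  "rootdomain_sku": "some_domain_sku1",
  "sku": "sku1"
}

PUT product_index/_doc/k1?routing=p1
{
  "relation_type": {
    "name": "kpi",
    "parent": "p1"
  },
  "timestamp": "2022-06-01T00:00:00.000000000Z",
  "sales_1d": 12.5
}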


As you can see, I've set eager_global_ordinals on the join field to speed up search performance
(as I understand it, this builds the global ordinals for the join field at refresh/indexing time rather than lazily at query time).
This migration reduced my index size from around 500GB to just 40GB,
which is a huge benefit for my use case since I update a lot of data daily.
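
If it's useful, the heap used by the join field's global ordinals can be checked through the fielddata stats (product_index is an assumed index name):

GET product_index/_stats/fielddata?human&fields=relation_type*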

My current testing environment uses a single node, and the index has only 1 primary shard.
The following aggregation seems to run forever:

{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "rootdomain.keyword": "some_domain"
          }
        },
        {
          "term": {
            "product_type": "Rugs"
          }
        }
      ]
    }
  },
  "aggs": {
    "skus_sales": {
      "terms": {
        "field": "rootdomain_sku",
        "size": 10
      },
      "aggs": {
        "sales1": {
          "children": {
            "type": "kpi"
          },
          "aggs": {
            "sales2": {
              "filter": {
                "range": {
                  "timestamp": {
                    "format": "basic_date_time_no_millis",
                    "gte": "20220601T000000Z",
                    "lte": "20220605T235959Z"
                  }
                }
              },
              "aggs": {
                "sales3": {
                  "sum": {
                    "field": "sales_1d"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

I understand the trade-offs of parent-child relations, but it seems like I'm doing something wrong.
I would expect to get some result eventually, even after 15 minutes, but it seems to run forever.
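
One way to narrow this down (a debugging sketch, not a fix): run the outer terms aggregation on its own, with "size": 0 so no hits are fetched. If the request below comes back quickly, the children join is the expensive part:

{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "rootdomain.keyword": "some_domain" } },
        { "term": { "product_type": "Rugs" } }
      ]
    }
  },
  "aggs": {
    "skus_sales": {
      "terms": {
        "field": "rootdomain_sku",
        "size": 10
      }
    }
  }
}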

I would love to get some help here,
Thanks.

Have a look at the hot threads API, which might give some clues about what is going on. Be aware that each query executes in a single thread against each shard, so increasing the number of primary shards lets more of the work run in parallel. It is possible that this single shard is your bottleneck. I would split the index into e.g. 4 primary shards and see how that affects query latency.
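
For reference, those steps could look something like this (a sketch; the index and target names are assumptions, and the split API requires the source index to be made read-only first):

GET _nodes/hot_threads

PUT product_index/_settings
{
  "index.blocks.write": true
}

POST product_index/_split/product_index_4
{
  "settings": {
    "index.number_of_shards": 4
  }
}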

Thanks, that might be the issue. Response time is dramatically high, so I'm trying to migrate this index to use nested docs instead of the parent-child relation.
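
For anyone reading later, a minimal sketch of what that nested mapping could look like (fields taken from the mapping above, slightly simplified; the kpis field name is made up):

{
  "mappings": {
    "_doc": {
      "properties": {
        "product_type": { "type": "keyword" },
        "rootdomain": { "type": "keyword" },
        "rootdomain_sku": { "type": "keyword" },
        "sku": { "type": "keyword" },
        "kpis": {
          "type": "nested",
          "properties": {
            "timestamp": {
              "type": "date",
              "format": "strict_date_optional_time_nanos"
            },
            "sales_1d": { "type": "float" }
          }
        }
      }
    }
  }
}

The aggregation would then use a nested aggregation instead of children, at the cost that updating any child rewrites the whole parent document, including all of its nested rows.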

Often the best way when working with Elasticsearch is to denormalise and fully flatten the structure, so I would recommend looking into that as an option. Nested documents also come with performance implications of their own, as updates or changes can get very expensive.
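
In this dataset that would mean one document per parent/kpi pair, with the parent fields copied onto each row, e.g. (values made up):

{
  "product_type": "Rugs",
  "rootdomain": "some_domain",
  "rootdomain_sku": "some_domain_sku1",
  "sku": "sku1",
  "timestamp": "2022-06-01T00:00:00.000000000Z",
  "sales_1d": 12.5
}

The aggregation then needs no join at all: the same terms/filter/sum pipeline works directly on the flat documents, which is presumably what the original denormalised index allowed.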

