Note: The following is written assuming you are hitting OOM with the exact query you provided, and without any knowledge of your ES version, current heap size, number of nodes, number of shards per daily index, or the range of dates you are searching over. I have tried to show as much of my working as possible so you can adapt things if some or all of my assumptions turn out not to be true.
To explain a little of why this aggregation might be taking up so much memory, the following is a rough calculation of the memory the cardinality aggregation will require (note that the terms aggregation and date_histogram aggregation also have a memory footprint, but I am ignoring that for now):
Memory used by the cardinality aggregation for a single bucket: 8 * precision_threshold bytes = 8 * 30000 bytes = 240,000 bytes = 234.375 kB
The date_histogram aggregation is asking for D days of data and you are returning 5 versions for each day, so the memory required to calculate the cardinality across all the buckets is:
D * 5 * 234.375 kB ≈ D * 1.144MB.
Since your indices are daily, each shard only covers one day, and so only returns a single date histogram bucket (with its 5 version buckets) to the client node, i.e. roughly 1.144MB per shard. The client node coordinating the request has to hold the responses from all S shards of all D daily indices at once, plus the data structure the shard results are reduced into for the final result (the extra 1 below). So if you have S shards per daily index, the client node will need roughly:

(S + 1) * D * 1.144MB
So if S is 5 (the default) and D is 365, the cardinality aggregation will require roughly 6 * 365 * 1.144MB ≈ 2.5GB of memory across all the buckets. If you have a small heap size set on your nodes (e.g. 4GB) I could see this causing an OOM, especially if these queries are being run concurrently by different users.
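For reference, the numbers above assume an aggregation tree roughly like the one below (field names such as `@timestamp`, `version` and `user_id` are placeholders for whatever your query actually uses, and the exact syntax may differ depending on your ES version):

```json
POST /logs-*/_search
{
  "size": 0,
  "aggs": {
    "per_day": {
      "date_histogram": { "field": "@timestamp", "interval": "day" },
      "aggs": {
        "per_version": {
          "terms": { "field": "version", "size": 5 },
          "aggs": {
            "unique_users": {
              "cardinality": {
                "field": "user_id",
                "precision_threshold": 30000
              }
            }
          }
        }
      }
    }
  }
}
```

Each of the D daily buckets contains 5 version buckets, and each of those holds a HyperLogLog++ sketch of up to 8 * precision_threshold bytes, which is where the per-bucket figure above comes from.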
The terms aggregation does have a collect_mode option and an execution_hint option which are designed to reduce memory usage in particular situations, but unfortunately they will not help here, as they reduce memory usage during the shard collection phase rather than during the reduce phase.
The options I see are to do one (or more) of the following:
- Limit the number of buckets that can be produced by the request, e.g. limit your request to fewer days' worth of data.
- Limit the memory usage of the cardinality aggregation in each individual bucket by reducing the precision_threshold (see the sketch after this list). This will use less memory per bucket at the cost of some precision.
- Increase ES_HEAP_SIZE on your nodes to accommodate the memory pressure. Note that if you are already running nodes with 30GB heaps you will probably need to start more client nodes and spread the incoming requests among them.
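To illustrate the first two options together, something like the following (again with placeholder field names, and assuming you have a time field you can filter on) restricts the request to 30 days of data and lowers precision_threshold to 5000. With those values the reduce-phase estimate above drops to roughly 6 * 30 * 5 * 8 * 5000 bytes ≈ 34MB:

```json
POST /logs-*/_search
{
  "size": 0,
  "query": {
    "range": {
      "@timestamp": { "gte": "now-30d/d", "lt": "now/d" }
    }
  },
  "aggs": {
    "per_day": {
      "date_histogram": { "field": "@timestamp", "interval": "day" },
      "aggs": {
        "per_version": {
          "terms": { "field": "version", "size": 5 },
          "aggs": {
            "unique_users": {
              "cardinality": {
                "field": "user_id",
                "precision_threshold": 5000
              }
            }
          }
        }
      }
    }
  }
}
```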
Hope that helps