Heap usage issue

Hi!

I've got an issue with one visualization. There's a high-cardinality field (HCF) I need to check once in a while. My visualization (line chart) shows the top 10 instances by unique count. When I run it on the past hour, one of the nodes hits the heap memory limit (circuit breaker) and the shards on that node go unassigned.

None of the other instances flinch at all. When I do the same on another cluster, the "load" seems to be distributed. The two clusters are running an almost identical setup. The one with the issue described above is running on a newer env (OpenJDK 11). I can't put my finger on the problem here.
Everything else runs smoothly: CPU utilization is between 20-30%, heap is around 50%, and average load is around 1. No other visualizations, dashboards, or complex aggregations have a similar effect on the cluster.
Is there a way to prevent this behavior (10GB spike in heap)?

Specs:
ES version: 7.2.1
Cluster has 6 data nodes, each with:

  • 55GB RAM/25GB heap
  • 8 cores (2.3 GHz)

Index in question (at query time):

  • 100M+ documents
  • 10 shards (5 primary, 5 replica)
  • size 73 GB+
  • unique instances ~20K
  • HCF 500K+ unique values
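
For a sense of scale, here is a rough back-of-the-envelope sketch of how much memory the unique-count counters alone could need in the worst case. It relies on two documented properties of the Elasticsearch cardinality aggregation: `precision_threshold` defaults to 3000, and each counter uses roughly `precision_threshold * 8` bytes. The figure of 60 time buckets (one per minute over the last hour) is an assumption, since the interval that "auto" picks is not stated, and the function name is hypothetical. Counters start small and the cost is spread over shards, so this is an upper bound and not a measurement.

```python
def worst_case_cardinality_bytes(time_buckets, instances, precision_threshold=3000):
    """Upper bound on counter memory if every instance had a full counter in every time bucket."""
    # Documented rule of thumb: about precision_threshold * 8 bytes per counter.
    bytes_per_counter = precision_threshold * 8
    return time_buckets * instances * bytes_per_counter


if __name__ == "__main__":
    # Assumed: 60 one-minute buckets; ~20K instances as listed above.
    default_case = worst_case_cardinality_bytes(60, 20_000)
    low_precision = worst_case_cardinality_bytes(60, 20_000, precision_threshold=100)
    print(f"default threshold (3000): {default_case / 1e9:.1f} GB")
    print(f"threshold of 100:         {low_precision / 1e9:.2f} GB")
```

If the real numbers are anywhere near this bound, a multi-gigabyte heap spike would not be surprising, and lowering `precision_threshold` (at the cost of accuracy) would be one thing to try.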

Thank you,
YvorL

What is the configuration of the dashboard that is experiencing these problems?

I don't even need a dashboard. The line visualization is set to show the unique count of the field with a date histogram (auto), split series by instance name. When I set the time interval to 1h, it crashes. Since it's a production env, I have only tested this about three times in the past 3 days.

What is the configuration of that line visualisation?

Data

  • No filters
  • Metrics: Unique count of the high cardinality field
  • Buckets: Date histogram (auto interval) -> Split series by instance name (size:10)
  • Time interval: last 1 hour

Panel settings:

  • Legend position bottom
  • Grid show X-axis lines

I think everything else is default Kibana (7.2.1) setting.
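
To reproduce this outside Kibana, the visualization can be approximated as a plain search request. The sketch below builds a request body that probably resembles what Kibana sends, but it is not captured from Kibana. The field names (`@timestamp`, `instance.keyword`, `hcf`), the aggregation names, and the one-minute interval are all assumptions to be replaced with the real ones. The optional `precision_threshold` is a real parameter of the cardinality aggregation that trades accuracy for memory.

```python
import json


def build_unique_count_query(
    time_field="@timestamp",            # assumed name of the time field
    instance_field="instance.keyword",  # assumed name of the instance field
    hcf_field="hcf",                    # assumed name of the high-cardinality field
    interval="1m",                      # assumed; the "auto" interval is not known
    top_n=10,
    precision_threshold=None,
):
    """Build a search body: date histogram -> top-N terms -> unique count."""
    cardinality = {"field": hcf_field}
    if precision_threshold is not None:
        # Lower values use less memory per bucket but give less accurate counts.
        cardinality["precision_threshold"] = precision_threshold

    return {
        "size": 0,  # only aggregation results are needed, no hits
        "query": {"range": {time_field: {"gte": "now-1h", "lte": "now"}}},
        "aggs": {
            "over_time": {
                # fixed_interval is available from 7.2; older versions use "interval".
                "date_histogram": {"field": time_field, "fixed_interval": interval},
                "aggs": {
                    "top_instances": {
                        "terms": {
                            "field": instance_field,
                            "size": top_n,
                            # Order instances by their unique count, highest first.
                            "order": {"unique_hcf": "desc"},
                        },
                        "aggs": {"unique_hcf": {"cardinality": cardinality}},
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_unique_count_query(precision_threshold=100), indent=2))
```

Running the printed body against a test copy of the index, with and without `precision_threshold`, would show whether the cardinality sub-aggregation is what drives the heap usage.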

@Christian_Dahlqvist

If that wasn't what you meant by configuration, please forgive me. As you see I'm trying to provide as much information as I can.

I'm sorry if I seem impatient, but @Christian_Dahlqvist responded on a Sunday within an hour and there has been nothing since. Is there anyone who can help me debug this issue? I do need a solution, and currently it only looks as if someone were dealing with the question, while in reality nothing has happened.

Unfortunately, I do not have time to look into this at the moment. Note that this forum is manned by volunteers and there are no SLAs or guarantees of a response or resolution.

Thank you, I do appreciate your (as in plural) time!