Hey! I have a performance issue with Elasticsearch on a shared K8s cluster. When there is no load on the cluster, performance is great, but for some reason, when activity on the cluster increases, my ES performance declines a lot.
I have 20 indices, ~1 GB of data, and two nodes (1000m CPU and 2Gi memory). Deployed with the Elastic Helm chart.
Health:
{
  "cluster_name" : "elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 90,
  "active_shards" : 180,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
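For context, here is a rough sketch of the arithmetic behind those numbers. It assumes the "~1 GB" figure is primary data only and that shards are spread evenly across the two nodes; neither is confirmed by the health output itself, and the variable names are just illustrative.

```python
# Values copied from the post and the cluster health output above.
total_data_mb = 1024           # "~1 GB of data" (assumed: primaries only)
indices = 20
nodes = 2
active_primary_shards = 90
active_shards = 180

# Average size of one primary shard, if data were spread evenly.
avg_primary_shard_mb = total_data_mb / active_primary_shards

# Shards each node hosts, assuming an even split across nodes.
shards_per_node = active_shards / nodes

# Average number of primaries per index, and replicas per primary.
primaries_per_index = active_primary_shards / indices
replicas_per_primary = (active_shards - active_primary_shards) / active_primary_shards

print(f"avg primary shard size: {avg_primary_shard_mb:.1f} MB")
print(f"shards per node:        {shards_per_node:.0f}")
print(f"primaries per index:    {primaries_per_index:.1f}")
print(f"replicas per primary:   {replicas_per_primary:.0f}")
```

So under these assumptions each node holds about 90 very small shards (roughly 11 MB per primary) inside a 2Gi memory allocation, which might be worth keeping in mind when reading the rest.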
I am not sure how to properly debug why the performance is declining. Could it be a networking bottleneck? Under load, the K8s cluster is using < 10% CPU and < 30% memory, although requested resources are ~90% and ~70% respectively.
Any tips on debugging performance? Thanks a lot.
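One check that could be sketched here, purely as an assumption about where to look: low cluster-wide CPU usage does not rule out a single container being throttled at its own CPU limit, because limits are enforced per container per scheduling period. The snippet below parses the cgroup `cpu.stat` counters (`nr_periods`, `nr_throttled`) and reports the fraction of periods that were throttled. The path `/sys/fs/cgroup/cpu.stat` assumes cgroup v2 (on cgroup v1 the file usually lives under `/sys/fs/cgroup/cpu/`), the helper names are hypothetical, and `SAMPLE` is made-up data used only when the file is not present.

```python
from pathlib import Path


def parse_cpu_stat(text: str) -> dict[str, int]:
    """Parse 'key value' lines from a cgroup cpu.stat file into a dict."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats: dict[str, int]) -> float:
    """Fraction of CFS periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0  # no limit configured, or counters not available
    return stats.get("nr_throttled", 0) / periods


# Hypothetical sample contents, NOT taken from the cluster in this post.
SAMPLE = """usage_usec 1000000
nr_periods 200
nr_throttled 50
throttled_usec 300000
"""

if __name__ == "__main__":
    path = Path("/sys/fs/cgroup/cpu.stat")  # assumed cgroup v2 location
    text = path.read_text() if path.exists() else SAMPLE
    ratio = throttled_ratio(parse_cpu_stat(text))
    print(f"throttled in {ratio:.0%} of periods")
```

Run inside the Elasticsearch container while the cluster is busy; a ratio well above zero would point at the CPU limit rather than the network, while a ratio near zero would suggest looking elsewhere.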