Diagnosing slow dashboard/Viz performance in elastic cloud

johnwood · April 14, 2021, 10:57pm

Hi - I am looking for some pointers or starting places to diagnose our slow dashboards.

If I look at a visualization (they are all 'slow' to render in kibana) by going to inspect>request>statistics I see two timing numbers:

One (unlabeled) in green on the right-hand side that shows 12991ms (12 seconds) and the tooltip text says it is the total time the request took
The other is the query time of 983ms (1 second) with a tool tip saying it does not include time to send the request or parse in the browser.

I am keen to understand the difference between these so I can start to work out where and why we are spending these 11 seconds.

Interestingly, the bigger one does not match the (even longer) manual timings for the viz to render in the browser so I am guessing this number is not 'RUM' type measures and don't include user network and browser rendering?

A screenshot is attached.

Any guidance is much appreciated.

Thanks

John

warkolm · April 14, 2021, 11:00pm

Kibana is usually slow cause Elasticsearch is slow.

What is the output from the _cluster/stats?pretty&human API in Elasticsearch?
What version of the stack are you on?

wylie · April 14, 2021, 11:20pm

@johnwood I have been investigating performance issues in Kibana dashboards for a few days now. I agree with @warkolm that you are probably seeing latency from Elasticsearch, but dashboards are frequently very slow because of Kibana overhead.

One thing that could be happening is that when you load the dashboard it will trigger many sequential requests to ES, and then ES will queue your requests. This can explain the timing differences between the time it takes ES to run the query vs the overall timing.

Here is some generic advice:

Dashboards that have many visualizations will be slower
Large index patterns are slower (thousands of fields). If you aren't using all the fields, you should delete the mappings.
Some visualization types make requests to Kibana's server. If your Kibana instance has limited CPU or memory, it could slow down those visualizations.
If the machine you're viewing Kibana on is slower, it might have slower performance when rendering to your screen (there's a lot of JS)
Newer versions of Kibana are faster, we've reduced a lot of overhead here

I do think it's more likely that you're seeing ES bottlenecks, but if you think there are Kibana performance issues then I am interested in getting a dump of your entire dashboard from the Kibana API.

johnwood · April 16, 2021, 8:32am

Thanks.

We are on cloud and version 7.9.0

The output is

{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "f067d117ca114d09a25ea368fb74ae01",
  "cluster_uuid" : "1Gw9rCqASLOPD19jJ76quQ",
  "timestamp" : 1618561614840,
  "status" : "yellow",
  "indices" : {
    "count" : 564,
    "shards" : {
      "total" : 564,
      "primaries" : 564,
      "replication" : 0.0,
      "index" : {
        "shards" : {
          "min" : 1,
          "max" : 1,
          "avg" : 1.0
        },
        "primaries" : {
          "min" : 1,
          "max" : 1,
          "avg" : 1.0
        },
        "replication" : {
          "min" : 0.0,
          "max" : 0.0,
          "avg" : 0.0
        }
      }
    },
    "docs" : {
      "count" : 22193802,
      "deleted" : 1928079
    },
    "store" : {
      "size" : "30.5gb",
      "size_in_bytes" : 32843042120,
      "reserved" : "0b",
      "reserved_in_bytes" : 0
    },
    "fielddata" : {
      "memory_size" : "2mb",
      "memory_size_in_bytes" : 2197992,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size" : "22.8mb",
      "memory_size_in_bytes" : 23933985,
      "total_count" : 644306888,
      "hit_count" : 104247575,
      "miss_count" : 540059313,
      "cache_size" : 23749,
      "cache_count" : 152546,
      "evictions" : 128797
    },
    "completion" : {
      "size" : "0b",
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 3622,
      "memory" : "41.7mb",
      "memory_in_bytes" : 43829696,
      "terms_memory" : "26.6mb",
      "terms_memory_in_bytes" : 27975368,
      "stored_fields_memory" : "1.8mb",
      "stored_fields_memory_in_bytes" : 1935664,
      "term_vectors_memory" : "0b",
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory" : "2.4mb",
      "norms_memory_in_bytes" : 2544384,
      "points_memory" : "0b",
      "points_memory_in_bytes" : 0,
      "doc_values_memory" : "10.8mb",
      "doc_values_memory_in_bytes" : 11374280,
      "index_writer_memory" : "15mb",
      "index_writer_memory_in_bytes" : 15768578,
      "version_map_memory" : "393.3kb",
      "version_map_memory_in_bytes" : 402753,
      "fixed_bit_set" : "1.7mb",
      "fixed_bit_set_memory_in_bytes" : 1814928,
      "max_unsafe_auto_id_timestamp" : 1608113712346,
      "file_sizes" : { }
    },
    "mappings" : {
      "field_types" : [
        {
          "name" : "binary",
          "count" : 16,
          "index_count" : 3
        },
        {
          "name" : "boolean",
          "count" : 154,
          "index_count" : 27
        },
        {
          "name" : "byte",
          "count" : 106,
          "index_count" : 106
        },
        {
          "name" : "date",
          "count" : 1459,
          "index_count" : 560
        },
        {
          "name" : "flattened",
          "count" : 10,
          "index_count" : 2
        },
        {
          "name" : "float",
          "count" : 2002,
          "index_count" : 409
        },
        {
          "name" : "geo_point",
          "count" : 413,
          "index_count" : 413
        },
        {
          "name" : "geo_shape",
          "count" : 3,
          "index_count" : 3
        },
        {
          "name" : "half_float",
          "count" : 850,
          "index_count" : 419
        },
        {
          "name" : "integer",
          "count" : 177,
          "index_count" : 14
        },
        {
          "name" : "ip",
          "count" : 826,
          "index_count" : 413
        },
        {
          "name" : "keyword",
          "count" : 16726,
          "index_count" : 561
        },
        {
          "name" : "long",
          "count" : 11103,
          "index_count" : 549
        },
        {
          "name" : "nested",
          "count" : 52,
          "index_count" : 18
        },
        {
          "name" : "object",
          "count" : 1992,
          "index_count" : 560
        },
        {
          "name" : "short",
          "count" : 213,
          "index_count" : 107
        },
        {
          "name" : "text",
          "count" : 7392,
          "index_count" : 555
        }
      ]
    },
    "analysis" : {
      "char_filter_types" : [ ],
      "tokenizer_types" : [ ],
      "filter_types" : [
        {
          "name" : "pattern_capture",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "analyzer_types" : [
        {
          "name" : "custom",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "built_in_char_filters" : [ ],
      "built_in_tokenizers" : [
        {
          "name" : "uax_url_email",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "built_in_filters" : [
        {
          "name" : "lowercase",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "unique",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "built_in_analyzers" : [ ]
    }
  },
  "nodes" : {
    "count" : {
      "total" : 1,
      "coordinating_only" : 0,
      "data" : 1,
      "ingest" : 1,
      "master" : 1,
      "ml" : 0,
      "remote_cluster_client" : 1,
      "transform" : 1,
      "voting_only" : 0
    },
    "versions" : [
      "7.9.0"
    ],
    "os" : {
      "available_processors" : 32,
      "allocated_processors" : 3,
      "names" : [
        {
          "name" : "Linux",
          "count" : 1
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "CentOS Linux 7 (Core)",
          "count" : 1
        }
      ],
      "mem" : {
        "total" : "15gb",
        "total_in_bytes" : 16106127360,
        "free" : "895.3mb",
        "free_in_bytes" : 938823680,
        "used" : "14.1gb",
        "used_in_bytes" : 15167303680,
        "free_percent" : 6,
        "used_percent" : 94
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 2
      },
      "open_file_descriptors" : {
        "min" : 4580,
        "max" : 4580,
        "avg" : 4580
      }
    },
    "jvm" : {
      "max_uptime" : "120.9d",
      "max_uptime_in_millis" : 10448412477,
      "versions" : [
        {
          "version" : "14.0.1",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "14.0.1+7",
          "vm_vendor" : "AdoptOpenJDK",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 1
        }
      ],
      "mem" : {
        "heap_used" : "5.5gb",
        "heap_used_in_bytes" : 5970140648,
        "heap_max" : "7.5gb",
        "heap_max_in_bytes" : 8053063680
      },
      "threads" : 201
    },
    "fs" : {
      "total" : "480gb",
      "total_in_bytes" : 515396075520,
      "free" : "449gb",
      "free_in_bytes" : 482196094976,
      "available" : "449gb",
      "available_in_bytes" : 482196094976
    },
    "plugins" : [
      {
        "name" : "repository-s3",
        "version" : "7.9.0",
        "elasticsearch_version" : "7.9.0",
        "java_version" : "1.8",
        "description" : "The S3 repository plugin adds S3 repositories",
        "classname" : "org.elasticsearch.repositories.s3.S3RepositoryPlugin",
        "extended_plugins" : [ ],
        "has_native_controller" : false
      }
    ],
    "network_types" : {
      "transport_types" : {
        "security4" : 1
      },
      "http_types" : {
        "security4" : 1
      }
    },
    "discovery_types" : {
      "zen" : 1
    },
    "packaging_types" : [
      {
        "flavor" : "default",
        "type" : "docker",
        "count" : 1
      }
    ],
    "ingest" : {
      "number_of_pipelines" : 14,
      "processor_stats" : {
        "conditional" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time" : "0s",
          "time_in_millis" : 0
        },
        "geoip" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time" : "0s",
          "time_in_millis" : 0
        },
        "grok" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time" : "0s",
          "time_in_millis" : 0
        },
        "gsub" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time" : "0s",
          "time_in_millis" : 0
        },
        "remove" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time" : "0s",
          "time_in_millis" : 0
        },
        "rename" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time" : "0s",
          "time_in_millis" : 0
        },
        "script" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time" : "0s",
          "time_in_millis" : 0
        },
        "set" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time" : "0s",
          "time_in_millis" : 0
        }
      }
    }
  }
}

johnwood · April 16, 2021, 8:35am

Actually @wylie this output was from just one vizualization rather than the whole dashboard. is there anything more I can provide there to help you see where the thing is getting slowed down?

Any idea what the difference between the 'query time' and the time highlighted in green actually comes down to - that might give us a pointer as to the missing 11 seconds?

wylie · April 19, 2021, 2:57pm

Please upgrade to at least 7.9.1, or to the latest 7.12.0, because in 7.9.0 there was a major performance issue in rendering individual visualizations like the Line/Area/Bar charts: [Visualize] fix performance degradation after lodash@4 upgrade by mbondyra · Pull Request #75850 · elastic/kibana · GitHub

It looks like you're sharing the cluster stats for your entire cluster, and I can't identify major issues except that you are running on a single-node instance. Here is our general guide on tuning for search performance

johnwood · April 19, 2021, 11:58pm

Thanks @wylie - I have just done the upgrade to 7.12.0 as the other version is not an option in my cloud account.

The performance of the visualization has remained about the same. In fact that is unfair, the 'query time' number has dropped a little but the green box is still up at around 10-12 seconds and that correlates with my manual timings.

Meantime, does anyone on here know what the difference between the 'query time' and the time highlighted in green actually comes down to - as I previously mentioned, that might give me a pointer as to where to go looking for the extra 11 seconds in this Viz's load time?

Felix_Roessel · April 20, 2021, 4:05am

If you use Elastic Cloud why not opening a support ticket? The support has usually better possibilities to help you as we have here in the discuss area..

wylie · April 20, 2021, 2:41pm

@johnwood We have tooltips in the inspector which explain both timings. The query time is what Elasticsearch reports using took, which ignores any queing. The green box with timing is the total time from sending the request from Kibana to getting a response.

The most likely cause for the slow performance is that your ES single-node cluster is queuing search requests.

system · May 18, 2021, 2:42pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
How can I speed up Kibana aggregation? Kibana	10	5508	July 6, 2017
KIbana - Elasticsearch : Request time out after 3000ms Elasticsearch	32	11851	May 10, 2018
Kibana Dashboard performance Kibana	12	4173	October 22, 2017
Visualization in Dashboard Kibana	8	2185	July 6, 2017
Elasticsearch query slow response via kibana console Kibana docker	19	4319	March 25, 2020

Diagnosing slow dashboard/Viz performance in elastic cloud

Related topics