First post here - hoping to get some assistance with this issue, or to be told whether this is normal behavior.
Background:
- We have a cluster of 3 nodes with 32 GB of RAM each, running on VMs / Docker. Each node has 16 GB allocated to the Java heap.
- Our ingest is about 10 GB a day into a daily index with three primary shards and 1 replica.
- Indexing seems fine, with a rate of about 500/s across all shards and a latency of around 0.7 ms.
- We have one dashboard with 18 visualisations (9 bar graphs, 9 tables) on it.
- When viewing the dashboard, the search rate and search latency can't be seen because Kibana stops responding (I know we should monitor from a separate cluster for exactly this reason).
- Rough numbers during report generation: client response times of around 25-30 s, HTTP connections around 80, client requests around 175.
Issue:
The dashboard takes about 1 minute to load fully, and there are only 10 days of data (10 x 10 GB of logs) to aggregate. We want this to be a monthly report, so if it scales linearly to about 3 minutes for 30 days of data (300 GB), that seems too long.
Observations:
When looking at Stack Monitoring, the system load on each node goes to about 4-6 during report creation.
When looking at the network waterfall for the data requests, the stalled time increases from the first visualisation to the last, i.e. the first visualisation is stalled for 9 seconds and the last for 48 seconds. The TTFB is about 15 s across the board:
Queued at: 384.48 ms
Started at: 385.46 ms
Resource Scheduling - Queueing: 0.98 ms
Connection Start - Stalled: 47.49 s
Request/Response - Request sent: 0.17 ms
Request/Response - Waiting (TTFB): 16.41 s
Request/Response - Content Download:
Firstly, is this stalled state normal behavior?
Secondly, if not, how do I determine what's causing the stalled state?
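In case it helps narrow things down, this is roughly what I can capture on the cluster side while the dashboard is loading - polling the search thread pools to see whether the requests are queueing inside Elasticsearch or held up before they ever reach it. It's only a rough sketch; the endpoint and credentials are placeholders for our setup:

```python
import time
import requests

# Placeholder endpoint/credentials for our cluster - adjust as needed.
ES = "http://localhost:9200"
AUTH = ("elastic", "changeme")

# Poll the search thread pools every 2 seconds while the dashboard loads,
# printing active threads, queued requests and rejections per node.
for _ in range(30):
    resp = requests.get(
        f"{ES}/_cat/thread_pool/search",
        params={"format": "json", "h": "node_name,active,queue,rejected"},
        auth=AUTH,
    )
    resp.raise_for_status()
    for row in resp.json():
        print(f"{row['node_name']}: active={row['active']} "
              f"queue={row['queue']} rejected={row['rejected']}")
    print("-" * 40)
    time.sleep(2)
```

My thinking is that if the search queues stay near zero for the whole time the waterfall shows the requests as stalled, the hold-up is probably on the browser/Kibana side rather than in Elasticsearch - does that sound like a reasonable way to narrow it down?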
Cheers
Rob.