Monitoring Stats Explanation

I'm trying to understand the monitoring stats.

On a mostly unused instance of 6.2.2, I noticed that the number of requests in the last hour is over 3000 and the number of connections is over 7000. Again, this instance is in a test environment with no one but the small development team using it (maybe 3 people have used it in the last week). Neither the number of requests nor the number of connections ever goes down. I left the monitoring page open for over an hour with no refresh interval and a time range of 1 hour. I periodically refreshed the page and saw the numbers growing and never going down. When I change the time range to the last 1 minute, the numbers are the same as they are for the last 1 hour. The graphs for Client Requests and HTTP Connections don't line up with these numbers at all.

Did you perchance change the statistics on the monitoring page to be the cumulative value of requests and connections while also disregarding the time range on the monitoring page? I'm trying to figure out how these numbers can be so large for a mostly unused instance of Kibana.

Hi @brandtj,

From what I can tell, the queries here are looking at max aggregations, but some are using the derivative aggregation, which might be affecting this.
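
For readers unfamiliar with that combination, here is a rough sketch of what a max aggregation wrapped in a derivative aggregation can look like as an Elasticsearch 6.x request body. This is not the query the Monitoring app actually sends. The helper name `build_counter_rate_query`, the `timestamp` field, and the 30s interval are assumptions made for illustration; `kibana_stats.concurrent_connections` is the field mentioned later in this thread.

```python
import json


def build_counter_rate_query(field, interval="30s", start="now-1h", end="now"):
    """Build a request body that charts the per-bucket change of a counter.

    Hypothetical helper: the field names and interval are assumptions,
    not the Monitoring app's real query.
    """
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "query": {"range": {"timestamp": {"gte": start, "lte": end}}},
        "aggs": {
            "over_time": {
                # One bucket per interval across the selected time range
                "date_histogram": {"field": "timestamp", "interval": interval},
                "aggs": {
                    # Highest counter value seen inside each bucket
                    "counter_max": {"max": {"field": field}},
                    # Change in that max relative to the previous bucket
                    "counter_rate": {
                        "derivative": {"buckets_path": "counter_max"}
                    },
                },
            }
        },
    }


body = build_counter_rate_query("kibana_stats.concurrent_connections")
print(json.dumps(body, indent=2))
```

A panel that reads only `counter_max` would show the ever-growing total, while one that reads `counter_rate` would show activity per bucket.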

I'm going to ping a few folks who work on this more specifically to see if they have more insight.

@pickypg @tsullivan

If you're looking at the Kibana dashboard charts in the Monitoring app, then that doesn't sound normal. The charts are driven from data collected via Hapi ops events, which are ever-increasing counters, so they take the derivative of the counters to plot whether the numbers go up or down or stay the same.
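
To make the counter-versus-derivative distinction concrete, here is a minimal sketch in plain Python. The sample values are invented and `counter_derivative` is a hypothetical helper, not code from Kibana; it only illustrates why a cumulative counter keeps growing while its derivative tracks actual activity.

```python
from typing import List, Optional


def counter_derivative(samples: List[float]) -> List[Optional[float]]:
    """Return the change between consecutive samples of a counter.

    The first sample has no predecessor, so its delta is None.
    """
    if not samples:
        return []
    deltas: List[Optional[float]] = [None]
    for prev, curr in zip(samples, samples[1:]):
        deltas.append(curr - prev)
    return deltas


# Invented cumulative request counts, one sample per collection interval
cumulative_requests = [3700, 3710, 3710, 3745, 3800]

# What a chart based on the derivative would plot for each interval
per_bucket = counter_derivative(cumulative_requests)
print(per_bucket)  # [None, 10, 0, 35, 55]

# Requests that actually happened inside the window: last minus first sample
window_total = cumulative_requests[-1] - cumulative_requests[0]
print(window_total)  # 100
```

In this made-up data the latest raw counter value is 3800, but only 100 requests occurred inside the window, which is the kind of gap described in the original question.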

If you're looking at the raw data of and kibana_stats.concurrent_connections, those numbers are counters and only accumulate upwards, so what you are saying would be expected.

Would you be able to provide a screenshot of the request chart and connection charts that you see?

Hi again,

A team member reminded me that there was a bug with the way this data was collected and uploaded in 6.2.x. It has been fixed in 6.3.0 though. I will see about having the fix backported so it can be available in some version of 6.2.x.

Sorry for the confusion!!!

Sorry to bug you again but I just want some clarification on what is fixed.

This screenshot below is of the main monitoring page, which shows 3800 requests and 8700 connections in the last hour.

This screenshot is the charts for the same instance and same last hour that shows much fewer requests and connections.

This is the discrepancy I was referring to. Is this the same thing you are talking about? How specifically was this fixed? From what I can tell, nothing on the main monitoring page uses the selected time range. Thanks in advance.

I see what you mean now. The charts look fine (the bug I was thinking of was for the charts and it did get a fix for 6.2.x), but it looks like there is an issue with the Kibana stats shown in the Cluster Overview and it seems this issue was previously unknown. I'll file an issue in our internal repo and give it some attention.

