Discovery: very slow data visualisation


I am storing my application's log files in ES (5.2.0). The ingest rate is about 15k-30k rps depending on the time of day, which comes to about 1 billion log lines per day.

My ES cluster consists of seven 32-core machines, each with 6x 4TB SATA HDDs in RAID6.

I store logs in daily indexes, 7 shards for each index, each with 1 replica shard.

I ran my tests with all Filebeat instances stopped, so the cluster was idle and serving only my requests from Kibana.

When I go to the Discover tab in Kibana and choose the "Last 24 hours" time interval, it takes about 20-25 seconds to draw the histogram.
When I choose the "Last 7 days" time interval, it takes about 20-25 seconds to draw the histogram for each day, so it takes a few minutes to draw the histogram for the whole week, and I often get a "Discover: Request Timeout after 30000ms" error message.

On the servers I see no resource shortage that could slow things down: the ES java process consumes about 100-200% of CPU (these are 32-core machines), disk utilisation is about 20% according to iostat, and the network is almost idle.

Is this the expected performance for such a cluster setup and this amount of stored data?
It feels too slow to me, and I want to know whether I am missing some obvious tuning that could speed up such requests.

Thanks in advance.

Hey @John16, if you click the expand arrow, it'll open the "Spy Panel", which then lets you view the statistics for the underlying Elasticsearch queries.

If the Query duration accounts for most of the Request duration, the bottleneck is the Elasticsearch query itself. If that is the case, you'll probably get better responses if you post the same question in the Elasticsearch forums. To see the underlying Elasticsearch query, you can use the Request tab.
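To check the query time outside Kibana, you could send a similar request straight to Elasticsearch and read the `took` field of the response. The following is only a rough sketch: the body imitates the general shape of a Discover histogram request (a time range filter plus a `date_histogram` aggregation) and is not the exact request Kibana generates. The field name `@timestamp`, the 30-minute interval, the URL and the index pattern are all assumptions, so copy the real body from the Request tab where possible.

```python
import json
import urllib.request


def build_histogram_query(field="@timestamp", gte="now-24h", lte="now", interval="30m"):
    """Build a search body shaped like a histogram request (assumed field and interval)."""
    return {
        "size": 0,  # return only the aggregation, no individual hits
        "query": {"bool": {"filter": [{"range": {field: {"gte": gte, "lte": lte}}}]}},
        "aggs": {"per_bucket": {"date_histogram": {"field": field, "interval": interval}}},
    }


def run_query(base_url, index_pattern, body):
    """POST the body to the _search endpoint; return (took in ms, shard summary)."""
    req = urllib.request.Request(
        f"{base_url}/{index_pattern}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        result = json.load(resp)
    # "took" is the server-side execution time in milliseconds.
    return result["took"], result["_shards"]
```

For example, `run_query("http://localhost:9200", "logs-*", build_histogram_query())` (both arguments are placeholders) returns the server-side time. If `took` is close to what Kibana reports as Query Duration, the time is being spent inside Elasticsearch rather than in Kibana or the browser.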

Yes, it is all Query Duration.
I'll post this to the ES forum per your suggestion.


Have you gone through a shard-sizing exercise as described in this video in order to determine the ideal shard size for your use case? Each query/aggregation executes single-threaded against each shard. Several shards can be processed in parallel, and multiple queries/aggregations against a single shard can also be processed in parallel. Depending on what your dashboards look like and how much of the data they aggregate over, the shard size will have an impact on the minimum query latency.
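As a back-of-the-envelope illustration of why the cluster can look idle while a request is slow, here is a small sketch of the arithmetic. It assumes 7 primary shards per daily index (my reading of the original post, which may be wrong) and that a "Last 24 hours" request touches two daily indexes. The function names are made up for this example.

```python
def max_busy_threads(indices_touched, primary_shards_per_index, concurrent_requests=1):
    """Upper bound on search threads in use: one thread per shard per request.

    A search hits one copy of each shard (primary or replica), so replicas
    do not add to the count for a single request.
    """
    return indices_touched * primary_shards_per_index * concurrent_requests


def docs_per_primary_shard(docs_per_day, primary_shards_per_index):
    """Average number of documents one thread has to aggregate over per shard."""
    return docs_per_day / primary_shards_per_index


nodes, cores_per_node = 7, 32   # from the original post
shards_per_index = 7            # ASSUMPTION: 7 primary shards per daily index

busy = max_busy_threads(indices_touched=2, primary_shards_per_index=shards_per_index)
total_cores = nodes * cores_per_node
print(busy, total_cores)        # 14 224
print(round(docs_per_primary_shard(1_000_000_000, shards_per_index)))
```

Under these assumptions a single histogram request can keep at most 14 of 224 cores busy, roughly two per node, which would be in line with the 100-200% CPU reported above. Each of those threads works through about 140 million documents per shard, so shard size largely sets the floor for query latency.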
