Run a long-running search

hi guys,

I am struggling with a simple task. I have netflow indices that span a month and are all on the hot cluster. Together they add up to about 200 GB and around 700 million log lines. I have a few dashboards, and I have already raised the Kibana timeout to 90 seconds so that bigger aggregations and dashboard views over a longer period have time to finish.
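For reference, the timeout change was roughly this in kibana.yml (I believe elasticsearch.requestTimeout is the relevant setting; the value is in milliseconds):

```
# kibana.yml - how long Kibana waits for responses from Elasticsearch, in ms
elasticsearch.requestTimeout: 90000
```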

I have Elasticsearch with the basic license.

What I want to do: open a dashboard, set the timeframe to a month, and just let it run for as long as it takes. I know that Elasticsearch will be at 100% load on all nodes in the cluster during that time and that other searches will not be possible.
Is there any way for me to achieve this? Or do I need at least a Gold license so I can generate PDF reports that would do it that way?

Thanks in advance
Philipp

Given that PDF reporting still needs to run the same queries, it is likely to time out as well, so I do not think that is a solution.

Have you identified why your dashboards are so slow? How many indices and shards do they target? How many visualisations do you have per dashboard? What is limiting performance? CPU? Disk I/O?
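For example, something along these lines run from Kibana Dev Tools can show where the pressure is while a dashboard is loading:

```
# Node-level view of CPU, load average and heap while the dashboard runs
GET _cat/nodes?v&h=name,cpu,load_1m,heap.percent

# What the busiest threads are actually doing
GET _nodes/hot_threads
```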

Hmm, that kinda sucks.

Yeah sure, a lot of aggregations are happening, and I can verify that all my nodes are at 100% CPU load while the search is running. So I guess it is simply a lack of CPU resources, and I will just have to ask for more performance.

Check disk utilisation as well, as that is often the bottleneck.

Disk and network are not a bottleneck, just CPU resources. We have all-flash storage that is directly attached, so no SMB/NFS or similar shares, and the network is at least 10 Gbit/s.

Would it be possible to leverage Canvas for this? @Christian_Dahlqvist

If Canvas needs to run the same aggregations, I do not see how it would help, given that Elasticsearch seems to be the bottleneck. If you have a lot of small shards, you can try consolidating them and see if that helps.

My indices are based on the elastiflow project (https://github.com/robcowart/elastiflow), so each daily index is around 19 GB and 43 million documents.
Segment count is 86 for primaries and 163 in total, and each index has 6 shards spread across every node. So I do not know how I could further improve search performance.
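In case anyone wants to check the same thing, something along these lines shows the shard and segment breakdown (the index pattern is just an example of how my daily indices are named):

```
# Per-shard document counts and on-disk size
GET _cat/shards/elastiflow-*?v&h=index,shard,prirep,docs,store

# Segment counts per shard
GET _cat/segments/elastiflow-*?v
```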

While searching, my Elasticsearch nodes are at 100% CPU and JVM memory sits somewhere around 40-60%; there is no swapping on the servers.

4 cores each @ 2.5 GHz
8 GB RAM

With 6 shards per index that is only a few GB per shard, which is quite small. I would recommend trying to increase the average shard size, either by reducing the number of primary shards or by switching to rollover so you can aim for a target size with a flexible time period covered. I would also try force-merging older indices that are no longer written to down to a single segment. In the end it may, however, be that you simply need more CPU to power your dashboards.
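To sketch what I mean (the index and node names here are made up, so adjust them to your own; also note that shrinking requires the index to be read-only with a copy of every shard on one node first):

```
# Force-merge an index that is no longer written to down to a single segment
POST elastiflow-2019.06.01/_forcemerge?max_num_segments=1

# Shrinking a 6-shard index to 1 primary shard:
# 1) make it read-only and relocate a copy of every shard onto one node
PUT elastiflow-2019.06.01/_settings
{
  "index.routing.allocation.require._name": "node-1",
  "index.blocks.write": true
}

# 2) shrink into a new single-shard index and clear the temporary settings
POST elastiflow-2019.06.01/_shrink/elastiflow-2019.06.01-shrunk
{
  "settings": {
    "index.number_of_shards": 1,
    "index.routing.allocation.require._name": null,
    "index.blocks.write": null
  }
}
```

For new data, an ILM rollover policy with a size-based condition would achieve the same thing going forward.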

Altering the structure of dashboards to make them lighter is also an option.
