We have a Kibana dashboard with ~35 visualizations, ~10 of which use pipeline aggregations.
Kibana and Elasticsearch are installed on a machine with 16 CPU cores and 64 GB of memory (we set the Elasticsearch heap to 32 GB).
We have about 5 million documents (~2 GB).
But it takes too long to load the dashboard: ~80-90 seconds.
What can we do?
Are 35 visualizations, including 10 with pipeline aggregations, too much for one dashboard?
Are the visualizations loaded in parallel, or only one after another?
We noticed that for the pipeline aggregation visualizations there is a big difference between the query time and the response time. We suspect this happens because the pipeline aggregation returns all of the intermediate buckets in the response even though we don't actually need them in the result - we want only the final numbers - and parsing the JSON with all the buckets takes a lot of time. We tried running the query in Dev Tools with filter_path to exclude the buckets, and the response was much faster. Is there a way to do that in a visualization?
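For context, this is roughly what we tried in Dev Tools; the index and field names here are illustrative, not our real ones. The filter_path parameter keeps only the final pipeline value and drops the per-user buckets from the response:

```
GET my-index/_search?filter_path=aggregations.avg_per_user.value
{
  "size": 0,
  "aggs": {
    "per_user": {
      "terms": { "field": "userid", "size": 10000 },
      "aggs": {
        "total": { "sum": { "field": "amount" } }
      }
    },
    "avg_per_user": {
      "avg_bucket": { "buckets_path": "per_user>total" }
    }
  }
}
```

With filter_path the response body shrinks from thousands of user buckets to a single value.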
Kibana sends all queries related to visualisations in a dashboard in a single _msearch request, which executes in parallel. What do CPU utilisation, disk I/O and iowait look like on the Elasticsearch node while you are querying? How many shards are you addressing with the query?
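For reference, an _msearch request is a newline-delimited body of header/body pairs, one pair per visualisation, which Elasticsearch then executes in parallel (index and field names here are illustrative):

```
GET _msearch
{"index": "my-index"}
{"size": 0, "aggs": {"per_user": {"terms": {"field": "userid"}}}}
{"index": "my-index"}
{"size": 0, "query": {"match_all": {}}}
```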
Then it seems it is largely CPU limited. You do have very small shards, so it may help to reduce the number of primary shards per index to 1, e.g. using the shrink index API, but it may also be that you simply need more CPU to support all the processing you are doing. 35 visualisations on a single dashboard is quite a lot though, and I would guess it is very busy. Would it perhaps make sense to break it up somehow?
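A sketch of what shrinking to one primary shard could look like; the index and node names are placeholders. The source index must first be made read-only with a copy of every shard on one node, then the shrink can run:

```
PUT my-index/_settings
{
  "index.number_of_replicas": 0,
  "index.routing.allocation.require._name": "shrink-node",
  "index.blocks.write": true
}

POST my-index/_shrink/my-index-single-shard
{
  "settings": {
    "index.number_of_shards": 1
  }
}
```

After verifying the shrunken index, you would point the relevant index pattern (or an alias) at it.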
I would also look into disk I/O and iowait, as it could potentially be caused by slow storage.
What about the buckets returned in the response? We are doing a sum of a field per bucket of userid - and there are many users - and then an average bucket, so we need only the final number. But the response returns all the user buckets too, and we think this takes most of the time: when running only one visualization we see the query time is only 20-25% of the total time.
We worked around the problem by using a proxy between Kibana and Elasticsearch, which adds the "filter_path" parameter to "_msearch" requests to filter the bucket information out of the response.
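The core of such a proxy is a URL rewrite. A minimal sketch of that rewriting step in Python, assuming the proxy can intercept the request URL before forwarding it (the function name and the exact filter_path value are illustrative, not our production code):

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def add_filter_path(url, filter_path="-responses.aggregations.*.buckets"):
    """If the request targets _msearch, append a filter_path query
    parameter that strips aggregation buckets from the response.
    Other requests are forwarded unchanged."""
    parts = urlsplit(url)
    if not parts.path.endswith("/_msearch"):
        return url
    query = dict(parse_qsl(parts.query))
    # Do not override a filter_path the client already set.
    query.setdefault("filter_path", filter_path)
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The negated pattern `-responses.aggregations.*.buckets` removes bucket arrays from every response in the multi-search, while the pipeline aggregation results (the final numbers) still come through.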