I am experiencing frequent instance crashes when attempting to export default dashboards (PDF/PNG/CSV) from Kibana. I am currently using Elastic Cloud and do not have access to the underlying terminal or the physical kibana.yml file to adjust server settings.
This results in the instance crashing with a 502 Bad Gateway error from Nginx.
I would like to implement best practices to stabilize these exports. Specifically, I am looking for the correct way to apply the following via the Elastic Cloud Console's "User settings overrides":
Reporting Timeouts: Increasing xpack.reporting.queue.timeout to allow more time for complex dashboard rendering.
Memory Allocation: Since I cannot set NODE_OPTIONS via terminal, what is the recommended way to ensure Kibana has enough RAM for heavy X-Pack reporting tasks on Cloud?
CSV Size Limits: Safely adjusting xpack.reporting.csv.maxSizeBytes without destabilizing the node.
Other Reporting Settings: xpack.reporting.queue.timeout, xpack.reporting.kibanaServer.hostname, and whether setting xpack.reporting.thumbnails.enabled: false would help.
Could you provide an example of the YAML snippet I should paste into the Kibana user settings section of my deployment to fix this?
The documentation below should help you understand all the available settings:
What reason do the Kibana logs give for the crash? This could help us understand whether the available memory is insufficient and needs to be increased at the node level.
If the memory is not sufficient, I believe updating the parameters below might still not help, and the instance might crash anyway.
The memory available for Kibana can be checked via the ECE console, and you can monitor the memory size via Stack Monitoring:
From the documentation, the defaults are 4m (queue timeout) and 250mb (CSV max size); you will have to adjust these for your environment and performance needs:
xpack.reporting.queue.timeout: 8m (to increase the timeout to 8 minutes)
xpack.reporting.csv.maxSizeBytes: 524288000 (to increase the CSV limit to 500mb)
xpack.reporting.kibanaServer.hostname: optional, per the documentation
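Putting these together, a snippet for the Kibana user settings overrides in the Cloud console might look like the following. The 8m timeout and 500mb CSV limit are illustrative values from the discussion above, not recommendations; tune them to your environment:

```yaml
# Allow more time for complex dashboards to render before the
# reporting job times out (the documented default is 4m).
xpack.reporting.queue.timeout: 8m

# Raise the maximum CSV export size.
# 500 * 1024 * 1024 = 524288000 bytes (500mb).
xpack.reporting.csv.maxSizeBytes: 524288000

# Optional, per the documentation; uncomment only if needed.
# xpack.reporting.kibanaServer.hostname: <your-kibana-hostname>
```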
I no longer have access to the instance, but my prior investigation showed no internal logs regarding the incident. This suggests the root cause is likely either a heap/memory-allocation issue in the Kibana (Node.js) process or a misconfiguration within the Nginx proxy of the instance itself.
PS: Stack Monitoring shows no changes before the crash.