Kibana 5.2 + X-Pack error when generating reports

@kioie are you actively indexing documents while generating the report via Metricbeat/Filebeat/Logstash/something-else and how large are your indexes?

@Brandon_Kobel sure, let me try, will let you know

@Brandon_Kobel yes I am, I have one main index and it has about 1 million documents

If you're still having issues after upgrading to 5.2.0, it'd be helpful to determine whether the report is simply taking an extraordinarily long time to generate because we're actively indexing so many documents and trying to generate a report on the same VM. If you could temporarily stop the indexing and then try to generate the report, that'll tell us whether resource contention is causing the issue or whether reporting itself is having trouble running.
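
If the shippers are managed by systemd, pausing them for the test could look roughly like this (the service names below are only examples; use whatever you actually run):

    # Temporarily stop whatever is indexing into the cluster
    sudo systemctl stop metricbeat filebeat logstash
    # ...generate the report from Kibana, then resume indexing
    sudo systemctl start metricbeat filebeat logstash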

I do have a dev cluster running 5.2.0, and I am seeing the same error there. The report that we are trying to generate has 40 lines of entries, and the visualization takes less than a second to load.

I am still seeing the Kibana took too long (30000) error! I do see that a newer version, 5.2.1, has now been released. Has anything changed in this version in terms of reporting?

@matt 5.2.1 doesn't include any reporting changes that would affect this.

Would you mind describing your dev cluster's configuration? Specifically, which servers are running which components of the ELK stack, plus any details regarding your network topology.

I've got the full stack: 3 data + ingest servers and 3 dedicated masters.
RHEL 7.2 on all boxes,
Java 1.8
X-Pack is configured for all products (Security, Watcher, Reporting & Monitoring)

I have a feeling it might be a bug. Unless someone else has run 5.2.0 with reports and no issues.

@matt Are you running Kibana on the servers that run ES, or is it also on a dedicated node? At this point, we've seen this error in two situations:

  1. the Kibana server settings are incorrect or not configured: https://www.elastic.co/guide/en/x-pack/current/reporting-settings.html#reporting-kibana-server-settings
  2. the server that is running Kibana is resource constrained.
@Brandon_Kobel

  1. I am running Kibana on the same server as Elasticsearch.

    server.host: "es.fqdn.com"
    elasticsearch.url: "http://es.fqdn.com:9200"
    elasticsearch.preserveHost: true
    kibana.index: ".kibana"
    kibana.defaultAppId: "discover"
    elasticsearch.username: esuser
    elasticsearch.password: espasswd
    server.ssl.key: /opt/ssl/es.key
    server.ssl.cert: /opt/ssl/es.cer
    elasticsearch.requestTimeout: 600000
    elasticsearch.shardTimeout: 0
    logging.verbose: true
    xpack.reporting.queue.timeout: 500000
    xpack.reporting.kibanaServer.protocol: "https"
    xpack.reporting.kibanaServer.hostname: "es.fqdn.com"

  2. It is not resource constrained. These are beefy boxes with plenty of CPU cores (32) and lots of RAM (512GB), so I'd be surprised if they were suffering on that front. We do only have 31GB allocated for the ES heap, though.

@matt If you set logging.events.log: [ 'reporting' ] in your kibana.yml, you should have more detailed logs regarding reporting in your Kibana logs.
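
If Kibana is writing to a log file, you can pull out just the reporting entries and drop the routine 200 responses with something like the following (the log path is only an example; point it at wherever your install actually writes its logs):

    # Show reporting-tagged log lines, hiding the successful 200 responses
    grep '"reporting"' /var/log/kibana/kibana.stdout | grep -v '"statusCode":200'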

You should see a message like below that has the full URL that reporting is using to generate the report

server    log   [14:02:49.251] [debug][reporting] fetching screenshot of https://localhost:5601/rjo/app/kibana#/visualize/edit/Apache-HTTPD-Load1-slash-5-slash-15?_g=()&_a=(filters:!(),linked:!t,query:(query_string:(query:'*')),uiState:(),vis:(aggs:!((enabled:!t,id:'1',params:(customLabel:'Load%205',field:apache.status.load.5),schema:metric,type:avg),(enabled:!t,id:'2',params:(customInterval:'2h',extended_bounds:(),field:'@timestamp',interval:auto,min_doc_count:1),schema:segment,type:date_histogram),(enabled:!t,id:'3',params:(customLabel:'Load%201',field:apache.status.load.1),schema:metric,type:avg),(enabled:!t,id:'4',params:(customLabel:'Load%2015',field:apache.status.load.15),schema:metric,type:avg),(enabled:!t,id:'5',params:(customLabel:Hostname,field:apache.status.hostname,order:desc,orderBy:'1',row:!t,size:5),schema:split,type:terms)),listeners:(),params:(addLegend:!t,addTimeMarker:!f,addTooltip:!t,defaultYExtents:!f,drawLinesBetweenPoints:!t,interpolate:linear,legendPosition:right,radiusRatio:9,scale:linear,setYExtents:!f,shareYAxis:!t,showCircles:!t,smoothLines:!f,times:!(),yAxis:()),title:'Apache%20HTTPD%20-%20Load1%2F5%2F15',type:line))

If you visit that URL from the server running Kibana, do you see what you tried to generate a report for?

@Brandon_Kobel,

I did add the events.log setting to my kibana.yml (thank you for that info!) and ran another test. There was a lot of info in the kibana.stdout file, so I grepped for the reporting entries and stripped out the statusCode: 200 lines to focus on the errors only. This is what the snippet looks like (of course I have changed the domain names).

{"type":"log","@timestamp":"2017-02-16T22:38:08Z","tags":["reporting","debug"],"pid":120351,"message":"Phantom installed at /app/elk/apps/kibana-5.2.0-linux-x86_64/data/phantomjs-2.1.1-linux-x86_64/bin/phantomjs"}
{"type":"log","@timestamp":"2017-02-16T22:38:08Z","tags":["reporting","debug"],"pid":120351,"message":"Screenshot concurrency: 8"}
{"type":"log","@timestamp":"2017-02-16T22:38:08Z","tags":["reporting","worker","debug"],"pid":120351,"message":"printable_pdf: Registering printable_pdf worker"}
{"type":"log","@timestamp":"2017-02-16T22:40:18Z","tags":["reporting","debug"],"pid":120351,"message":"1 saved object(s) to process"}


{"type":"log","@timestamp":"2017-02-16T22:40:20Z","tags":["reporting","worker","debug"],"pid":120351,"message":"Converting 1 saved object(s) to printable_pdf"}

{"type":"log","@timestamp":"2017-02-16T22:40:20Z","tags":["reporting","debug"],"pid":120351,"message":"fetching screenshot of https://es.fqdn.com:5601/app/kibana#/discover/Cpu-Load-stats?_g=()&_a=(columns:!(_source),filters:!(),index:'metricbeat-*',interval:auto,query:(query_string:(analyze_wildcard:!t,query:'metricset.module:%20system%20AND%20(metricset.name:%20cpu%20OR%20%20metricset.name:%20load)')),sort:!('@timestamp',desc))"}

{"type":"log","@timestamp":"2017-02-16T22:40:23Z","tags":["reporting","debug","phantomConsole"],"pid":120351,"message":"INFO: 2017-02-16T22:40:23Z\n  Adding connection to https://es.fqdn.com:5601/es_admin\n\n"}
{"type":"log","@timestamp":"2017-02-16T22:40:23Z","tags":["reporting","debug","phantomConsole"],"pid":120351,"message":"INFO: 2017-02-16T22:40:23Z\n  Adding connection to https://es.fqdn.com:5601/elasticsearch\n\n"}
{"type":"log","@timestamp":"2017-02-16T22:40:23Z","tags":["reporting","debug","phantomConsole"],"pid":120351,"message":"setting active api to [es_5_0]"}
{"type":"log","@timestamp":"2017-02-16T22:40:23Z","tags":["reporting","debug","phantomConsole"],"pid":120351,"message":"Index Patterns: index pattern set to metricbeat-*"}
{"type":"log","@timestamp":"2017-02-16T22:40:23Z","tags":["reporting","debug","phantomConsole"],"pid":120351,"message":"complete in 142.77ms"}
{"type":"log","@timestamp":"2017-02-16T22:40:50Z","tags":["reporting","worker","debug"],"pid":120351,"message":"printable_pdf: Job timeout exceeded: (iz8yzait2kv3b6e100855ywo)"}

{"type":"log","@timestamp":"2017-02-16T22:40:52Z","tags":["reporting","debug"],"pid":120351,"message":"Screenshot failed Kibana took too long to load - Timeout exceeded (30000)"}

I don't know where this could be going wrong. We have another cluster running 5.1.2 and it doesn't do this kind of thing! Only on 5.2.0 so far!

@matt if you visit the following URL, do you see the discover page you were anticipating? https://es.fqdn.com:5601/app/kibana#/discover/Cpu-Load-stats?_g=()&_a=(columns:!(_source),filters:!(),index:'metricbeat-*',interval:auto,query:(query_string:(analyze_wildcard:!t,query:'metricset.module:%20system%20AND%20(metricset.name:%20cpu%20OR%20%20metricset.name:%20load)')),sort:!('@timestamp',desc))

It'd be preferable for you to visit this page from the server itself running Kibana.

@Brandon_Kobel,

Yes, I could see the results if I visit that URL. My Kibana instances are running on the same server that Elastic is running on.

If you mean I should run this on the server itself, I can't do that, as my servers don't run X Windows. Can I do this Kibana action on the command line? Even a curl request?

Just wondering if Elastic has done any testing with RHEL 7 + 5.2.0 + X-Pack reporting on a multi-node cluster.

Also, I feel the reporting from Elastic is very inferior compared to some of the 3rd-party products that can do CSV exports and a better PDF export.

@matt

A curl request is fine, I just wanted to make sure that the URL resolves properly to Kibana.
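
For example, run from the Kibana host itself, something along these lines should come back with a 200 (or at least a response rather than a DNS failure or timeout) if the URL resolves properly to Kibana; the hostname and credentials are just the placeholders from your config, and -k is only needed if the certificate isn't trusted locally:

    # Run from the server that's running Kibana; prints only the HTTP status code
    curl -k -u esuser:espasswd -o /dev/null -w "%{http_code}\n" \
      "https://es.fqdn.com:5601/app/kibana"

Keep in mind that curl only tells us the URL resolves and responds; it can't confirm the page actually renders, since that part happens inside PhantomJS.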

I'm unable to speak directly towards testing on that exact configuration; however, we have a number of customers running 5.2.0 multi-node clusters without issues.

With regard to the lack of features, we have a number of enhancements in the pipeline, specifically regarding more layout options and additional export formats. Would you mind elaborating upon the improvements that you'd like to see in the PDF export?

@Brandon_Kobel

Hi Brandon,

Firstly, thanks for your explanation.

I have exactly the same problem. I'm using Filebeat + Logstash + Elasticsearch + Kibana (all on the latest version, 5.2.1) on one VM with 6 processors & 8 GB of RAM.

The OS is RHEL 7.1 and I installed everything via RPM (except for the X-Pack plugin, of course). I started the services with systemctl, and then the same problem occurred:

> Error: Kibana took too long to load - Timeout exceeded (30000)

I followed your suggestions to enable the Kibana reporting logs for more information, and I found:

{"type":"log","@timestamp":"2017-02-22T11:07:58Z","tags":["reporting","debug"],"pid":1469,"message":"fetching screenshot of http://10.151.27.179:5601/app/kibana#/visualize/edit/707633e0-f847-11e6-befb-ff4ea7f891b9?_g=(time:(from:'2017-02-16T23:10:16.092Z',mode:absolute,to:'2017-02-17T18:35:18.291Z'))&_a=(filters:!(),options:(darkTheme:!f),query:(query_string:(analyze_wildcard:!t,query:'*')),title:'Dashboard%20-%20Apache%20-%20EDF%20-%20test')"}
.......
 {"type":"log","@timestamp":"2017-02-22T11:08:31Z","tags":["reporting","debug"],"pid":1469,"message":"Screenshot failed Kibana took too long to load - Timeout exceeded (30000)"}
{"type":"log","@timestamp":"2017-02-22T11:08:32Z","tags":["reporting","worker","debug"],"pid":1469,"message":"printable_pdf: Worker error: (izguw1re014tc7f584cs743p)"}

I tried to access the URL from the "fetching" line in a browser inside the VM, and it hit the same error: Timeout exceeded (30000).

I also adjusted the timeout to 60000 in kibana.yml, but nothing seems to have changed (it still times out after 30 seconds):

server.host: "0.0.0.0"
server.name: "new"
logging.dest: /var/log/kibana/kibana.log
logging.events.log: [ 'reporting' ]
xpack.reporting.kibanaServer.protocol: http
xpack.reporting.kibanaServer.hostname: 10.151.27.179
xpack.reporting.queue.timeout: 60000

BTW, I'm in a DEV environment, so there is no proxy in front and I have only one node.

Do you have any idea?

@Apache_HOU so when you go to http://10.151.27.179:5601/app/kibana#/visualize/edit/707633e0-f847-11e6-befb-ff4ea7f891b9?_g=(time:(from:'2017-02-16T23:10:16.092Z',mode:absolute,to:'2017-02-17T18:35:18.291Z'))&_a=(filters:!(),options:(darkTheme:!f),query:(query_string:(analyze_wildcard:!t,query:'*')),title:'Dashboard%20-%20Apache%20-%20EDF%20-%20test') kibana doesn't load?

What URL are you normally using to access Kibana?

Here are my responses:

  1. When I went to this URL in my VM browser, Kibana loaded with the visualization chart, but after about 30 seconds the Error: Kibana took too long to load - Timeout exceeded (30000) error occurred.

  2. The URL I use to access Kibana is http://10.151.27.179:5601

@Brandon_Kobel

Some updates: it works now, but I'm still confused...

I just went home and used another network environment. Of course, I changed the value in kibana.yml: xpack.reporting.kibanaServer.hostname: 192.168.0.17

I also changed the values in jvm.options on the Elasticsearch side back to the default values, but I don't think there's a link between them...

-Xms2g
-Xmx2g

However, could you please confirm whether the value xpack.reporting.queue.timeout is hard-coded? The official documentation mentions modifying this value if the machine is slow...

https://www.elastic.co/guide/en/x-pack/current/reporting-settings.html#reporting-kibana-server-settings

@Apache_HOU there's another timeout, which is hard-coded, behind the Kibana took too long to load error: it essentially waits 30 seconds for the .application query selector to find something, which we didn't anticipate ever taking 30 seconds.

It generally happens when the Reporting kibanaServer properties are incorrect; however, in very isolated circumstances we've seen it happen when the server that Kibana runs on is under a large amount of load.

@Brandon_Kobel

Thanks for the reply.

If I understood correctly, the 30-second timeout for the query selector is hard-coded, and the timeout for the queueing mechanism, which corresponds to xpack.reporting.queue.timeout, is configurable, right?

Since both of these timeouts can lead to the Kibana took too long to load error, how do we know which timeout caused it? In my case, I haven't found the real reason yet...