Kibana won't load any page, causes Elasticsearch plugin timeouts

I recently upgraded my Elasticsearch nodes from 2.x to 5.3.2. Afterward, I installed a new server with an Elasticsearch client node (master & data = false) and a new Kibana 5.3.2 instance on that node. Every time I load a page (any page except Status) in Kibana, it hangs, and then it says the elasticsearch plugin timed out after 30 seconds. The Kibana status page reports the elasticsearch plugin as red for over 2 minutes before going green again. No matter how many times or which page I load in Kibana, the same thing happens.
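For reference, that coordinating-only ("client") node is configured in elasticsearch.yml roughly like this (a sketch of the relevant settings only):

```yaml
# elasticsearch.yml -- coordinating-only ("client") node:
# holds no data and is never elected master
node.master: false
node.data: false
node.ingest: false   # optional: also skip ingest pipelines on this node
```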

At the same time, I also run the elasticsearch-head service, with a refresh interval of 5s. Whenever I load a Kibana page, elasticsearch-head also stops refreshing until the 2 minutes pass; then refreshing resumes.

I've also tried running a script I have from the older 2.x version that runs a specific query over my indexes. This script runs fine while Kibana is "stuck," so it's not a complete freeze of the Elasticsearch system.

I've tried enabling various debug logging modes in Kibana, and it's not telling me what's getting hung up in elasticsearch. Here is a link to the Kibana logs in verbose mode for the duration of the issue:

https://pastebin.com/95qyNnN0

I'm not sure where to go from here. I've been poking and prodding for a few days now with no luck on restoring Kibana for my team to use. Any assistance is greatly appreciated.

Tom

Every time I go to load a page (any page except status) in Kibana, it hangs. Then it says the elasticsearch plugin timed out after 30 seconds. The Kibana status page reports the elasticsearch plugin as being red for over 2 minutes before going to green again.

Do you see excessive load on the machine, in particular from Elasticsearch? The red status is usually the result of a request Kibana makes to ping ES and check its status. When it's red, it means either Kibana can't talk to the server or the request timed out for some reason.

This request from your logs is concerning:

GET /api/kibana/logstash-*/field_capabilities 503 30013ms - 9.0B

This leads me to believe it's not actually a problem with the heartbeat, but something's up with your data. Do you have data in your logstash-* indices on the new 5.3.2 cluster? That request will fail when you don't...
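A quick way to check (a sketch; run it from Kibana's Dev Tools or with curl against the cluster) is to list the matching indices with their document counts:

```
GET _cat/indices/logstash-*?v&h=index,docs.count,store.size
```

If that comes back empty, the index pattern has nothing behind it.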

I have a few TBs of data in the logstash-* indexes. I also have metricbeat-* and vdbench-* indexes that are a bit smaller. I don't see any unusually high load on my elasticsearch cluster when I try to load a Kibana page.

I will say that the logstash-* index does have a few thousand fields defined in it. Seems like it might be related to that? The old version didn't seem to care.

It could be related. A large number of fields will certainly slow things down, because it increases the amount of data that is sent across the wire. But I don't think you should be any worse off in 5.x than you were in 2.x.

I know the field_capabilities API is really new, though. In fact, the docs say it isn't even available in 5.3. But based on this Kibana issue, and the fact that you see 5.3.2 using it, that can't be right. Can you check your Elasticsearch logs and see if there are any errors there?

Also, can you tell me what version of 2.x you upgraded from? Maybe there's a weird upgrade-path issue as well.

I've been staring at the logs on my elasticsearch cluster every time I try to launch Kibana. I don't see anything unusual or error-like relating to Kibana, just the occasional logstash indexing error because a number is larger than a "long".

I know I upgraded from Elasticsearch 2.4.x; I don't remember what x is, however, and grepping the log files for "version" isn't helping narrow it down either.

Bump

I still have a 100% unusable ELK stack... Does anybody have any ideas on what I can do to get this working?

Thanks.

FYI, upgrading to ELK 5.4.0 does not fix the situation.

I was bored and decided to enable trace-level logging on the elasticsearch client that Kibana is using. What I found were multiple calls to "[indices:data/read/field_stats[s]]" that in some cases take a very long time to get a response. Here is about 50 seconds' worth of log data: https://pastebin.com/DJnj9Lxp

Note that Kibana will time out after 30 seconds. If it's asking for "field stats" for every field in the index, and I have thousands, this query may never finish! What can I do to disable this, or otherwise restore Kibana to how it behaved in version 2.4, so I can get it back to a usable state?
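One thing I may try as a stopgap (not a fix; assumes the default kibana.yml location) is raising Kibana's request timeout so the field-stats call gets more than 30 seconds to finish:

```yaml
# kibana.yml -- give slow Elasticsearch requests more headroom
# (the default is 30000 ms; 120000 below is just an example value)
elasticsearch.requestTimeout: 120000
```

That would only hide the slowness, not remove it.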

I have just migrated to a 5.3 AWS ES domain and have exactly the same problem. I can run this API query manually for one month pattern at a time and it returns OK. It just can't handle the wildcard for the whole default logstash pattern.
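For reference, the manual per-month query looks roughly like this (a sketch; the date pattern is just an example):

```
GET logstash-2017.05*/_field_stats?fields=*
```

The same request with logstash-* as the index pattern is the one that times out.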

Kibana is completely crippled for the logstash-* index pattern.

Is there anything configurable to stop this behaviour?

I wish you luck finding an answer. I never did, and had to downgrade to the pre-5.x versions, losing all my data.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.