I believe this is an Elasticsearch problem although it manifests itself in Kibana.
I'll apologize upfront -- this is my first post. I'll try and provide what is needed.
We're using ELK to gather the logs of about 1,500 machines, including F5s and ESXi hosts. They are sorted into 8 different daily indexes totaling 6.6B documents over the 30-day retention. In monitoring everything is green: the indexes, the nodes, and Kibana. I've been unable to refresh our default index, which is logstash, in Kibana. It is also by far our largest index, at about 100M documents daily and a data size of around 220GB. There are 5 shards and one replica. We have 10 data nodes, 5 at each site, with a fast pipe between them. Whenever I tried to refresh it within Kibana, it would time out.
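For reference, here is a rough back-of-the-envelope sketch of what those numbers imply per shard. It assumes the 220GB figure is primary data only and that shards are spread evenly across the data nodes. Neither is confirmed, so treat the output as an estimate.

```python
# Figures quoted above for the daily logstash index.
docs_per_day = 100_000_000
index_size_gb = 220      # ASSUMPTION: primary data only, replicas not included
primary_shards = 5
replicas = 1
retention_days = 30
data_nodes = 10          # ASSUMPTION: shards balanced evenly across these

# Size and document count of one primary shard of one daily index.
gb_per_primary_shard = index_size_gb / primary_shards
docs_per_primary_shard = docs_per_day // primary_shards

# Shard copies (primaries plus replicas) for the logstash indexes only.
shard_copies_per_day = primary_shards * (1 + replicas)
shard_copies_retained = shard_copies_per_day * retention_days
shard_copies_per_node = shard_copies_retained / data_nodes

print(f"{gb_per_primary_shard:.0f} GB and {docs_per_primary_shard:,} docs per primary shard")
print(f"{shard_copies_retained} logstash shard copies retained, "
      f"about {shard_copies_per_node:.0f} per data node")
```

This only covers the logstash indexes; the other 7 daily indexes add their own shards on top.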
Yesterday I upgraded it via Puppet from version 5.4.1 to 5.5.0. Now I can't access the data in the logstash indexes either. When I hit Kibana it just spins and then errors. I can see the data in other indexes. The error looks like this:
Error: Request to Elasticsearch failed: "Bad Request"
at http://kibana.example.com/bundles/kibana.bundle.js?v=15382:229:4009
at processQueue (http://kibana.example.com/bundles/commons.bundle.js?v=15382:38:23621)
at http://kibana.example.com/bundles/commons.bundle.js?v=15382:38:23888
at Scope.$eval (http://kibana.example.com/bundles/commons.bundle.js?v=15382:39:4619)
at Scope.$digest (http://kibana.example.com/bundles/commons.bundle.js?v=15382:39:2359)
at Scope.$apply (http://kibana.example.com/bundles/commons.bundle.js?v=15382:39:5037)
at done (http://kibana.example.com/bundles/commons.bundle.js?v=15382:37:25027)
at completeRequest (http://kibana.example.com/bundles/commons.bundle.js?v=15382:37:28702)
at XMLHttpRequest.xhr.onload (http://kibana.example.com/bundles/commons.bundle.js?v=15382:37:29634)
The Kibana node is also an Elasticsearch node but doesn't hold data. I don't see anything of significance in either the Kibana log or the ES log with log level set to info. Here is a Kibana timeout log message:
{
"type":"response",
"@timestamp":"2017-07-14T15:35:07Z",
"tags":[],
"pid":22484,
"method":"post",
"statusCode":400,
"req":{
"url":"/api/console/proxy?path=_mapping&method=GET",
"method":"post",
"headers":{
"host":"kibana.example.com",
"connection":"keep-alive",
"content-length":"0",
"accept":"text/plain, */*; q=0.01",
"origin":"http://kibana.example.com",
"kbn-version":"5.4.1",
"user-agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit /537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36",
"referer":"http://kibana.example.com/app/kibana",
"accept-encoding":"gzip, deflate",
"accept-language":"en-US,en;q=0.8"
},
"remoteAddress":"192.168.59.3",
"userA gent":"192.168.59.3",
"referer":"http://kibana.example.com/app/kibana"
},
"res":{
"statusCode":400,
"responseTime":39,
"contentLength":9
},
"message":"POST /api/console/proxy?path=_mapping&method=GET 400 39ms - 9.0B"
}
The entire ELK stack is at 5.5.0.
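To check whether the 400 comes from Elasticsearch itself or from Kibana's console proxy, the same request can be sent straight to an Elasticsearch node. Below is a minimal sketch using only the Python standard library. The base URL `http://localhost:9200` is an assumption (substitute a real node address), and `direct_request` and `fetch` are hypothetical helper names, not part of any Elastic client.

```python
from urllib.parse import urlsplit, parse_qs
import urllib.request
import urllib.error


def direct_request(proxy_url, es_base):
    """Turn a Kibana console proxy URL into the equivalent direct ES request."""
    query = parse_qs(urlsplit(proxy_url).query)
    method = query["method"][0]
    path = query["path"][0].lstrip("/")
    return method, es_base.rstrip("/") + "/" + path


def fetch(method, url, timeout=30):
    """Send the request and return (status, body), including for 4xx/5xx replies."""
    req = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # Elasticsearch usually explains a 400 in the response body.
        return err.code, err.read().decode("utf-8", "replace")


# URL taken from the Kibana log entry above; the ES address is an assumption.
method, url = direct_request(
    "/api/console/proxy?path=_mapping&method=GET",
    "http://localhost:9200",
)
print(method, url)

# Uncomment to actually send the request to the node:
# status, body = fetch(method, url)
# print(status, body[:500])
```

If the direct call returns the mappings without error, the problem is more likely on the Kibana side. If it also returns a 400, the response body should say why.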
Thanks,
Peter