Discover tab timing out with single _msearch request

I'm trying to replace a Kibana 4 installation with Kibana 6. I have an index pattern called events-*

When I go to the Discover tab in my Kibana 4 installation, I can see multiple HTTP calls to the _msearch API endpoint, one for each index:
{"index":["events-2017-11-13"],"search_type":"count","ignore_unavailable":true} {"index":["events-2017-11-06"],"search_type":"count","ignore_unavailable":true} {"index":["events-2017-11-30"],"search_type":"count","ignore_unavailable":true} {"index":["events-2017-11-23"],"search_type":"count","ignore_unavailable":true} ...

This incrementally loads and displays the data with each new request. It's fairly slow (I have a lot of data), but it loads.
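For reference, each of those _msearch calls is a pair of lines: a header line like the ones above, followed by a body line with the query and aggregations. Roughly like this (the @timestamp field and the date-histogram body are just placeholders to show the shape, not the exact query Kibana builds):

POST /_msearch
{"index":["events-2017-11-13"],"search_type":"count","ignore_unavailable":true}
{"query":{"range":{"@timestamp":{"gte":"2017-11-13","lt":"2017-11-20"}}},"aggs":{"hits_over_time":{"date_histogram":{"field":"@timestamp","interval":"3h"}}}}
{"index":["events-2017-11-06"],"search_type":"count","ignore_unavailable":true}
{"query":{"range":{"@timestamp":{"gte":"2017-11-06","lt":"2017-11-13"}}},"aggs":{"hits_over_time":{"date_histogram":{"field":"@timestamp","interval":"3h"}}}}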

However, in Kibana 6, I'm seeing just one call to _msearch:
{"index":["events-*"],"ignore_unavailable":true,"preference":1512465229818}

Unfortunately there is so much data here that this one call is timing out.

Is there a reason Kibana is trying to do this request with one query? Is there an advanced setting somewhere I'm missing? Or has something changed so that multiple requests are no longer performed?

I've only got a few indexes matching here in Kibana 6. Maybe it would switch over to using individual queries if I had more indexes? Or maybe it's more to do with how I'm storing the data within each index and the date ranges?

Thanks!

Kibana 6.0 requires Elasticsearch 6.0, which contained a number of performance improvements that allowed Kibana to stop evaluating which indices to query prior to sending the queries.

Thanks for the reply Christian.

I think I understand. I'm not sure that's useful behaviour for us here, though. I'm guessing there's no way of reinstating the old behaviour? I suppose I could dramatically increase the timeout, but that means having to wait much longer before I start seeing results.
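If it helps anyone else reading this, I believe the timeout in question is elasticsearch.requestTimeout in kibana.yml, which is in milliseconds and defaults to 30000. Something like:

# kibana.yml - give slow Discover requests more time before Kibana gives up
elasticsearch.requestTimeout: 120000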

Sadly this looks like it means that Kibana 6 is unusable for us and we'll have to keep using Kibana 4.

Which version of Elasticsearch are you on?

In Kibana 4 the index pattern was expanded based on the dates queried. In Kibana 5 this was replaced by a query to the field stats API to identify exactly which indices behind an index pattern should be queried. In version 6.0 these improvements have been moved into Elasticsearch itself, meaning that querying lots of shards is efficient and does not suffer the same performance penalty as before.
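For illustration, the Kibana 5 field stats lookup was roughly of this shape (the field stats API itself has been removed in 6.0; the timestamp field name and the dates here are just placeholders):

POST /events-*/_field_stats?level=indices
{"fields":["@timestamp"],"index_constraints":{"@timestamp":{"max_value":{"gte":"2017-11-01T00:00:00.000Z"},"min_value":{"lte":"2017-11-30T23:59:59.999Z"}}}}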

Trying Elasticsearch 6.0.0 here

Have you tested it at scale and experienced a performance problem?

So, a little more detail. First of all, here's a (sped up) version of what we see on Kibana 4

I have weekly indexes for our data, each containing ~500M documents. Each index has 7 shards, and I've got this running on 2 nodes.

In Kibana 4, multiple _msearch requests are made. Each takes approx 20s to run and Kibana incrementally updates the graph as it loads (as in the above image).

I haven't tested Kibana/ES 6 at quite the same number of indexes and events, but whenever I try to scale it up I keep hitting timeout issues with the single _msearch HTTP request, which I've seen take over 80s to return the full response. I don't get any timeouts with Kibana 4.

The previous behaviour of showing some documents, then filling in the graph was much better for our use case.

Does that make sense? Thanks once again for your time.

Is that comparison based on the same size hardware being used?

When querying, is the load across your nodes balanced? (If not, you may be affected by the issue discussed in this thread.)
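A quick way to check is something along these lines, which shows per-node CPU/load and how the shards behind your index pattern are distributed (exact column names may vary slightly by version):

GET /_cat/nodes?v&h=name,node.role,cpu,load_1m,heap.percent
GET /_cat/shards/events-*?v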

As _msearch queries do not stream results back, I would guess either of these has more impact on the performance issue you are seeing than the actual rendering does.

Both ELK stacks are running in parallel on the same hardware. Technically speaking, I'm running ELK6 in Docker here, but I've not found any evidence that this makes any appreciable difference. It's difficult to tell whether the load is balanced because of other things happening on the box, but from a few quick queries it does perhaps seem oddly unbalanced. I think I'd like to do some more testing here to see if this is actually the case.

Edit: I've disabled Logstash writing for now, and yes, it's definitely not balanced properly: almost all of the load is on the master node.

So I've dropped the number of replicas to zero, and now the load is spread across the two nodes, at the expense of losing the replicas. I'm not sure what the best practice is here. Add a third node?
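For reference, I did that with a settings update along these lines:

PUT /events-*/_settings
{"index":{"number_of_replicas":0}}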

I feel I should add that this is not critical data, nor do we need to deal with that many concurrent users.

I've found the advanced setting courier:maxSegmentCount, which is described as:

Kibana splits requests in the Discover app into segments to limit the size of requests sent to the Elasticsearch cluster. This setting constrains the length of the segment list. Long segment lists can significantly increase request processing time.

Is this something that might be of use to me here? No matter what I set it to, it seems to always do a single request.
