I have a concern regarding the change in Kibana 6.x to no longer support "time-interval based index patterns". Instead, we now use wildcard index patterns. Below is an example illustrating my concern.
Let's say we have a large number of daily indices (e.g., several hundred covering the past few years) and only intend to search over data from the past 24 hours. Isn't it extremely inefficient to search ALL of these indices when all relevant data is contained in at most 2 of them (today's and yesterday's)? Prior to Kibana 6, Kibana would smartly restrict the set of indices searched based on the selected time range.
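To make the concern concrete, here is a rough sketch using hypothetical daily indices named logs-YYYY.MM.DD (not our real naming):

```
# Kibana 5.x time-interval index pattern:  [logs-]YYYY.MM.DD  (interval: Daily)
# With a "last 24 hours" time range, Kibana resolved the pattern to just the
# concrete indices that could contain matching data, roughly equivalent to:
curl -s 'localhost:9200/logs-2018.06.13,logs-2018.06.14/_count'

# Kibana 6.x wildcard index pattern:  logs-*
# The same time range now sends the search to every index behind the wildcard:
curl -s 'localhost:9200/logs-*/_count'
```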
By the way, I did find a post describing a similar concern here:
@Christian_Dahlqvist responded saying this: "more recently querying all indices has been made much more efficient in Elasticsearch". Does this mean that there were improvements made to the time range query performance? Is the overhead (of querying hundreds of indices versus just 1-2 in my example) so minimal that it's not worth restricting queries to only the relevant (time-based) indices?
I want to add some more details to this post after running some quick performance tests. We are seeing a considerable performance hit from this change on our large cluster.
Below is a quick test I ran to compare query execution times. Both queries return the same results (verified counts).
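In outline, the comparison was between the following two requests (the index names and timestamp field here are placeholders for our actual setup), with execution times taken from the `took` field in the responses:

```
# Query 1: wildcard pattern, hits every daily index
curl -s -XGET 'localhost:9200/logs-*/_search' -H 'Content-Type: application/json' -d '{
  "size": 0,
  "query": {
    "bool": {
      "filter": {
        "range": { "@timestamp": { "gte": "now-24h", "lte": "now" } }
      }
    }
  }
}'

# Query 2: only the two daily indices that can contain the last 24h of data
curl -s -XGET 'localhost:9200/logs-2018.06.13,logs-2018.06.14/_search' -H 'Content-Type: application/json' -d '{
  "size": 0,
  "query": {
    "bool": {
      "filter": {
        "range": { "@timestamp": { "gte": "now-24h", "lte": "now" } }
      }
    }
  }
}'
```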
Sorry to hear that you're suffering from reduced performance. Let's see if we can fix that.
Do you have a time filter field name selected for your index pattern?
Do you have a rough idea of the size of your elasticsearch cluster? Number of nodes?
I appreciate the response. We do have a time filter field name selected for the index pattern. We also have a large number of nodes and a lot of data, with each node carrying a heavy load (> 1000 active shards per node). I know we're pushing the limits here, but it still bothers me that Kibana isn't restricting the search to the time-based indices we know actually contain the data.
My main question is why was this feature removed to begin with? More specifically, what changes were made in Elasticsearch to make querying over all indices/shards so efficient compared to Elasticsearch 2.x? It seems like we'd still have to iterate over all documents to check whether they fall within the requested time range.
My main question is why was this feature removed to begin with?
An optimization was added to elasticsearch that should make the time based index pattern unnecessary. Obviously this isn't your experience. I'm going to do some research and get back to you.
It sounds like part of the problem is that you have too many shards in your cluster. Exactly how many shards do you have? How large a data volume does this correspond to? How many data nodes do you have?
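For example, the output of the following should cover all three questions (host/port are placeholders):

```
# Shard count and disk usage per node, which also shows how many data nodes hold data
curl -s 'localhost:9200/_cat/allocation?v'

# Overall cluster view, including active_shards and number_of_data_nodes
curl -s 'localhost:9200/_cluster/health?pretty'
```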
The old scheme based on index names was performant, but not very flexible. If data ended up in the wrong index it would give incorrect results, and it could not be used with rollover indices. It was replaced by a field stats check, which for large clusters could be slow. This type of check was then optimised in Elasticsearch and merged into the query phase, which I believe resulted in the current solution.
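For reference, the field stats check looked roughly like this (the API has since been deprecated/removed, and logs-* and @timestamp are placeholders; shown only to illustrate the mechanism):

```
# Per-index min/max of the time field, which Kibana used to decide which
# indices could contain data for the selected time range
curl -s -XGET 'localhost:9200/logs-*/_field_stats?level=indices' -H 'Content-Type: application/json' -d '{
  "fields": ["@timestamp"]
}'
```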
Thanks for the reply, Christian. I definitely agree that the root of our problem is a high shard-to-data-node ratio, and we are working to reduce that for future indices. Unfortunately it won't happen right away and will only improve over time.
As for your explanation on the changes, I have a few follow up questions:
Why not keep both schemes in Kibana, including the higher-performing time-interval based index patterns, even with the known caveats? The caveats you mentioned are a non-issue for us, and performance is critical.
I noticed that the field stats check is now deprecated in Elasticsearch 6.x, with the field caps API recommended instead. However, the field caps API does not have a way to retrieve min/max values for a date field (which I assume is what was used to determine whether a particular index needs to be searched or not). Instead, it suggests running a min/max aggregation (something like the sketch below), which seems expensive. Can you please elaborate on what optimization was made to the query phase? I'd like to understand it better.
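For illustration, this is the kind of aggregation I mean (index pattern and field name are placeholders):

```
# Min/max of the time field across all matching indices
curl -s -XGET 'localhost:9200/logs-*/_search' -H 'Content-Type: application/json' -d '{
  "size": 0,
  "aggs": {
    "min_timestamp": { "min": { "field": "@timestamp" } },
    "max_timestamp": { "max": { "field": "@timestamp" } }
  }
}'
```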
Hi @avencar. Sorry for the issues that you're facing. I think I might be able to offer some explanations for what is happening behind the scenes within Elasticsearch. First, some clarifying questions:
Which minor 6.x version are you on?
How many nodes do you have in your cluster?
How many shards do you have per index being hit in these search requests?
How many shards in total are being hit in these search requests?
Can you please temporarily enable the search slow log with a very low threshold (1nanos), execute the "slow" query, and share here the rewritten query from a shard where you expect the query to not match any documents? I want to ensure that these are being rewritten to match_none queries. Don't forget to null out the setting when you're done, so that your slow logs are not spammed with the execution of every query.
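For example, something along these lines (the index name is a placeholder; pick whichever slow log level you prefer):

```
# Temporarily log every query on the index at TRACE level
curl -s -XPUT 'localhost:9200/logs-2018.06.13/_settings' -H 'Content-Type: application/json' -d '{
  "index.search.slowlog.threshold.query.trace": "1nanos"
}'

# ... run the slow query from Kibana and grab the rewritten query from the slow log ...

# Reset the threshold afterwards
curl -s -XPUT 'localhost:9200/logs-2018.06.13/_settings' -H 'Content-Type: application/json' -d '{
  "index.search.slowlog.threshold.query.trace": null
}'
```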