We are using the Rollover Index API to create a new index based on size. We have one alias for writes and another alias for reads. The read alias points to all indices.
For range queries, we want to limit the search to just the indices created within the time range instead of going through all indices. Does Elasticsearch already filter out the right indices based on the range when we send a range query to the read alias (which covers all indices), or do we really have to keep track of indices and their creation times ourselves, pick the right indices, and add an index filter to the range query request?
I am also aware that we could create multiple aliases (like 1day, 1week, 1month...) and assign each new index to the proper alias when it is created. However, this approach is neither flexible nor efficient enough to select only the indices that were created within the range of an arbitrary query.
In earlier versions of the stack, Kibana used to determine exactly which indices to query, first based on the index name and later based on field stats. This was done because querying indices that did not hold any matching data had a significant impact on performance. This has, however, been greatly improved within Elasticsearch (since the late 5.x releases), and Kibana no longer needs to perform this extra step; it now queries all indices that match the pattern. If Kibana no longer needs to do this, it is quite likely that you do not need to worry too much about it either.
If you really want to be sure, I would recommend running a benchmark.
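For such a benchmark, the two request shapes being compared could be sketched roughly as follows. This is only an illustration, not a definitive implementation: the alias name `logs-read`, the field name `@timestamp`, the index names, and both helper functions are hypothetical. The creation times are assumed to have been read from each index's `index.creation_date` setting (epoch milliseconds), and the selection logic assumes documents are indexed at roughly the time they are stamped, so that each rollover index covers the window from its own creation until the next index was created.

```python
from typing import Dict, List


def range_query(field: str, start_ms: int, end_ms: int) -> dict:
    """Build a search body that filters on a date range given in epoch millis."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {
                        "range": {
                            field: {
                                "gte": start_ms,
                                "lte": end_ms,
                                "format": "epoch_millis",
                            }
                        }
                    }
                ]
            }
        }
    }


def indices_overlapping(created_ms: Dict[str, int], start_ms: int, end_ms: int) -> List[str]:
    """Pick rollover indices whose write window overlaps [start_ms, end_ms].

    Assumption: an index receives writes from its creation time until the
    next index in the series is created; the newest index is still open.
    """
    ordered = sorted(created_ms.items(), key=lambda item: item[1])
    selected = []
    for pos, (name, created) in enumerate(ordered):
        if pos + 1 < len(ordered):
            next_created = ordered[pos + 1][1]
        else:
            next_created = float("inf")  # current write index, no upper bound yet
        if created <= end_ms and next_created > start_ms:
            selected.append(name)
    return selected


# Hypothetical creation times (epoch millis) for three rollover indices.
created = {"logs-000001": 0, "logs-000002": 1000, "logs-000003": 2000}
body = range_query("@timestamp", 1200, 1500)

# Variant A: send the range query to the read alias and let Elasticsearch
# decide which shards can be skipped.
path_alias = "/logs-read/_search"

# Variant B: resolve the indices client-side and target only those.
path_filtered = "/" + ",".join(indices_overlapping(created, 1200, 1500)) + "/_search"
```

Sending the same `body` to both paths and comparing the response times over a representative set of time ranges should show whether client-side index selection is still worth the bookkeeping.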
Thanks a lot, Christian! That sounds great. We will run some benchmarks for this.
Is there any documentation you can share that explains the change in 5.x in more detail? (I just want to better understand the approach inside Elasticsearch that improves performance.)