Omit indices when searching on multiple indices

ebuildy · January 24, 2023, 7:31pm

A question and maybe a feature request. We have 100 indices for filebeat with this pattern:

filebeat-{CLUSTER_NAME}-{NAMESPACE}-{DATE}

Documents are:

{"@timestamp" : "..." , "cluster" : "prod", "namespace" : "kube-system", ... "message": "hello world"}

I am wondering: when we search across all filebeat indices for a particular time range and cluster, is Elasticsearch smart enough to first select only the indices that could contain matching documents?

stephenb · January 25, 2023, 3:07am

Hi @ebuildy

What version are you using?

The short answer is that there are "smarts" built into Elasticsearch to "prefetch/limit" the applicable indices based on timestamps, relative to the time range of your search, IF you are doing normal time series ingestion (rollover or daily indices, ILM, etc.) and are not reopening old indices and writing to them.

Elasticsearch does not know about the CLUSTER or NAMESPACE part of the index name and will not pre-filter on it UNLESS you create a data view (or something similar) that limits the search upfront.

So the answer is yes and no...
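A minimal sketch of limiting the search upfront from the client side: because the cluster and namespace are part of the index name, the index pattern itself can be narrowed, while the time range stays a query filter. The helper names `build_index_pattern` and `build_search_body` are hypothetical, the field names come from the sample document in the question, and the `term` filter assumes `cluster` is mapped as a keyword field.

```python
from typing import Optional


def build_index_pattern(cluster: Optional[str] = None,
                        namespace: Optional[str] = None) -> str:
    """Build a wildcard pattern for filebeat-{CLUSTER_NAME}-{NAMESPACE}-{DATE}.

    Any part that is not given becomes "*". Hypothetical helper, not part
    of any Elasticsearch client library.
    """
    return "-".join(["filebeat", cluster or "*", namespace or "*", "*"])


def build_search_body(start: str, end: str,
                      cluster: Optional[str] = None) -> dict:
    """Build a query body that filters on @timestamp and, optionally, cluster."""
    filters = [{"range": {"@timestamp": {"gte": start, "lt": end}}}]
    if cluster is not None:
        # Assumption: "cluster" is a keyword field, so an exact term match works.
        filters.append({"term": {"cluster": cluster}})
    return {"query": {"bool": {"filter": filters}}}


# Only indices whose name starts with "filebeat-prod-" are targeted.
pattern = build_index_pattern(cluster="prod")
body = build_search_body("2023-01-24T00:00:00Z", "2023-01-25T00:00:00Z",
                         cluster="prod")
```

The pattern would go in the request path (for example `GET /<pattern>/_search`) with `body` as the request body, so indices for other clusters are never targeted in the first place.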

Others may have more details... but that is my top-level understanding

ebuildy · January 25, 2023, 1:32pm

Very interesting,

I know this is the first optimisation step in some databases: "don't open a file / resource if you don't need it".

I am a big fan of Apache Spark and the data ecosystem around it (Parquet, data lakes, etc.). They do something called "partition pruning" to read only the relevant files. Elasticsearch could implement this concept.

To make the analogy with the Parquet file format: Elasticsearch could store, for each index, the min and max values of time fields and the term values of string fields (up to a limit), and use them to pre-filter indices before running the search.

So, as general advice, it is better to group all documents into date-based indices such as "filebeat-YYYY-MM-DD" (fewer but bigger indices) than into indices such as "filebeat-{PRODUCT}-YYYY-WEEK" (more but smaller indices).

Thank you,
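To make the idea concrete, here is a toy model of that kind of pruning in Python. It is only a sketch of the proposal, not how Elasticsearch works internally; `IndexStats` and `prune_indices` are hypothetical names, and the per-index statistics are assumed to be available somewhere.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Set


@dataclass
class IndexStats:
    """Per-index summary, similar in spirit to Parquet min/max statistics."""
    name: str
    min_ts: datetime
    max_ts: datetime
    # Known values per string field; a field is absent when it had too many
    # distinct values to record (the "limit" mentioned above).
    terms: Dict[str, Set[str]] = field(default_factory=dict)


def prune_indices(stats: List[IndexStats], start: datetime, end: datetime,
                  term_filters: Dict[str, str]) -> List[str]:
    """Return the names of indices that may contain matches for [start, end)."""
    kept = []
    for s in stats:
        # No overlap between the index time span and the query range: skip.
        if s.max_ts < start or s.min_ts >= end:
            continue
        # Skip only when statistics exist and prove the value is absent.
        if any(f in s.terms and v not in s.terms[f]
               for f, v in term_filters.items()):
            continue
        kept.append(s.name)
    return kept
```

An index without recorded terms for a field is always kept, since missing statistics can never prove that a value is absent.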

system · February 22, 2023, 2:21pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
