Performance when querying across multiple indexes

Hello,

We have an ES setup (v1.7.5) to index logs from network devices (firewall, UTM, ...), and we create indexes on a per-device, per-day basis. This means that we have a lot of small indexes, each with 1 primary shard and 1 replica.

My question is about performance: when querying across multiple indexes (because the time range is more than 1 day, or because we want to search for data from several devices, or both), we can either use a wildcard or explicitly list the indexes to be considered.

Is there any difference, from a performance standpoint, between using a wildcard and listing the indexes without one?

Antoine

Having a large number of very small shards is generally quite inefficient, as each shard comes with a certain amount of overhead in terms of file handles and memory usage. This can also affect query performance negatively. I would recommend consolidating into larger shards by reducing the number of indices created per day and/or switching to indices that cover a longer time period.
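To get a feel for how quickly the shard count grows, here is a rough back-of-the-envelope sketch. The device count and retention period are made-up numbers for illustration, not figures from this thread; only the 1 primary / 1 replica setting comes from the question.

```python
def shard_count(indices_per_day, days, primaries=1, replicas=1):
    """Total shards in the cluster: every primary plus its replica copies."""
    indices = indices_per_day * days
    return indices * primaries * (1 + replicas)


# Hypothetical numbers: 50 devices, 30 days of retention.
per_device_daily = shard_count(indices_per_day=50, days=30)  # one index per device per day
shared_daily = shard_count(indices_per_day=1, days=30)       # one shared index per day

print(per_device_daily, shared_daily)
```

With these assumed numbers, the per-device layout keeps 50 times as many shards open as a single shared daily index, and each of them carries its own fixed overhead.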

Having said that, you always want to query the smallest number of shards possible, which means that, where possible, it is better to explicitly list the indices to query than to use a wildcard pattern.
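As a minimal sketch of building such an explicit list on the client side: the naming scheme `logs-<device>-YYYY.MM.DD`, the device names, and the helper functions below are all assumptions for illustration, so adapt them to whatever your indices are actually called.

```python
from datetime import date, timedelta


def index_names(devices, start, end, prefix="logs"):
    """List one index name per device per day, for start..end inclusive.

    Assumes a hypothetical naming scheme: <prefix>-<device>-YYYY.MM.DD
    """
    names = []
    day = start
    while day <= end:
        for device in devices:
            names.append(f"{prefix}-{device}-{day:%Y.%m.%d}")
        day += timedelta(days=1)
    return names


def search_path(devices, start, end):
    """Build the path of a multi-index search request (comma-separated names)."""
    return "/" + ",".join(index_names(devices, start, end)) + "/_search"


# Hypothetical devices and date range.
print(search_path(["fw01", "utm02"], date(2016, 3, 31), date(2016, 4, 1)))
```

Note that if a listed index does not exist (for example, a device that sent no logs on a given day), the request fails by default. Adding `ignore_unavailable=true` to the query string should make Elasticsearch skip missing indices instead.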

Although Kibana allows you to specify a wildcard pattern, it actually uses the field stats API to determine which of the indices matching the pattern can contain relevant data before submitting the query.

Before this functionality was available in the API, e.g. for ES 1.7.x, Kibana instead used to limit the indices queried based on the date in the index name, which is why it used to be beneficial to configure index patterns with date patterns in Kibana.