In previous versions of ES, field stats were used internally to determine which indexes to search. A typical example with time-series indexes is a search on a time range: the internal field stats were used to skip indexes whose time range did not match, without even attempting to search them. The field stats API is gone in more recent versions of ES. Two questions:
*) Does ES still do a smart selection of indexes, based on field statistics for each index, to avoid touching indexes that cannot match a time range query?
*) If I want access to those per-index field stats, is there any way to get them in 5.6.X and 6.X?
Yep! As part of the change, we introduced a new pre-filter phase that runs first to find "matching" shards. Each shard evaluates the query at a high level to see whether it could contain matching documents (e.g. whether it has documents in the requested time range). Shards that don't have any potentially matching docs are skipped in the main search phase.
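To make the idea concrete, here is a toy model of that pre-filter decision. It is only a sketch, not how Elasticsearch implements the phase: the `ShardStats` class, the `can_match` and `prefilter` functions, and the shard names are all made up for illustration, and it assumes each shard knows the min and max timestamp it holds.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ShardStats:
    """Hypothetical per-shard summary: the time range of the docs it holds."""
    name: str
    min_ts: int  # smallest timestamp in the shard (epoch millis)
    max_ts: int  # largest timestamp in the shard (epoch millis)


def can_match(shard: ShardStats, gte: int, lte: int) -> bool:
    """True if the shard's [min_ts, max_ts] overlaps the queried [gte, lte]."""
    return shard.max_ts >= gte and shard.min_ts <= lte


def prefilter(shards: List[ShardStats], gte: int, lte: int) -> Tuple[List[str], List[str]]:
    """Split shards into those to search and those to skip for a range query."""
    matching = [s.name for s in shards if can_match(s, gte, lte)]
    skipped = [s.name for s in shards if not can_match(s, gte, lte)]
    return matching, skipped


# Three daily indexes, one shard each; timestamps are simplified to small ints.
shards = [
    ShardStats("logs-01[0]", 0, 99),
    ShardStats("logs-02[0]", 100, 199),
    ShardStats("logs-03[0]", 200, 299),
]
matching, skipped = prefilter(shards, gte=150, lte=250)
print("search:", matching)
print("skip:", skipped)
```

With the range 150 to 250, only the last two shards overlap the query, so the first one would never be searched.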
More details are in elastic/elasticsearch pull request #25658 on GitHub, "Add a shard filter search phase to pre-filter shards based on query rewriting" by s1monw.
If you need the field-stats style data, the best way to get most of the stats now (doc counts, min/max time range, etc.) is via an aggregation. You can also use the Term Vectors API if you need stats about the terms themselves.
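As a rough sketch of the aggregation approach, the snippet below builds a search request body that groups documents by index and computes min, max and value count for one field. The function name `field_stats_body`, the aggregation names (`per_index`, `min_value`, and so on), the `max_indices` parameter and the `@timestamp` field are assumptions chosen for illustration; the body would be sent to the `_search` endpoint of whichever indexes you care about.

```python
import json


def field_stats_body(field: str, max_indices: int = 100) -> dict:
    """Build a search body returning per-index min/max/count for `field`.

    `max_indices` caps how many index buckets the terms aggregation returns;
    the value 100 is an arbitrary illustrative default.
    """
    return {
        "size": 0,  # no hits needed, only the aggregation results
        "aggs": {
            "per_index": {
                # one bucket per index, keyed on the _index metadata field
                "terms": {"field": "_index", "size": max_indices},
                "aggs": {
                    "min_value": {"min": {"field": field}},
                    "max_value": {"max": {"field": field}},
                    "doc_count_with_value": {"value_count": {"field": field}},
                },
            }
        },
    }


body = field_stats_body("@timestamp")
print(json.dumps(body, indent=2))
```

Each bucket in the response then carries the index name plus the min, max and count for the field, which covers most of what the old field stats output was used for.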
Thanks so much for your answer! Exactly what I was looking for!