Understanding "skipped" shards

MaJaHa95 · July 24, 2018, 4:51pm

Hey,

I've been a long-time user of the _field_stats API for index filtering. I have two production systems in place where queries typically filter on some category identifier (namely an ID, in the form of an integer) and date.

However-long-ago, I bumped up to 6.x, and consequently was no longer able to use that API to pre-filter indices. Since then, I've been unimpressed with these queries' performance. These systems have evolved over time, so I can't necessarily blame the API removal in whole, but I'd like to understand the alternative that was put in place.

My indices generally take the form of, for example: index-name-{CategoryId}-{yyyyMM}.

When I run a term query on the field corresponding to {CategoryId}, and a range query on the date field corresponding to {yyyyMM}, I sometimes get back a _shards element with a non-zero "skipped" count. Is that coming from the pre-filtering that's done at query time?
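For concreteness, here is a rough sketch of the kind of request I mean, plus a helper for pulling the counts out of the _shards element. The field names `categoryId` and `timestamp` and the numbers in `sample_response` are made up for illustration; only the total/successful/skipped/failed keys are what I actually see in responses.

```python
def build_query(category_id, start, end):
    """Bool filter with a term on the category and a range on the date.

    `categoryId` and `timestamp` are hypothetical field names.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"categoryId": category_id}},
                    {"range": {"timestamp": {"gte": start, "lt": end}}},
                ]
            }
        }
    }


def shard_summary(response):
    """Pull the shard counts out of a search response's _shards element."""
    shards = response.get("_shards", {})
    total = shards.get("total", 0)
    skipped = shards.get("skipped", 0)
    return {
        "total": total,
        "skipped": skipped,
        "failed": shards.get("failed", 0),
        # Shards that were not reported as skipped.
        "not_skipped": total - skipped,
    }


# Made-up response fragment, shaped like what a search returns.
sample_response = {
    "took": 12,
    "_shards": {"total": 120, "successful": 120, "skipped": 96, "failed": 0},
}

body = build_query(42, "2018-07-01", "2018-08-01")
summary = shard_summary(sample_response)
```

Logging `summary` per query is how I've been trying to spot which requests get shards skipped and which don't.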

Conversely, I often find that no shards are skipped, even when I know some should be.

I've had to go back to a most-unfortunate workaround of late, wherein I expand out index names, especially for {CategoryId}. This improves query time substantially; more than the few ticks the _field_stats call would have cost me.
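The workaround looks roughly like the sketch below, assuming the index-name-{CategoryId}-{yyyyMM} pattern above. The helper names, category IDs and dates are hypothetical; this is not my production code.

```python
from datetime import date


def months_between(start, end):
    """Yield (year, month) pairs from start's month through end's month, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def expand_indices(prefix, category_ids, start, end):
    """Build explicit index names of the form prefix-{CategoryId}-{yyyyMM}."""
    return [
        f"{prefix}-{cid}-{y:04d}{m:02d}"
        for cid in category_ids
        for y, m in months_between(start, end)
    ]


# Example values only: two categories, May through July 2018.
names = expand_indices("index-name", [17, 42], date(2018, 5, 20), date(2018, 7, 24))

# Comma-separated target for the search URL path.
target = ",".join(names)
```

The search then goes to `target` instead of a wildcard like index-name-*. If some category/month combinations have no index, the request presumably also needs ignore_unavailable=true so that missing names don't fail the whole search.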

Is there a way for me to troubleshoot this pre-filter phase? Does it apply only to range queries, and not term/terms ones?

I've tried running the profile option, and tons of indices show up. But I don't know whether it's truly damning for an index to show up in that, or if pre-filter includes it there. They don't look like they're being filtered out--they have all the usual nodes that an index with no match would have. But then I'm not sure.

So yeah, basically just looking for docs (I haven't found any reference to the "skipped" response field at all), or anecdotal suggestions, or anything that can help me understand which shards are skipped and why, and which aren't and why.

I'm definitely a proponent of the engine being intelligent enough to do this, I just need some more visibility into why it might not be working for me.

Thanks,
Matthew
