Why is IDs query slow?

Nikita_Tovstoles · September 11, 2015, 5:47pm

Thanks for all the suggestions thus far, folks. I think the issue lies in the cost of doing source filtering on matching docs, either via _source= or fields=. The challenge now is figuring out the best way to minimize that cost.

Let me step back:

/programs has many properties, among them id, name, and areas. The latter is a large array, on the order of 40K entries for some docs.
I am interested in retrieving only id, name (and a few other small properties) of docs matching the IDs query. I achieve that via source filtering, i.e. _source: ["id", "name"].
Source filtering appears to be quite expensive and (AFAIK) would not benefit from filter caching even if that were employed, since caches, as I understand it, store references to whole docs as values (not post-source-filter contents, right?).
If I omit source (via _source: false), the query latency drops from 600ms to 4ms.
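
For reference, the two request bodies being compared can be sketched as follows. This is only a sketch: the document IDs are hypothetical placeholders (the actual IDs are not given here), the helper name ids_search_body is made up for illustration, and the code only builds and prints the JSON bodies rather than sending them to a cluster.

```python
import json

# Hypothetical document IDs; the real ones are not listed in this post.
doc_ids = ["1", "2", "3"]


def ids_search_body(ids, source):
    """Build a search request body around an ids query.

    `source` is passed through unchanged as the `_source` option:
    a list of field names turns on source filtering, while False
    tells Elasticsearch not to return _source at all.
    """
    return {
        "size": len(ids),  # ask for every matching doc, not the default page size
        "query": {"ids": {"values": list(ids)}},
        "_source": source,
    }


# Slow variant from the post: filter _source down to a few small fields.
filtered_body = ids_search_body(doc_ids, ["id", "name"])

# Fast variant from the post: skip _source entirely.
no_source_body = ids_search_body(doc_ids, False)

print(json.dumps(filtered_body, indent=2))
print(json.dumps(no_source_body, indent=2))
```

The only difference between the two bodies is the _source value, which is what makes the 600ms vs 4ms gap point at the fetch phase rather than at the IDs query itself.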

If I am right, I am surprised that source filtering is so expensive in this case, given that I am filtering by exact property names (as opposed to wildcards) and only filtering 20 docs. Is the bulk of the cost in constructing a key-value map from _source on each query before applying source filtering?
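
To make the hypothesis concrete, here is a toy illustration in plain Python. It is not Elasticsearch's actual implementation, just a model of the "parse everything, then pick two keys" pattern being asked about. The shape of each areas entry and the helper name filter_source are assumptions made for the example.

```python
import json
import time

# Synthetic stand-in for one /programs document. The structure of each
# `areas` entry is assumed; only the ~40K entry count comes from the post.
raw_source = json.dumps({
    "id": "p1",
    "name": "Example program",
    "areas": [{"code": i, "label": "area-%d" % i} for i in range(40000)],
})


def filter_source(raw, wanted):
    """Naive source filtering: parse the whole document into a map,
    then keep only the requested top-level keys."""
    parsed = json.loads(raw)  # the large `areas` array is parsed too
    return {key: parsed[key] for key in wanted if key in parsed}


start = time.perf_counter()
result = filter_source(raw_source, ["id", "name"])
elapsed_ms = (time.perf_counter() - start) * 1000

print(result)
print("parsed %d bytes in %.1f ms" % (len(raw_source), elapsed_ms))
```

In this model the cost scales with the size of the whole document, not with the number or size of the fields requested, which would be consistent with the latency difference if Elasticsearch does something similar per hit.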

Is there any way to lower the cost of source filtering? Or do I need to create a separate type mapping that contains only the subset of programs' properties relevant to this query?

Thanks,
-nikita
