Why is the IDs query slow?

Thanks for all the suggestions thus far, folks. I think the issue lies in the cost of doing source filtering on matching docs, whether via _source= or fields=. The challenge now is figuring out the best way to minimize that cost.

Let me step back:

  • /programs has many properties, among them id, name, and areas. The last is a large array - on the order of 40K entries for some docs
  • I am interested in retrieving only id, name (and a few other small properties) of docs matching the IDs query. I achieve that via source filtering, i.e. _source: ["id", "name"]
  • source filtering appears to be quite expensive and (AFAIK) does not benefit from filter caching - even if that were employed - since caches, as I understand it, store references to whole docs as values (not post-source-filter contents, right?)
  • if I omit the source (via _source: false), the query latency drops from 600ms to 4ms
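
For concreteness, here is a minimal Python sketch of the two request bodies being compared. The ID values are made-up placeholders and build_ids_query is a hypothetical helper, not part of any client library; only the ids query and the two _source settings are taken from the description above.

```python
import json


def build_ids_query(ids, source):
    """Build an IDs-query request body (hypothetical helper).

    `source` is passed through unchanged as the `_source` option:
    a list of field names to keep, or False to skip _source entirely.
    """
    ids = list(ids)
    return {
        "size": len(ids),  # ask for every requested doc, not just the default page
        "query": {"ids": {"values": ids}},
        "_source": source,
    }


# Placeholder IDs, for illustration only.
doc_ids = ["1", "2", "3"]

filtered_body = build_ids_query(doc_ids, ["id", "name"])  # the slow variant
no_source_body = build_ids_query(doc_ids, False)          # the fast variant

print(json.dumps(filtered_body, indent=2))
print(json.dumps(no_source_body, indent=2))
```

The two bodies differ only in the _source value, so whatever accounts for the latency gap should be in how _source is loaded and filtered rather than in the ID lookup itself.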

If I am right, I am surprised that source filtering is so expensive in this case, given that I am filtering by exact property names (as opposed to wildcards) and we're only filtering 20 docs. Is the bulk of the cost in constructing a key-value map from _source on each query before applying source filtering?
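
To illustrate what I suspect is happening, here is a rough Python sketch. It is only a model of my hypothesis, not Elasticsearch's actual code: filter_source is a hypothetical function, and the synthetic doc with 40K integer entries in areas merely stands in for one of the large programs. The point is that, even with exact field names, the whole _source has to be parsed into a map before two small fields can be picked out.

```python
import json
import time


def filter_source(raw_source, includes):
    """Parse a raw JSON _source string, then keep only the named fields.

    Hypothetical model: the full document is parsed first, so the cost
    scales with the size of the whole doc, not with the fields kept.
    """
    parsed = json.loads(raw_source)  # builds the full key-value map, areas included
    return {name: parsed[name] for name in includes if name in parsed}


# Synthetic stand-ins: one doc with a 40K-entry array, one without it.
big_raw = json.dumps({"id": 1, "name": "example", "areas": list(range(40000))})
small_raw = json.dumps({"id": 1, "name": "example"})

for label, raw in (("big", big_raw), ("small", small_raw)):
    start = time.perf_counter()
    result = filter_source(raw, ["id", "name"])
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(label, result, "%.3f ms" % elapsed_ms)
```

Both calls return the same two fields, but the first one has to parse the 40K-entry array just to throw it away. Multiplied across 20 matching docs, that kind of work could plausibly account for most of the latency I am seeing.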

Is there any way to lower the cost of source filtering? Or do I need to create a separate type mapping that contains only the subset of programs' properties relevant to this query?
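
The separate-mapping option would amount to indexing a slimmed-down copy of each program alongside the full one. A minimal sketch, assuming a hypothetical to_summary helper and a made-up field list; how the copy gets indexed and kept in sync with the original is left out.

```python
# Fields to carry over into the slim copy (assumed; extend as needed).
SUMMARY_FIELDS = ("id", "name")


def to_summary(program, fields=SUMMARY_FIELDS):
    """Return a copy of `program` containing only the listed fields."""
    return {name: program[name] for name in fields if name in program}


# Synthetic program doc, for illustration only.
program = {"id": 7, "name": "example program", "areas": list(range(40000))}

summary = to_summary(program)
print(summary)
```

The slim doc never contains areas, so fetching its _source would not involve the large array at all. The trade-off is maintaining two copies of id, name and the other small properties.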

Thanks,
-nikita