Terms lookup - broken by design

Hi,
I need to exclude a large list of userids from a search request and want to optimize that by using a separate index holding all the userids that need to be skipped.

https://www.elastic.co/guide/en/elasticsearch/reference/6.0/query-dsl-terms-query.html

I want to use the terms lookup mechanism explained there, with a "skipped" user index.

The problem is that the list of skipped users can get very large, for example 30K ids, so adding one more userid to the "skipped" index
always results in a complete reindex of that document => high IO and a slow index over time.
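To make the setup concrete, here is a minimal sketch of the two request bodies involved, assuming the 6.x syntax. The index name `skipped`, the type `_doc`, the array field `skipped_ids` and the searched field `userid` are placeholders I made up, not names from my real mapping.

```python
import json


def build_terms_lookup_exclusion(owner_id, index="skipped", doc_type="_doc",
                                 path="skipped_ids", field="userid"):
    """Search body that excludes every userid listed in ONE lookup document.

    All names here are hypothetical; the lookup document id is assumed to be
    the id of the user who owns the skip list.
    """
    return {
        "query": {
            "bool": {
                "must_not": {
                    "terms": {
                        field: {
                            "index": index,
                            "type": doc_type,
                            "id": str(owner_id),
                            "path": path,
                        }
                    }
                }
            }
        }
    }


def build_append_skip_update(skipped_id):
    """Body for the _update API that appends one id to the array.

    Assumes the hypothetical field name skipped_ids. Even though only one
    value is sent, the whole document is reindexed internally, which is the
    IO cost described above.
    """
    return {
        "script": {
            "lang": "painless",
            "source": "ctx._source.skipped_ids.add(params.id)",
            "params": {"id": skipped_id},
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_terms_lookup_exclusion(42), indent=2))
    print(json.dumps(build_append_skip_update(1001), indent=2))
```

The first body would go to `_search` on the main index, the second to `_update` on the lookup document.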

Are there plans to change this terms lookup so it runs a search rather than a single-document lookup?
It would be more logical to create a new document for every "skip" instead of storing them all in one document.
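Until something like that exists, the one-document-per-skip idea can only be approximated on the client side with two requests. A rough sketch, assuming a hypothetical skip index whose documents carry `owner_id` and `skipped_id` fields (both names invented for illustration):

```python
def build_skip_search(owner_id, size=10000):
    """Step 1: search body that fetches one small document per skip.

    owner_id and skipped_id are hypothetical field names. The default
    index.max_result_window is 10000, so a 30K list would need scroll
    or several pages instead of a single request.
    """
    return {
        "size": size,
        "_source": ["skipped_id"],
        "query": {"term": {"owner_id": owner_id}},
    }


def extract_skipped_ids(response):
    """Pull the skipped ids out of a standard search response dict."""
    return [hit["_source"]["skipped_id"] for hit in response["hits"]["hits"]]


def build_exclusion_query(skipped_ids, field="userid"):
    """Step 2: main search body with the collected ids in a plain terms query."""
    return {
        "query": {
            "bool": {
                "must_not": {"terms": {field: list(skipped_ids)}}
            }
        }
    }
```

This keeps each skip a cheap insert of one small document, but it costs an extra round trip and the full id list still travels in the second request.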

Or are there other solutions that I haven't thought of?

Thx J

Thousands of unique ids (regardless of where you gather them from) = slow queries if you assume spinning media and their average seek times.

For now I've always used the ids filter: I fetch the ids from an external source (redis/memcached, with fallback on the DB) and pass them along in the query request in ES 2.x.
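Roughly what that looks like, as a simplified sketch rather than my actual code. The `cache_get`, `db_get` and `cache_set` callables are hypothetical stand-ins for the redis/memcached and DB clients:

```python
def load_skipped_ids(owner_id, cache_get, db_get, cache_set=None):
    """Return the skip list from the cache, falling back on the DB.

    cache_get, db_get and cache_set are hypothetical callables: cache_get
    returns None on a miss, db_get always returns an iterable of ids.
    """
    ids = cache_get(owner_id)
    if ids is None:
        ids = db_get(owner_id)
        if cache_set is not None:
            # Repopulate the cache so the next request skips the DB.
            cache_set(owner_id, ids)
    return list(ids)


def build_ids_exclusion(skipped_ids):
    """Search body excluding documents whose _id is in the skip list."""
    return {
        "query": {
            "bool": {
                "must_not": {
                    "ids": {"values": [str(i) for i in skipped_ids]}
                }
            }
        }
    }
```

Note that the `ids` query matches on the document `_id`, so this only works when the userid is also the `_id` of the documents being searched.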

Still, it's something that needs to be updated and maintained in an extra system.

I was looking to prepare an upgrade to 5.x or 6.x and wanted to make my queries a bit cleaner and use only one system, but if there's no good solution for this using just Elasticsearch, I'll continue to use the ids filter with a dependency on an external system for passing the bigger lists.

Might be worth exploring on the roadmap for Elastic,
I think I'm not the only one who needs to cross join on different indexes efficiently
(an outer join in this case).
