Michael, Karel, thank you both for your ideas! I had similar thoughts on
this issue. If at all possible, I'll store the security information in the
index. I just want to be prepared for the occasion that this won't be
possible. In that case, most likely a list of "authorized" roles would be
stored with each document in the index. At query time, for each possible
search result, I would have to ask the external module: "Does this user
have any of the 'authorized' roles on this document?"
As Karel mentions, doing this "post-search" brings a lot of usability
issues. That's why Solr's PostFilter looks appealing. The name is actually
slightly misleading: it's not a "post-search" filter of the kind mentioned.
Instead, it is described as "a mechanism to further filter documents after
they have already gone through the main query and other filters. This is
appropriate for filters with a very high cost."
(Source: PostFilter, Solr 4.3.0 API)
As it's still done "in the search engine", facets, pagination/limit/offset
etc. should work as usual.
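To make the evaluation order concrete, here is a minimal in-memory sketch of the idea in Python. It is not Solr or elasticsearch code: the function search_with_post_filter, its parameters, and the document fields are all hypothetical and chosen for illustration only. The point is that the costly authorization check runs only on documents that already passed the main query and the cheap filters, and that totals and pagination are computed afterwards, so they reflect only authorized documents.

```python
from typing import Callable, Dict, Iterable, List

Doc = Dict[str, object]


def search_with_post_filter(
    docs: Iterable[Doc],
    matches_query: Callable[[Doc], bool],
    cheap_filters: List[Callable[[Doc], bool]],
    expensive_check: Callable[[Doc], bool],
    offset: int = 0,
    limit: int = 10,
) -> Dict[str, object]:
    """Apply the main query and cheap filters first, the costly check last."""
    hits: List[Doc] = []
    expensive_calls = 0
    for doc in docs:
        if not matches_query(doc):
            continue
        if not all(f(doc) for f in cheap_filters):
            continue
        # Only documents that survived everything else reach the costly check.
        expensive_calls += 1
        if expensive_check(doc):
            hits.append(doc)
    # Counting and paging happen after authorization, so they stay consistent.
    return {
        "total": len(hits),
        "hits": hits[offset:offset + limit],
        "expensive_calls": expensive_calls,
    }


# Hypothetical sample data; "authorized_roles" mirrors the per-document role list.
docs = [
    {"id": 1, "title": "sales report", "type": "report", "authorized_roles": ["sales"]},
    {"id": 2, "title": "sales forecast", "type": "report", "authorized_roles": ["management"]},
    {"id": 3, "title": "holiday plan", "type": "memo", "authorized_roles": ["sales"]},
    {"id": 4, "title": "sales memo", "type": "memo", "authorized_roles": ["sales"]},
]
user_roles = {"sales"}

result = search_with_post_filter(
    docs,
    matches_query=lambda d: "sales" in d["title"],
    cheap_filters=[lambda d: d["type"] == "report"],
    # Stand-in for the call to the external authorization module.
    expensive_check=lambda d: bool(user_roles & set(d["authorized_roles"])),
)
print(result["total"], result["expensive_calls"])
```

In this sketch the stand-in for the external module is called twice for four documents, because two documents were already excluded by the query and the cheap filter.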
Perhaps my question then really is: What's the proper way of implementing a
custom non-caching filter for elasticsearch? And how can I use it in a query
so that it is evaluated last?
Peter
On Saturday, May 25, 2013 9:11:37 UTC+2, Karel Minařík wrote:
If I understand correctly, you want to restrict people to see only the
documents they're allowed to see? First, as Michael writes, filtering the
returned results might severely impact the usability/experience for users
(no results, etc.).
I think the solution depends on how you embed the information into the
documents.
For instance, in the “each user must see only ‘their’ documents” case, you
would simply add a user_id field in the document, and filter on this field,
preferably with a filtered query. For the “only people in ‘sales’
department can see these documents”, you'd use a similar approach,
embedding the department names/codes in the document; when the user
performs a search, you probably have information about departments they're
part of, and update the query accordingly.
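As a rough illustration of the two cases above, the following Python sketch builds request bodies using the filtered query form that elasticsearch offered at the time of this thread (later versions replaced it with a bool query with a filter clause). The field names content and department, the helper function names, and the sample values are assumptions made for the example; only user_id comes from the text above.

```python
import json
from typing import Dict, List


def own_documents_query(text: str, user_id: str) -> Dict:
    """Full-text query restricted to documents owned by one user."""
    return {
        "query": {
            "filtered": {
                # "content" is an assumed name for the searchable text field.
                "query": {"match": {"content": text}},
                "filter": {"term": {"user_id": user_id}},
            }
        }
    }


def department_documents_query(text: str, departments: List[str]) -> Dict:
    """Full-text query restricted to documents of the user's departments."""
    return {
        "query": {
            "filtered": {
                "query": {"match": {"content": text}},
                # "department" is an assumed field holding department codes;
                # a document matches if it carries any of the listed values.
                "filter": {"terms": {"department": departments}},
            }
        }
    }


body = department_documents_query("quarterly forecast", ["sales", "marketing"])
print(json.dumps(body, indent=2))
```

The application would look up the current user's id or departments before the search and pass them in, so the restriction is applied inside the search engine rather than on the returned results.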
If by “any form of index-side caching of the authorization information is
out of question” you mean that you want to filter the results in 100%
realtime, then I'm afraid your only solution is to perform a query, get
results, filter them, look if you've got enough or not, if not, repeat the
process. I have a bit of a hard time picturing this requirement being
accepted as reasonable.
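The query, filter, repeat process described above could be sketched roughly as follows. This is a Python illustration under stated assumptions, not client code for elasticsearch: search_page and is_authorized are hypothetical callables standing in for a paged search request and for the external authorization module.

```python
from typing import Callable, Dict, List

Doc = Dict[str, object]


def collect_authorized(
    search_page: Callable[[int, int], List[Doc]],
    is_authorized: Callable[[Doc], bool],
    wanted: int,
    page_size: int = 10,
) -> List[Doc]:
    """Fetch pages of raw hits until enough authorized documents are found."""
    authorized: List[Doc] = []
    offset = 0
    while len(authorized) < wanted:
        page = search_page(offset, page_size)
        if not page:
            break  # the raw result set is exhausted
        for doc in page:
            if is_authorized(doc):
                authorized.append(doc)
                if len(authorized) == wanted:
                    return authorized
        offset += len(page)
    return authorized


# Hypothetical stand-ins: 25 raw hits, every fourth one is visible to the user.
raw_hits = [{"id": i} for i in range(25)]
requested_offsets: List[int] = []


def fake_search_page(offset: int, size: int) -> List[Doc]:
    requested_offsets.append(offset)  # record each round trip
    return raw_hits[offset:offset + size]


visible = collect_authorized(
    fake_search_page,
    is_authorized=lambda d: d["id"] % 4 == 0,
    wanted=5,
)
print([d["id"] for d in visible], requested_offsets)
```

Note that with this approach the total hit count and any facets returned by the search engine still describe the unfiltered result set, which is part of the usability problem mentioned earlier.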
Karel