Filter on deeply nested data?


(David Haimson) #1

Our data is stored in MongoDB 2.4.8, and indexed to ElasticSearch 0.90.7
using the ElasticSearch MongoDB River 1.7.3.

Our data indexes correctly, and I can successfully search the fields we
want to search. But I also need to filter on permissions, since we only
want to return results the calling user can actually read.

In the code on our server, I have the calling user's authorizations as an
array, for example:

[ "Role:REGISTERED_USER", "Account:52c74b25da06f102c90d52f4", "Role:USER",
"Group:52cb057cda06ca463e78f0d7" ]

An example of the unit data we're searching follows:

{
    "_id" : ObjectId("52dffbd6da06422559386f7d"),
    "content" : "various stuff",
    "ownerId" : ObjectId("52d96bfada0695fcbdb41daf"),
    "acls" : [
        {
            "accessMap" : {},
            "sourceClass" : "com.bulb.learn.domain.units.PublishedPageUnit",
            "sourceId" : ObjectId("52dffbd6da06422559386f7d")
        },
        {
            "accessMap" : {
                "Role:USER" : {
                    "allow" : [
                        "READ"
                    ]
                },
                "Account:52d96bfada0695fcbdb41daf" : {
                    "allow" : [
                        "CREATE",
                        "READ",
                        "UPDATE",
                        "DELETE",
                        "GRANT"
                    ]
                }
            },
            "sourceClass" : "com.bulb.learn.domain.units.CompositeUnit",
            "sourceId" : ObjectId("52dffb54da06422559386f57")
        }
    ]
}

In the sample data above, I have replaced all the searchable content with
"content" : "various stuff".

The authorization data is in the "acls" array. The filter I need to write
would do the following (in English):

    pass all units where the "acls" array
        contains an "accessMap" object
            that contains a property whose name is one of the user's authorization strings
                and whose "allow" property contains "READ"
                and whose "deny" property does not contain "READ"

In the example above, the user has "Role:USER" authorization, and this unit
has an accessMap that has "Role:USER", which contains "allow", which
contains "READ", and "Role:USER" contains no "deny". So this unit would
pass the filter.
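
The rule above can be sketched as a plain Python predicate. This only illustrates the intended semantics and is not Elasticsearch code. The function name can_read is hypothetical, ObjectId values are written as plain strings, and treating a missing "deny" list as "nothing denied" is an assumption taken from the example.

```python
from typing import Any, Dict, Iterable


def can_read(unit: Dict[str, Any], authorizations: Iterable[str]) -> bool:
    """Return True if some accessMap entry grants READ to one of the authorizations."""
    wanted = set(authorizations)
    for acl in unit.get("acls", []):
        access_map = acl.get("accessMap") or {}
        for principal, rights in access_map.items():
            if principal not in wanted:
                continue
            allow = rights.get("allow", [])
            # Assumption: an absent "deny" list means nothing is denied.
            deny = rights.get("deny", [])
            if "READ" in allow and "READ" not in deny:
                return True
    return False


# The sample unit from the post, reduced to the fields the rule looks at.
unit = {
    "_id": "52dffbd6da06422559386f7d",
    "content": "various stuff",
    "acls": [
        {"accessMap": {}},
        {
            "accessMap": {
                "Role:USER": {"allow": ["READ"]},
                "Account:52d96bfada0695fcbdb41daf": {
                    "allow": ["CREATE", "READ", "UPDATE", "DELETE", "GRANT"]
                },
            }
        },
    ],
}

user_auths = [
    "Role:REGISTERED_USER",
    "Account:52c74b25da06f102c90d52f4",
    "Role:USER",
    "Group:52cb057cda06ca463e78f0d7",
]

print(can_read(unit, user_auths))  # True, via the "Role:USER" entry
```

The difficulty for a search filter is that the authorization strings appear as property names, not as values, so every distinct principal becomes its own field path in the index.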

I am not seeing how to write a filter for this using ElasticSearch.

I get the impression that there are two ways to deal with nested arrays
like this: "nested", or "has_child" (or "has_parent").

We are reluctant to use the "nested" filter because it apparently requires
that the whole block be re-indexed when any of the data changes. Searchable
content and authorization data can change at any time, in response to user
actions.

It looks to me as though in order to use "has_child" or "has_parent", the
authorization data would have to be separate from the unit data (in a
different collection?), and when a node is indexed, it would have to have
its parent or child specified. I don't know whether the ElasticSearch
MongoDB River is capable of doing this.

So is this even possible? Or should we rearrange the authorization data?
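
One possible rearrangement, sketched here under several assumptions, is to precompute a flat list of the principals that may read a unit and filter on that list. The field name readPrincipals and both helper functions are hypothetical. The sketch assumes the application writes the derived field into the MongoDB document so that it gets indexed along with the rest of the unit, and that the field is mapped as not_analyzed so strings such as "Role:USER" are matched verbatim. The request body uses a filtered query with a terms filter, which should be checked against the documentation for the Elasticsearch version in use.

```python
from typing import Any, Dict, Iterable, List


def read_principals(unit: Dict[str, Any]) -> List[str]:
    """Collect every principal whose entry allows READ and does not deny READ."""
    principals = set()
    for acl in unit.get("acls", []):
        for principal, rights in (acl.get("accessMap") or {}).items():
            allow = rights.get("allow", [])
            deny = rights.get("deny", [])  # assumption: missing "deny" denies nothing
            if "READ" in allow and "READ" not in deny:
                principals.add(principal)
    return sorted(principals)


def build_search_body(text: str, authorizations: Iterable[str]) -> Dict[str, Any]:
    """Build a request body: full-text match on "content", filtered by permission."""
    return {
        "query": {
            "filtered": {
                "query": {"match": {"content": text}},
                # Passes units whose readPrincipals shares at least one value
                # with the calling user's authorizations.
                "filter": {"terms": {"readPrincipals": list(authorizations)}},
            }
        }
    }


# ACL part of the sample unit from the post.
unit = {
    "acls": [
        {"accessMap": {}},
        {
            "accessMap": {
                "Role:USER": {"allow": ["READ"]},
                "Account:52d96bfada0695fcbdb41daf": {
                    "allow": ["CREATE", "READ", "UPDATE", "DELETE", "GRANT"]
                },
            }
        },
    ]
}

user_auths = [
    "Role:REGISTERED_USER",
    "Account:52c74b25da06f102c90d52f4",
    "Role:USER",
    "Group:52cb057cda06ca463e78f0d7",
]

derived = read_principals(unit)
body = build_search_body("various stuff", user_auths)
print(derived)
print(body)
```

With this layout the allow/deny logic runs once when the unit is written, and the search side only needs a set-overlap check. The cost is that the derived field has to be recomputed whenever a unit's ACLs change.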

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9aedee9f-5cf1-4e23-908a-8ceefa9b3493%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Hendrik) #2

Maybe this helps: https://github.com/salyh/elasticsearch-security-plugin

In reply to David Haimson's post of Thursday, 23 January 2014, 17:37:20 UTC+1.

