I have a situation and I'm trying to figure out if a parent-child setup is
the way to go. We have a document called a stream_item. It has some natural
attributes that are true no matter where we show that item. The item can
belong to more than one stream, and within each stream it may have some
stream-specific (synthetic) attributes. One example is a read/unread
flag. If a user marks an item as read within one stream, another user still
needs to see the item as unread. If the first user wants to search for all
unread items that match a particular phrase, I want to do a filtered search
in ES that filters for all documents that are unread when in the context of
this stream. I could certainly have a separate document per stream per
stream item, but I'm trying to avoid document duplication. Could the
"read/unread" status live in a stream-specific child document? Would this
be slower than simply duplicating the document?
Lee
"It doesn't matter whether you are liberal or conservative, but it's
dangerous to always think with exclamation points instead of question
marks."
by Marty Beckerman
Yes, it can certainly be a child document. The has_child filter is more heavyweight than your "typical filter", so you will need to run some perf tests to see whether it works for you.
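As a rough illustration of the parent-child approach discussed here, the sketch below builds the two request bodies involved: a mapping that makes a child type point at stream_item, and a filtered search that keeps only parents with an unread child in a given stream. It follows the older Elasticsearch query DSL (`_parent` mapping, `filtered` query, `has_child` filter, `match_phrase`); later releases changed or removed several of these, so check it against the version you run. The child type name `stream_state` and the fields `stream_id`, `read`, and `body` are assumptions made up for the example, and it assumes one child document exists per item per stream.

```python
import json


def child_mapping():
    """Mapping for a hypothetical child type whose parent is stream_item."""
    return {
        "stream_state": {  # hypothetical child type name
            "_parent": {"type": "stream_item"},
            "properties": {
                # hypothetical fields holding the per-stream state
                "stream_id": {"type": "string", "index": "not_analyzed"},
                "read": {"type": "boolean"},
            },
        }
    }


def unread_search(stream_id, phrase):
    """Search body: phrase match on the parent, unread check on the child."""
    return {
        "query": {
            "filtered": {
                # Natural attributes live on the parent; "body" is a
                # hypothetical text field on stream_item.
                "query": {"match_phrase": {"body": phrase}},
                # Keep only parents that have a child marked unread
                # for this particular stream.
                "filter": {
                    "has_child": {
                        "type": "stream_state",
                        "query": {
                            "bool": {
                                "must": [
                                    {"term": {"stream_id": stream_id}},
                                    {"term": {"read": False}},
                                ]
                            }
                        },
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(child_mapping(), indent=2))
    print(json.dumps(unread_search("stream-42", "quarterly report"), indent=2))
```

Marking an item as read then means updating one small child document rather than reindexing a copy of the whole item, which is the trade-off to weigh against the extra cost of has_child at query time.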
On Monday, March 28, 2011 at 11:05 PM, Lee Parker wrote: