Questions about mapping a tree structure


(Matt Schoen) #1

Hi there. I have a document type that contains a lot of self-similar data. Here's a little sample to give you an idea:

Root:
  Motif:
    Motif Properties
    Array of sectionInfo
    Children(motif):
      Motif Properties
      Array of sectionInfo
  Other motifs
  Other root properties

So... I hope that makes sense. I can include some sample documents if needed, but they're pretty big and complicated. The two mappings that will come up often are motif and sectionInfo, which are classes in my program.

My question is whether the motif object should be a nested type. The documentation said something about nested objects being stored as separate documents, but since the structure doesn't mirror the structure of the whole document, I'm not sure that's what I want.

My other thought was to just make a variable in my mapping setup script to represent those structures and reuse it wherever needed. This still doesn't solve the problem of a motif having children who have children, etc.

Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Adrien Grand) #2

Hi,

It really depends on how recursive your tree structure is. The underlying
storage of Elasticsearch is flat and doesn't allow for storing deeply
recursive tree structures. There are two ways to work around this flat
structure (which otherwise brings lots of benefits). The first is nested
documents, which consist of indexing a parent document and its children
sequentially on disk so that scoring from/to parents/children or faceting
can be done efficiently. The second is parent/child relationships, where
the mapping between parents and children is stored in memory on top of the
index without any constraint on the on-disk storage. But in both cases,
this only allows for storing 1->n relationships and doesn't allow for
recursion.
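
To make the "no recursion" point concrete: a mapping is a finite
structure, so a self-similar motif can only be described down to a depth
chosen in advance. The sketch below is one possible way to generate such
a mapping fragment from a setup script. The field names "sectionInfo" and
"children" and the helper motif_mapping are hypothetical, and the exact
mapping layout accepted by a given Elasticsearch version is an assumption
to check against its documentation.

```python
import json


def motif_mapping(max_depth):
    """Build a mapping fragment for a motif whose children nest at most
    max_depth levels deep. Field names are illustrative assumptions."""
    mapping = {
        "type": "nested",
        "properties": {
            # Each motif carries an array of sectionInfo objects.
            "sectionInfo": {"type": "nested"},
        },
    }
    if max_depth > 0:
        # Reuse the same structure one level down, until the depth runs out.
        mapping["properties"]["children"] = motif_mapping(max_depth - 1)
    return mapping


# Print a mapping fragment that allows two levels of child motifs.
print(json.dumps(motif_mapping(2), indent=2))
```

Motifs deeper than the chosen depth would not be covered by this
fragment, which is the limitation described above.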

So storing a tree structure in Elasticsearch might be challenging. But it
also depends on how you plan to query your data. For example, if you don't
need scores to propagate from children to parents or vice-versa, you could
just index all motifs independently.
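
As a rough illustration of indexing all motifs independently, the sketch
below walks a motif tree and emits one flat document per motif, keeping
the tree shape only as id references. The field names "id", "parent_id",
"root_id" and "children", and the helper flatten_motifs, are assumptions
made for this example. It only prepares the documents and does not talk
to Elasticsearch.

```python
from itertools import count


def flatten_motifs(root_id, motifs):
    """Yield one flat document per motif in depth-first order.

    The tree structure survives only as id / parent_id references.
    """
    ids = count(1)

    def walk(motif, parent_id):
        motif_id = next(ids)
        # Copy every property except the recursive "children" list.
        doc = {k: v for k, v in motif.items() if k != "children"}
        doc.update({"id": motif_id, "parent_id": parent_id, "root_id": root_id})
        yield doc
        for child in motif.get("children", []):
            yield from walk(child, motif_id)

    for motif in motifs:
        yield from walk(motif, None)


# Hypothetical sample tree: a -> b -> c, plus a sibling d.
tree = [
    {"name": "a", "children": [{"name": "b", "children": [{"name": "c"}]}]},
    {"name": "d"},
]
docs = list(flatten_motifs("root-1", tree))
```

Each resulting document can then be indexed on its own, and a whole tree
can be fetched again by filtering on root_id.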

On Wed, Sep 4, 2013 at 9:44 PM, Matt Schoen mtschoen@gmail.com wrote:


--
Adrien Grand



(Matt Schoen) #3

Hm. I haven't heard of faceting. I'll look into that. To be honest, I'll probably just be calling these documents up based on an ID stored in the root, so below that level, the data may as well be a string. Is there a way to just say "don't worry about the structure here"?



(Adrien Grand) #4

Hi,

On Wed, Sep 4, 2013 at 11:41 PM, Matt Schoen mtschoen@gmail.com wrote:

Hm. I haven't heard of faceting. I'll look into that. To be honest, I'll
probably just be calling these documents up based on an ID stored in the
root, so below that level, the data may as well be a string. Is there a way
to just say "don't worry about the structure here"?

This is what happens by default: documents are flattened. If you look at
the first example on [1], queries to Elasticsearch won't be able to know
whether count=4 is associated with blue or green. To Elasticsearch, it will
be as if there were a field called "name" which has two values, "blue" and
"green", and a field called "count" which has two values, "4" and "6".

[1] http://www.elasticsearch.org/guide/reference/mapping/nested-type/
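
The flattening described above can be imitated in a few lines of plain
Python. This is only a simulation of the effect, not Elasticsearch's
actual code. The wrapping field name "obj1" and the helpers flatten and
matches are assumptions for the example. The blue/green values come from
the description above.

```python
from collections import defaultdict


def flatten(doc):
    """Collapse nested objects and arrays into dotted, multi-valued fields."""
    fields = defaultdict(list)

    def walk(value, path):
        if isinstance(value, dict):
            for key, sub in value.items():
                walk(sub, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            # Array items share the same path, so their values get mixed.
            for item in value:
                walk(item, path)
        else:
            fields[path].append(value)

    walk(doc, "")
    return dict(fields)


def matches(flat, terms):
    """True if every requested field contains the requested value."""
    return all(value in flat.get(field, []) for field, value in terms.items())


doc = {"obj1": [{"name": "blue", "count": 4}, {"name": "green", "count": 6}]}
flat = flatten(doc)
# flat == {"obj1.name": ["blue", "green"], "obj1.count": [4, 6]}

# "green with count 4" matches the flattened fields although no single
# object in the original array has that combination.
false_positive = matches(flat, {"obj1.name": "green", "obj1.count": 4})  # True
per_object = any(o["name"] == "green" and o["count"] == 4 for o in doc["obj1"])  # False
```

If the data below the root is only ever fetched by ID and never queried
on such combinations, this loss of association does no harm.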

--
Adrien Grand


