Clarification on has_child filter memory requirements

Based on the official docs (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-has-child-filter.html):

{quote}
memory considerations

With the current implementation, all _parent field values and all _id field values of parent documents are loaded into memory (heap) via field data in order to support fast lookups, so make sure there is enough memory for it.
{/quote}

Does this mean that all the parent docs will be loaded into memory, or only the ones matching the filter? If the former is true, then it would mean that one should keep the size of the parent objects to a minimum, right? In addition, say has_child is part of a conjunction (regular filter AND has_child): would ES still load all the parent docs, or only the ones that matched the first filter?
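
To make the conjunction case concrete, a request body of roughly the following shape is what the question is about. This is only a sketch: it assumes the 1.x-era `filtered` query with an `and` filter, and the helper name as well as the field and type names (`status`, `comment`, `text`) are hypothetical, not taken from any real mapping.

```python
import json


def conjunction_with_has_child(parent_field, parent_value, child_type, child_query):
    """Build a request body that ANDs a regular term filter with has_child.

    Assumes the 1.x-style "filtered" query and "and" filter; the helper
    itself is hypothetical and only assembles a plain dict.
    """
    return {
        "query": {
            "filtered": {
                "filter": {
                    "and": [
                        # Regular filter on the parent documents.
                        {"term": {parent_field: parent_value}},
                        # has_child filter: parents with at least one matching child.
                        {"has_child": {"type": child_type, "query": child_query}},
                    ]
                }
            }
        }
    }


# Hypothetical field and type names, for illustration only.
body = conjunction_with_has_child(
    "status", "active", "comment", {"term": {"text": "elasticsearch"}}
)
print(json.dumps(body, indent=2))
```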

Thanks,

Drew

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/FE901831-FB74-4F89-A313-16C1C08BF0A5%40venarc.com.
For more options, visit https://groups.google.com/d/optout.

Hey,

not all parent documents (and not the data), just their ids. Still this can
accumulate, which is the reason why you should monitor the size of that
data structure (exposed in the nodes stats).
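
As a rough sketch of that monitoring, the snippet below pulls the relevant byte counts out of an already-fetched nodes stats response. It assumes the response follows the `nodes -> <node id> -> indices -> <section> -> memory_size_in_bytes` layout, and that the parent/child memory shows up under `id_cache` or `fielddata` depending on the version; the function name and the sample values are made up for illustration and are not real output.

```python
def parent_child_memory(nodes_stats, sections=("id_cache", "fielddata")):
    """Return {node_name: {section: bytes}} from a nodes stats response dict.

    Assumption: each node reports indices.<section>.memory_size_in_bytes.
    Sections missing from the response are reported as 0.
    """
    report = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        indices = node.get("indices", {})
        report[node.get("name", node_id)] = {
            section: indices.get(section, {}).get("memory_size_in_bytes", 0)
            for section in sections
        }
    return report


# Hand-written sample in the assumed shape, not output from a real cluster.
sample = {
    "nodes": {
        "node-id-1": {
            "name": "node-a",
            "indices": {
                "id_cache": {"memory_size_in_bytes": 2048},
                "fielddata": {"memory_size_in_bytes": 4096, "evictions": 0},
            },
        }
    }
}

report = parent_child_memory(sample)
print(report)
```

Running this periodically against the real nodes stats response and watching how the numbers grow as parents are added is one way to see whether the structure is accumulating.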

Hope that helps.

--Alex

Thanks Alex. What do you mean by "not all parent documents (and not the data), just their ids"? What decides which parent document ids get loaded? Also, are the ids that get loaded kept per query, or do they stay around longer? I ask because in our use case we're going to keep adding more and more parents and children.

- Drew

I've updated the docs on memory usage with parent-child. Hopefully more
understandable:

Thanks Clinton. One thing that's still a bit unclear is "how" these doc ids get loaded and stay loaded. Mainly, if you have a use case where you keep adding parent docs, would ES update the cache on every insert, or only at has_child query time?
