Is there any particular reason why the has_child filter does not return
child IDs, possibly behind an option? I mean, surely it must find those
in order to return a matching parent...
I saw this github issue - https://github.com/elasticsearch/elasticsearch/issues/2744 - where
(apparently?) it used to return children ids and then stopped.
I'm currently looking into patching my ES installation in order to return
child IDs. Yes, I can get them by asking for the children of each returned
parent document individually, but that turns one query (returning a batch
of, say, 20) into 21 queries.
The issue you referred to is actually saying that child ids shouldn't be
collected. It makes sense for has_child to remember only the _parent field.
It only needs those to filter the parent query.
If you really want to know the child IDs, you might be able to use a
scoped facet.
You might run into OOM exceptions if you have a lot of child IDs.
Why can't you combine the results of the 1st query (returning 1) to generate
1 larger 2nd query (returning 20)?
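A minimal sketch of what that single follow-up query could look like, in Python. It only builds the request body and does not talk to a cluster. The field name `parent_id` and the helper `build_children_query` are assumptions for illustration: it presumes each child document stores its parent's ID in an ordinary indexed field, which may not match your mapping.

```python
def build_children_query(parent_ids, max_children):
    """Build one search body that matches children of ANY of the given parents.

    Assumption: each child document has a plain field "parent_id"
    (hypothetical name) holding the ID of its parent.
    """
    return {
        # A terms query matches documents whose field equals any listed value.
        "query": {"terms": {"parent_id": list(parent_ids)}},
        # Upper bound on how many child hits come back in this one request.
        "size": max_children,
    }


# Parent IDs as they might come back from the first (has_child) query.
body = build_children_query(["p1", "p2", "p3"], max_children=50)
```

The body would then be sent as a single search request against the child type, so the total stays at two round trips regardless of how many parents the first query returned.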
I'm new in the search field, but to be honest that behaviour does not
make sense to me.
I've referred to the issue because to me it seemed like after the patch ES
stopped returning (potentially needed) information, without any flag to ask
for it.
I'll try to explain myself: ES has to know that parent P has the child C.
Therefore, it has to find a child C first. Therefore, it should know the
child ID at the time of query (with cold cache, but child ID could be
cached as well).
So to me it seems like it throws away information that the caller
potentially needs, even though it has it.
Please correct me if I'm wrong.
Could you please elaborate on the scoped facet approach? I've seen the
passing mention in the documentation, but could not figure out how to use
it to solve my problem.
@Paul:
I've tried doing that, but I haven't found a way.
Suppose ES gives me 10 parent IDs. I can query ES for all documents that
have one of those parent IDs: just do an analogue of a
"WHERE parent_id IN (...) LIMIT 10" query.
Yet that does not guarantee one child for each parent document:
it could give me 10 child documents for the first parent alone.
Is there a better way?
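One possible workaround, sketched in Python under assumptions: over-fetch children in the second query (a limit well above the number of parents) and pick one child per parent on the client. The helper `first_child_per_parent` and the `parent_id` field are hypothetical names; the hit shape (`_id`, `_source`) follows the usual search response layout.

```python
def first_child_per_parent(hits, parent_field="parent_id"):
    """Map each parent ID to the ID of the first child hit seen for it.

    Assumption: every hit's _source contains `parent_field` (hypothetical
    field name) with the ID of the parent document.
    """
    chosen = {}
    for hit in hits:
        parent = hit["_source"][parent_field]
        # setdefault keeps the first child seen and ignores later ones.
        chosen.setdefault(parent, hit["_id"])
    return chosen


# Hand-written hits standing in for the "hits" array of a search response.
hits = [
    {"_id": "c1", "_source": {"parent_id": "p1"}},
    {"_id": "c2", "_source": {"parent_id": "p1"}},
    {"_id": "c3", "_source": {"parent_id": "p2"}},
]
result = first_child_per_parent(hits)
```

This does not remove the underlying problem: if one parent has more children than the limit, other parents can still be missing from the result, so any parent absent from the mapping would need a follow-up query.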