I have a query that collapses on a field holding a hash that at most two entries can share. I need a post filter (or an alternative) that removes results from the final list when the inner hits total is 1 rather than 2. However, the post filter cannot see the inner hits for each entry, so the total is not available to it.
What would be the best way to filter out results that have only one entry within the collapse inner hits?
failed to create query: [nested] failed to find nested object under path [same_hash]
FYI, it is not the contents of the inner hits I am interested in but the total, which would let me filter out any parent with fewer than two inner hits.
I pulled one of the mappings but had to strip out certain elements for security reasons; fundamentally, the contents are structured the same way. I am using Elasticsearch v7.10.
Sorry, I've not had time to experiment with it yet. It looks like the post_filter approach in the referenced post is available from 7.10.
I see the type of someHash is keyword rather than nested which surprised me slightly. Can you give me an example of the data structure (sanitized obviously!)?
Nested is only used for arrays of objects, so that each object can be searched independently. What I am trying to do is query the inner_hits total, which is produced as part of the collapse. If I could query this total (inner_hits.same_hash.hits.total.value), I could filter out what I need, but inner_hits does not seem to be available to the post_filter in this way. So I still need a solution to the above.
I suppose the first question is: does the post filter run against the results after the collapse?
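For reference, here is a minimal Python sketch of the request shape being discussed and of a client-side fallback that reads inner_hits.same_hash.hits.total.value from the response. The function names, the match_all query, and the defaults are assumptions made for illustration; only the field name someHash and the inner hits name same_hash come from this thread. Note that filtering on the client happens after a page has been fetched, so it does not by itself solve pagination.

```python
from typing import Any, Dict, List


def build_collapse_request(hash_field: str = "someHash") -> Dict[str, Any]:
    """Build a search body that collapses on the hash field (hypothetical helper)."""
    return {
        "query": {"match_all": {}},  # placeholder; the real query is not shown in the thread
        "collapse": {
            "field": hash_field,
            # Named inner hits, so the response carries inner_hits.same_hash
            "inner_hits": {"name": "same_hash", "size": 2},
        },
    }


def keep_pairs(
    response: Dict[str, Any],
    inner_hits_name: str = "same_hash",
    min_total: int = 2,
) -> List[Dict[str, Any]]:
    """Keep only collapsed hits whose inner hits total is at least min_total."""
    kept = []
    for hit in response.get("hits", {}).get("hits", []):
        total = (
            hit.get("inner_hits", {})
            .get(inner_hits_name, {})
            .get("hits", {})
            .get("total", {})
            .get("value", 0)  # hits without inner hits count as 0
        )
        if total >= min_total:
            kept.append(hit)
    return kept
```

Because entries are dropped after the search returns, a page filtered this way can contain fewer hits than the requested size.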
Thanks for confirming, @codeMonkey82. I've not been able to get the post_filter to work either. Have you considered using a terms aggregation combined with a bucket_selector to filter on the counts instead of using collapse?
The solution needs to take pagination and sorting of the final result set into account, so using buckets might not work as expected. I do appreciate the response though, so thank you.
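The terms plus bucket_selector idea could look roughly like the Python sketch below, which only builds the request body. The aggregation names (by_hash, only_pairs, docs), the function name, and the max_buckets default are assumptions for illustration; someHash is the keyword field mentioned earlier in the thread.

```python
from typing import Any, Dict


def build_pair_count_request(
    hash_field: str = "someHash", max_buckets: int = 1000
) -> Dict[str, Any]:
    """Build an aggregation-only body that keeps hashes shared by exactly two docs."""
    return {
        "size": 0,  # no top-level hits; results come from the buckets
        "aggs": {
            "by_hash": {
                "terms": {"field": hash_field, "size": max_buckets},
                "aggs": {
                    # Drop every bucket whose document count is not 2
                    "only_pairs": {
                        "bucket_selector": {
                            "buckets_path": {"doc_count": "_count"},
                            "script": "params.doc_count == 2",
                        }
                    },
                    # Return the two documents behind each surviving bucket
                    "docs": {"top_hits": {"size": 2}},
                },
            }
        },
    }
```

Since a hash is shared by at most two entries here, setting min_doc_count to 2 on the terms aggregation should express the same cut-off without a script.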
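On the pagination and sorting concern, one possible direction is a bucket_sort pipeline aggregation, which accepts sort, from, and size over the buckets of its parent. The Python sketch below is untested against this data and rests on several assumptions: sort_field is a hypothetical numeric or date field, the aggregation names are made up, and paging only covers the buckets that the terms aggregation actually returns (bounded by max_buckets).

```python
from typing import Any, Dict


def build_paged_pair_request(
    hash_field: str,
    sort_field: str,
    page: int,
    page_size: int,
    max_buckets: int = 1000,
) -> Dict[str, Any]:
    """Build a body that pages and sorts buckets holding at least two docs."""
    return {
        "size": 0,
        "aggs": {
            "by_hash": {
                "terms": {
                    "field": hash_field,
                    "size": max_buckets,
                    "min_doc_count": 2,  # a hash has at most two docs, so this keeps pairs
                },
                "aggs": {
                    # Per-bucket value to sort on (sort_field is hypothetical)
                    "sort_key": {"max": {"field": sort_field}},
                    "docs": {"top_hits": {"size": 2}},
                    # Sort buckets by sort_key, then slice out one page
                    "page": {
                        "bucket_sort": {
                            "sort": [{"sort_key": {"order": "desc"}}],
                            "from": page * page_size,
                            "size": page_size,
                        }
                    },
                },
            }
        },
    }
```

Whether this is acceptable depends on how many distinct hashes exist, since every candidate bucket has to be collected before the page is cut.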