Is it possible to get an exact term count on the number for location matches within one document. Eg
Say I have location (geo_point) indexed documents such as
Document. 1 has location:[latlon_1, latlon_2]
Document 2 has location :[latlon_3, latlon_4]
...
I want to do a geo distance aggregation filter on a particular radius and return counts of documents that match for example atleast 2 locations per document.
This means we have to somehow get how many matches per document not just if any matched.
Those locations exist in an array of nested documents. Internally Nested docs are indexed separately from what i gather. i don't know it this helps.
The query we are trying to achieve is find out documents where location a and location b happened exactly one time.
It is too costly to split those documents. Also having all location coalesced per document gives us the ability to do location intersection kind of queries.
Could this be achieved using script or custom query. I am willing to investigate/implement if this is possible and kindly point me to the right direction.
It is costly because these documents are in the billions and multiple terabytes and breaking it up will explode it even more plus we lose functionality that way.. It is simply not viable as a pipeline to hold multiple representation of it. It would be great if there are something custom I could do.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.