Hello, I have a requirement we need to implement and I really don't know
how to do it, or even whether it is possible (maybe the solution is easy and I
just don't know it).
We have Person documents like this:
{
"id": 1,
"name": "Ernesto",
"NID": "AAA"
}
{
"id": 2,
"name": "Enrique",
"NID": "AAA"
}
{
"id": 3,
"name": "Antonio",
"NID": "BBB"
}
{
"id": 4,
"name": "Karlos",
"NID": "CCC"
}
Imagine we search for Persons with ["NID": "AAA"], we would get 2 documents
(id 1 and id 2).
If we search for Persons with ["NID": "BBB"], we would get 1 document (id
3).
And if we search for Persons with ["NID": "CCC"], we would get 1 document
(id 4).
OK, so what we need is to get all the documents whose NID is unique, i.e.
documents for which the _count would be exactly 1 if we searched for their
NID. The result of that query/filter would be documents 3 and 4, since no
other document in the index has the same NID.
What you are probably looking for is field collapsing, which is not yet
supported in Elasticsearch (it is planned). As a workaround, you can use a
terms facet to retrieve the document count for every term and then run a
separate query for each term whose count is 1. Besides the slowness of
having to issue multiple queries, you would also face the issue of facets
not returning all the values, especially on fields with many distinct
values. I am not sure whether the new aggregations framework will help with
this last part (I really need to try it out).
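To make the two-step idea concrete, here is a minimal sketch in plain Python. It does not talk to Elasticsearch at all: the in-memory `persons` list stands in for the index, and the helper names `term_counts` and `docs_with_unique_value` are made up for this illustration. Step 1 mimics what the facet would return (a count per NID), and step 2 keeps only the documents whose NID has a count of 1.

```python
from collections import Counter

# Stand-in for the index: the four Person documents from the question.
persons = [
    {"id": 1, "name": "Ernesto", "NID": "AAA"},
    {"id": 2, "name": "Enrique", "NID": "AAA"},
    {"id": 3, "name": "Antonio", "NID": "BBB"},
    {"id": 4, "name": "Karlos", "NID": "CCC"},
]


def term_counts(docs, field):
    """Step 1: count documents per field value (what the facet would report)."""
    return Counter(doc[field] for doc in docs)


def docs_with_unique_value(docs, field):
    """Step 2: keep only documents whose field value occurs exactly once."""
    counts = term_counts(docs, field)
    unique_terms = {term for term, n in counts.items() if n == 1}
    return [doc for doc in docs if doc[field] in unique_terms]


unique_persons = docs_with_unique_value(persons, "NID")
print([p["id"] for p in unique_persons])  # [3, 4]
```

Against a real index, step 1 would be the facet request and step 2 would be one query per term with a count of 1, so the caveat above still applies: if the facet does not return every term, some unique NIDs will be missed.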