Hi Guys,
I'm a new ElasticSearch user, and need some help selecting only documents
with a distinct field. I've got an index of documents which look something
like the following:
{
"last_name": "Degges",
"first_name": "Randall",
"state": "CA"
}
I'm trying to return a list of documents de-duplicated by last name, so that
each last name appears only once. For instance -- if I had three documents, two with a
"last_name" field of "Degges" and one with a "last_name" field of "Perez",
I'd want to return only two documents: one document where "Degges" is the
last name (I don't care which), and one where "Perez" is the last name.
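To make the expected result concrete, here is a rough Python sketch of the client-side de-duplication I'd like to avoid. It only illustrates the output I'm after and is not an ElasticSearch query; the function name and the two extra sample documents are made up for the example.

```python
def dedupe_by_last_name(docs):
    """Keep the first document seen for each distinct last_name."""
    first_seen = {}
    for doc in docs:
        # setdefault only stores the document if this last name is new
        first_seen.setdefault(doc["last_name"], doc)
    return list(first_seen.values())


# Hypothetical sample data: two "Degges" documents and one "Perez" document
docs = [
    {"last_name": "Degges", "first_name": "Randall", "state": "CA"},
    {"last_name": "Degges", "first_name": "Alex", "state": "CA"},
    {"last_name": "Perez", "first_name": "Maria", "state": "CA"},
]

result = dedupe_by_last_name(docs)
print([doc["last_name"] for doc in result])
```

What I'm looking for is a way to have ElasticSearch do the equivalent on its side, so that only one document per last name comes back over the wire.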
I realize that this question has been asked before on this mailing list,
but even after reading through the documentation on facets, asking on the
IRC channel, and doing lots of trial-and-error testing, I can't figure out
how to make it work.
I'm hoping some of you can give me specific example queries I can use to do
the de-duplication.
The reason I'm attempting to do this is that my application needs to return
a list of unique last names for people in a given state (e.g. CA). Since I
have so many people in my index, it would be incredibly slow for me to
select all of the people at once (with duplicates), and de-duplicate things
on my end.
Any help would be greatly appreciated.
Thank you.