After reading the docs and googling for a while I got the impression it
is not possible to ask Elasticsearch for the distinct values of a document
field. I know the terms facet, but it won't help me to return 100,000
values. Right?
So is there another way to ask this kind of question? At the moment I
see only two possibilities: a) SQL and an RDBMS or b) fetching all
relevant documents ordered by the desired field and kicking out
duplicates by hand.
I think for 100k values it is still OK to use a terms facet. This is going
to generate some network traffic, but the issue would be the same with an
RDBMS anyway. However, I wouldn't recommend sorting and kicking out
duplicates: that would require either deep paging or fetching all results
in a single large page, and both of those perform badly.
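
To make the terms facet suggestion concrete, here is a rough sketch of what such a request and the handling of its response could look like. It only builds the JSON body and parses a response, so nothing is sent over the network. The field name "author", the facet name "distinct" and the sample response are made up for illustration; the body follows the terms facet format as I understand it, so check it against the docs for your version.

```python
import json


def build_distinct_request(field, max_values):
    """Build a search body that asks only for a terms facet on `field`."""
    return {
        "size": 0,  # we do not need any hits, only the facet
        "query": {"match_all": {}},
        "facets": {
            # "distinct" is an arbitrary facet name chosen for this sketch
            "distinct": {"terms": {"field": field, "size": max_values}}
        },
    }


def extract_distinct(response, facet_name="distinct"):
    """Pull the distinct terms out of a terms facet response."""
    return [entry["term"] for entry in response["facets"][facet_name]["terms"]]


# Hypothetical field name; 100000 matches the number of values in the question.
body = build_distinct_request("author", 100000)
print(json.dumps(body, indent=2))

# Hand-written sample in the shape of a terms facet response (not real data).
sample_response = {
    "facets": {
        "distinct": {
            "_type": "terms",
            "terms": [
                {"term": "alice", "count": 12},
                {"term": "bob", "count": 7},
            ],
        }
    }
}
print(extract_distinct(sample_response))
```

The body would be sent to the index's _search endpoint. Note that on an analyzed field the facet returns the individual tokens, not the original values, so the field should not be analyzed if you want whole values back.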