Get all distinct values of a field (more than 10k values)

humartinez · August 13, 2019, 9:01pm

Hi there,
Perhaps this question has been asked many times, but I haven't seen an answer that fits my case. I'm looking for the best way to get all the distinct values of a field across a group of indices. I managed to write a script that does this, but it doesn't feel right to me, so I'm asking the experts for help.

partitions=25
rm -f run.*
for (( i=0; i<partitions; i++ ))
do
  curl -s -u user:password 'http://10.x.x.x:9200/index-*/_search?pretty' -H 'Content-Type: application/json' -d"
  {
     \"size\": 0,
     \"aggs\": {
        \"expired_sessions\": {
           \"terms\": {
              \"field\": \"data.device.deviceid\",
              \"include\": {
                 \"partition\": $i,
                 \"num_partitions\": $partitions
              },
              \"size\": 10000
           }
        }
     }
  }
  " > run.$i
done
cat run.* | jq '.aggregations.expired_sessions.buckets[].key'

In fact, the number of devices differs when I run a cardinality query:

GET index-*/_search
{
  "size": 0,
  "aggs": {
    "count": {
      "cardinality": {
        "field": "data.device.deviceid"
      }
    }
  }
}

The cardinality query returns 62102, versus 61990 from the terms aggregation.

Mark_Harwood · August 13, 2019, 9:17pm

A couple of points:

The cardinality aggregation is, by design, approximate (see the docs).
If you don’t need to sort the terms by a child aggregation, the ‘composite’ aggregation is probably simpler than using the ‘terms’ aggregation with partitioning.
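
To illustrate the first point: the cardinality aggregation accepts a precision_threshold setting that trades memory for accuracy, with a documented maximum of 40000. The sketch below only builds the request body. The helper name cardinality_body is made up for illustration, and the host and credentials in the commented call are placeholders. With roughly 62k device IDs the count is above the maximum threshold, so the result should still be treated as an estimate.

```bash
# Build a cardinality request body; $1 = field, $2 = precision_threshold.
# cardinality_body is a hypothetical helper name, not part of any tool.
cardinality_body() {
  jq -cn --arg field "$1" --argjson threshold "$2" \
    '{size: 0, aggs: {count: {cardinality: {field: $field, precision_threshold: $threshold}}}}'
}

# Example call (host and credentials are placeholders):
# curl -s -u user:password 'http://localhost:9200/index-*/_search' \
#   -H 'Content-Type: application/json' \
#   -d "$(cardinality_body data.device.deviceid 40000)"
```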

humartinez · August 16, 2019, 2:55pm

Could you please give me an example of a composite agg that does this? I can't figure it out.

Mark_Harwood · August 16, 2019, 3:04pm

The after param is what allows you to page the composite agg.
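
A minimal sketch of that paging loop, in the same curl and jq style as the script above. It assumes jq is installed. ES_URL, ES_AUTH and PAGE_SIZE, the function names and the aggregation name "distinct" are all placeholders chosen for illustration. Each response carries an after_key, which is sent back as "after" in the next request, and the loop stops when a response has no after_key (which, as far as I know, happens once a page comes back with no buckets).

```bash
#!/usr/bin/env bash
# Sketch: page through a composite aggregation to list every distinct value.
# ES_URL, ES_AUTH and PAGE_SIZE are placeholders, not values from this thread.
ES_URL="${ES_URL:-http://localhost:9200}"
ES_AUTH="${ES_AUTH:-user:password}"
PAGE_SIZE="${PAGE_SIZE:-1000}"

# Send one search request; $1 = index pattern, $2 = JSON request body.
es_search() {
  curl -s -u "$ES_AUTH" "$ES_URL/$1/_search" \
    -H 'Content-Type: application/json' -d "$2"
}

# Print each distinct value of field $2 in index pattern $1, one per line.
distinct_values() {
  local index="$1" field="$2" after="null" body resp
  while :; do
    # Build the request; "after" is added only once a previous page gave a key.
    body=$(jq -cn --arg field "$field" --argjson size "$PAGE_SIZE" \
      --argjson after "$after" '
      {size: 0, aggs: {distinct: {composite: (
        {size: $size, sources: [{value: {terms: {field: $field}}}]}
        + (if $after == null then {} else {after: $after} end)
      )}}}')
    resp=$(es_search "$index" "$body") || return 1
    # Emit the keys of this page (nothing if the page has no buckets).
    jq -r '(.aggregations.distinct.buckets // [])[] | .key.value' <<<"$resp"
    # Carry the paging key forward; stop when the response has none.
    after=$(jq -c '.aggregations.distinct.after_key // null' <<<"$resp")
    if [[ "$after" == "null" ]]; then
      break
    fi
  done
}

# Usage: distinct_values 'index-*' data.device.deviceid > device_ids.txt
```

Counting the lines of the output (for example with wc -l) gives an exact distinct count to compare against the cardinality estimate.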

