Hi there,
Perhaps this question has been asked many times, but I haven't seen an answer that fits my case. I'm looking for the best way to get all the distinct values of a field across a group of indices. I managed to write a script that does this, but it doesn't feel right to me, so I'm asking for help from the experts. Here is the script:
partitions=25
rm -f run.*
for (( i=0;i<$partitions;i++))
do
curl -s -u user:password 'http://10.x.x.x:9200/index-*/_search?pretty' -H 'Content-Type: application/json' -d"
{
\"size\": 0,
\"aggs\": {
\"expired_sessions\": {
\"terms\": {
\"field\": \"data.device.deviceid\",
\"include\": {
\"partition\": $i,
\"num_partitions\": $partitions
},
\"size\": 10000
}
}
}
}
" > run.$i
done
cat run.* | jq '.aggregations.expired_sessions.buckets[].key'
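To compare like with like, the keys collected in the per-partition files can be deduplicated and counted. This is only a minimal sketch: it assumes the run.* files written by the loop above are in the current directory and that jq is installed, and the function name count_distinct_keys is purely illustrative.

```shell
# Count the distinct bucket keys across the per-partition result files.
# Assumes each file holds one _search response containing the
# "expired_sessions" terms aggregation from the script above.
count_distinct_keys() {
  cat "$@" \
    | jq -r '.aggregations.expired_sessions.buckets[].key' \
    | sort -u \
    | wc -l \
    | tr -d ' '
}

# Usage (hypothetical): count_distinct_keys run.*
```

The sort -u step is just a safeguard; with partitioned terms each key should fall into exactly one partition, so duplicates are not expected.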
In fact, the number of devices differs when I run a cardinality query:
GET index-*/_search
{
"size": 0,
"aggs": {
"count": {
"cardinality": {
"field": "data.device.deviceid"
}
}
}
}
The cardinality query returns 62102, whereas the terms aggregation returns 61990.
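As far as I understand, the cardinality aggregation is approximate (it is based on HyperLogLog++), so part of the gap of 112 (about 0.2%) may come from there. A sketch of the same query with precision_threshold raised to 40000, which I believe is the maximum supported value; since the count here is above that threshold, the result should probably still be treated as an estimate:

```
GET index-*/_search
{
  "size": 0,
  "aggs": {
    "count": {
      "cardinality": {
        "field": "data.device.deviceid",
        "precision_threshold": 40000
      }
    }
  }
}
```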