Python3 + elasticsearch_dsl: good resources/examples for a simple query problem

I'm fighting to get over the hump of gaining competence with elasticsearch_dsl for Python.

I'm trying to do my end-of-year (2020) reports, and I have a set of indexes that total about three quarters of a billion records. I need all the unique values and the count of each of those values.

Using the dev interface it's a simple query, but you can't get all the results easily.

    POST /lookout-hp*/_search?size=0
    {
      "size": 0,
      "aggs": {
        "langs": {
          "terms": { "field": "password.keyword", "size": 50000 }
        }
      }
    }

Results:

    {
      "took" : 4221,
      "timed_out" : false,
      "_shards" : {
        "total" : 115,
        "successful" : 115,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 10000,
          "relation" : "gte"
        },
        "max_score" : null,
        "hits" : [ ]
      },
      "aggregations" : {
        "langs" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 376068,
          "buckets" : [
            {
              "key" : "password",
              "doc_count" : 349639
            },
            {
              "key" : "admin",
              "doc_count" : 254823
            },
            {
              "key" : "123456",
              "doc_count" : 200632
            },
            {
              "key" : "",
              "doc_count" : 186228
            },
            {
              "key" : "1234",
              "doc_count" : 110466
            },
            {
              "key" : "root",
              "doc_count" : 92418
            },
            ...

How do I do this in elasticsearch_dsl and Python to get all the results? I can't find any good examples.
Are there any other good resources, online or in a book?

Thank you

I got the responses I want with this Python code:

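    from elasticsearch import Elasticsearch
    from elasticsearch_dsl import Search

    client = Elasticsearch("http://localhost:9200")  # assumed address; use your own cluster

    s = Search(using=client, index="lookout-hp-*")
    # Terms aggregation; with no size given it returns only the top 10 buckets
    s.aggs.bucket('per_tag', 'terms', field='password.keyword')
    response = s.execute()
    s_Data = s.to_dict()  # the request body that was sent
    print(s_Data)
    r_Data = response.aggregations.per_tag.to_dict()  # the aggregation result
    print(r_Data)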

I have millions of responses, and I know I need to page or scan through them to get them all, but I can't find an example of that. Any suggestions would be greatly appreciated.

Got it. I just didn't know the right term: I needed a "composite aggregation". The link below works perfectly.
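For anyone landing here later, this is a rough sketch of how paging through a composite aggregation can look in Python. It is not the code from the link, just an illustration under a few assumptions: the helper names `composite_body` and `iter_value_counts`, the aggregation name `values`, and the page size are all made up for the example, and `run_search` is any callable that takes a request body and returns the response as a dict.

```python
from typing import Callable, Iterator, Optional, Tuple


def composite_body(field: str, page_size: int = 1000,
                   after: Optional[dict] = None) -> dict:
    """Build a size-0 search body with one composite aggregation over `field`."""
    composite = {
        "size": page_size,
        "sources": [{"value": {"terms": {"field": field}}}],
    }
    if after is not None:
        # Resume right after the last key returned by the previous page.
        composite["after"] = after
    return {"size": 0, "aggs": {"values": {"composite": composite}}}


def iter_value_counts(run_search: Callable[[dict], dict], field: str,
                      page_size: int = 1000) -> Iterator[Tuple[object, int]]:
    """Yield (value, doc_count) for every distinct value, one page at a time."""
    after = None
    while True:
        response = run_search(composite_body(field, page_size, after))
        agg = response["aggregations"]["values"]
        for bucket in agg["buckets"]:
            yield bucket["key"]["value"], bucket["doc_count"]
        after = agg.get("after_key")
        # An empty page or a missing after_key means there is nothing left.
        if after is None or not agg["buckets"]:
            break
```

With the official client, `run_search` could be something like `lambda body: client.search(index="lookout-hp-*", body=body)`. With elasticsearch_dsl, `lambda body: Search.from_dict(body).using(client).index("lookout-hp-*").execute().to_dict()` should do the same job. Note that composite buckets come back ordered by key, not by count, so sorting by `doc_count` has to happen on the client side once everything is collected.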

Thanks for sharing your solution, and good luck!
