How to filter the buckets that have more than N documents using Elasticsearch DSL in Python?

Ashar_Ahmad · March 10, 2023, 4:41am

I have an index in Elasticsearch in which each document holds information about a user together with one of the Facebook posts they have made (stored in a denormalized manner).

Each document contains: User_ID | User_Name | Post_Text | Post_Emojis

I want to retrieve the IDs of the users who have more than N posts.

I am new to Elasticsearch, and especially to the Python Search DSL (see "Search DSL" in the Elasticsearch DSL 7.2.0 documentation).

I am creating buckets using the terms aggregation on the User_ID field, and want to filter the buckets based on the number of documents that fall inside each bucket.
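For reference, the request described here can be sketched as a raw search body: a `terms` aggregation on the user ID with a nested `bucket_selector` pipeline that keeps only the buckets whose `_count` exceeds N. The helper name `build_posts_count_body`, the `max_buckets` parameter, and the default field name are illustrative assumptions, not part of any library. Depending on the index mapping, the field may need to be a keyword field (for example `User_ID.keyword`).

```python
def build_posts_count_body(num_posts, field="User_ID", max_buckets=1000):
    """Build a search body that keeps only users with more than num_posts documents."""
    return {
        # No hits are needed; only the aggregation buckets matter.
        "size": 0,
        "aggs": {
            "posts_count": {
                # One bucket per distinct user ID. The terms aggregation
                # returns only 10 buckets unless "size" is raised.
                "terms": {"field": field, "size": max_buckets},
                "aggs": {
                    "having_posts": {
                        # Drop every bucket whose document count is not above N.
                        "bucket_selector": {
                            "buckets_path": {"postsCount": "_count"},
                            "script": f"params.postsCount > {int(num_posts)}",
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    import json

    print(json.dumps(build_posts_count_body(5), indent=2))
```

The selector only sees the buckets that the `terms` aggregation returns, so `max_buckets` should be at least as large as the number of users expected to qualify.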

This is the function I have written so far. I am unsure of the proper syntax and still find the documentation confusing, so I cannot get it to execute and return the correct response.

def users_more_posts_than_query(search_object: Search, num_posts: int):
    search_object = search_object.aggs.bucket('posts_count', 'terms', field='user_id')\
        .pipeline("having_posts", "bucket_selector", buckets_path={"postsCount": "_count"}, script=f"params.postsCount > {num_posts}")

    response = search_object.execute()

    for hit in response.hits:
            hit.user_id

Please point out what I am doing wrong here, and how I can achieve my desired goal.
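A few things in the function above look likely to cause trouble; what follows is a sketch under stated assumptions, not a verified fix. In `elasticsearch_dsl`, `search.aggs.bucket(...)` modifies the `Search` in place and returns the aggregation object, so assigning the result back to `search_object` replaces the `Search` with an aggregation that has no `execute()`. Aggregation results are also read from `response.aggregations`, not `response.hits`, and the function never returns anything. Finally, the document description uses `User_ID` while the code uses `user_id`; field names are case-sensitive, so the sketch takes the field as a parameter. The function name and the `field` and `max_users` parameters are hypothetical.

```python
from typing import TYPE_CHECKING, List

if TYPE_CHECKING:
    # Only needed for the type hint; the function works with any
    # object that behaves like an elasticsearch_dsl Search.
    from elasticsearch_dsl import Search


def users_with_more_posts_than(
    search_object: "Search",
    num_posts: int,
    field: str = "user_id",
    max_users: int = 1000,
) -> List[str]:
    # extra() returns a modified copy; size=0 skips the hits entirely.
    search = search_object.extra(size=0)

    # aggs.bucket() changes `search` in place, so its return value
    # (the terms aggregation) must NOT be assigned back to `search`.
    search.aggs.bucket(
        "posts_count", "terms", field=field, size=max_users
    ).pipeline(
        "having_posts",
        "bucket_selector",
        buckets_path={"postsCount": "_count"},
        script=f"params.postsCount > {int(num_posts)}",
    )

    response = search.execute()

    # Each surviving bucket's key is a user ID; doc_count is the post count.
    return [bucket.key for bucket in response.aggregations.posts_count.buckets]
```

Assuming a configured connection and an index name of your own, it could be called as `users_with_more_posts_than(Search(index="your-index"), 5)`.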

system · April 7, 2023, 4:42am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
