Can ES do a complex aggregation with WHERE and GROUP BY + ORDER BY like in MySQL

Hi, I have a query like this in MySQL that I am thinking to migrate to ElasticSearch for scalability:

SELELCT userID,sum(countA) as sumA,sum(countB) as sumB
    FROM data_table 
 WHERE groupID=? AND date BETWEEN ? AND ? 
 GROUP BY userID 
 ORDER BY sumA 
LIMIT ?,?

For a table of id, groupID, date, userID, countA, countB with an index of <groupID, date>.

Is it possible to do it in ES? The closest featuer I can find is composite aggregation: Composite aggregation | Elasticsearch Reference [7.11] | Elastic

It seems supporting aggregation(group by), pagination(limit).
I guess sum(countA) can be done by sub-aggregation?

But it doesn't seem to support filtering by date/groupID.
Any suggestion?

Yes.
Look at query dsl
Specifically, match for your groupID=?
range for your date between ? AND ?
sort for order by.
Cheers!

but it doesn't seem to support GROUP BY, did I miss something?
(Edited the subject of the post to be more clear)

you can.
Look at the following

And Terms Aggregation

HAVING is different from WHERE.

WHERE will do before grouping but HAVING is after.

Is it possible to apply the filtering before the aggregation?

@Long_Quanzheng
All your requirements can be fulfilled with ELK.
I suggest that you will download, install and configure an instance and have a go.

Thanks!
I tested this SQL seems working:

curl -X GET "localhost:9200/_sql?format=txt&pretty=true" -H 'Content-Type: application/json' -d'
{
  "query" :"SELECT email,sum(countA) FROM \"library\" GROUP BY email LIMIT 2",
  "filter" :{
    "range" :{
       "number":{
         "gte":1,
         "lte":100,
       }
    }
  }
}
'

BTW, do you know how can I get the DSL version of this query?
I am very curious how the aggregation DSL works for this because the documentation doesn't have those feature working together.

Why not:

curl -X GET "localhost:9200/_sql?format=txt&pretty=true" -H 'Content-Type: application/json' -d'
{
  "query" :"SELECT email,sum(countA) FROM \"library\" WHERE \"number\" between 1 and 100 GROUP BY email LIMIT 2"
}
'

and you can use the sql translate api to see the query dsl.

1 Like

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.