I am trying to visualize some data and I am using elasticsearch sql to build some visualizations. I am importing data about 346M with 2000000 docs.
This is my slow sql:
select userId, count(*) from tableA where metricTime>0 group by userId order by cnt limit 10
but it cost 1min to get results.
select userId, count(*) from tableA where metricTime>0 group by userId
this query is fast, only 140ms.
This is there response pic:
The dataset has 597443 distinct userId(actually 600000 because of the HyperLogLog++ algorithm used in elasticsearch ) and 2000000 lines.
So, my question is why the elasticsearch sorts on aggregate to get Top-N results slow? Or it is naturally? Or how can I improve it ?