Query on aggregation and scroll

sanjeebkdeka · December 11, 2018, 5:25pm

Hi,

I have a requirement to retrieve all records for each unique value of a particular column. All the records sharing the same value must appear together in one cluster. The index could contain billions of records.

Should I be using scroll with aggregation? I read somewhere that aggregation is not the best solution for this.

The other approach could be to scroll over those records and sort on that particular column. For this approach, I wanted to know whether the sort is applied only to the 10000 records being returned, or whether all the matching records are sorted first and then 10000 records are returned.

Please suggest.

xavierfacq · December 13, 2018, 9:46am

Hi,

Can you explain the requirement in more detail?

Do you need to scroll in order to extract the data?

bye
Xavier

sanjeebkdeka · December 13, 2018, 10:17am

Yes, I need to scroll to extract those data.

Thanks

xavierfacq · December 13, 2018, 10:30am

You can run a filtered query on the particular column and then scroll through the results, but it can be slow and heavy, depending on the number of "selected documents".
Note that sorting is applied across all matching records, not just the page being returned.

Extra question: do you need to aggregate the selected documents or not?
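
The sort-then-scroll approach discussed above might look roughly like the sketch below. It is only an illustration: `client` is assumed to expose `search` and `scroll` methods shaped like those of the official Python client, the helper names `scroll_hits` and `clusters_by_field` are made up for this example, and the field is assumed to be sortable (for instance a keyword-type field). Because the sort covers all matching records, documents with the same value arrive next to each other, even across scroll pages, so they can be grouped as they stream in.

```python
from itertools import groupby


def scroll_hits(client, index, body, keep_alive="1m"):
    """Yield every hit of a search, page by page, using the scroll API."""
    page = client.search(index=index, body=body, scroll=keep_alive)
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            yield hit
        # Ask for the next page; an empty page means the scroll is exhausted.
        page = client.scroll(scroll_id=page["_scroll_id"], scroll=keep_alive)


def clusters_by_field(client, index, field, page_size=1000):
    """Yield (value, documents) pairs, one per distinct value of `field`."""
    body = {
        "size": page_size,             # hits per scroll page (assumed value)
        "query": {"match_all": {}},    # replace with the real filter
        "sort": [{field: "asc"}],      # sorted over all matching records
    }
    hits = scroll_hits(client, index, body)
    # groupby only merges consecutive items, which is enough here because
    # the hits arrive already sorted on `field`.
    for value, group in groupby(hits, key=lambda h: h["_source"][field]):
        yield value, [h["_source"] for h in group]
```

Only one cluster is held in memory at a time, which matters if the index really holds billions of records, although a single very common value would still produce one very large list.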

sanjeebkdeka · December 13, 2018, 11:16am

Yes. I need aggregation over records matching my query.

xavierfacq · December 13, 2018, 1:38pm

Recently I "solved" a search dilemma with the following trick. We have complex searches with various parameters, and they may or may not return a lot of records. My trick is to first run the query with size = 0 to get the totalHits, and then either run a query with complex aggregations or scroll and do the aggregations in our code.
The limit is fixed at 30000 records. Above this limit I let ES do the aggregations; under this limit it's much quicker to do it ourselves. Running queries with size=0 is super fast (< 100ms), but running the same query with complex aggregations can take over 6-7 s!

Note that we are running ES 2.4.
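
A minimal sketch of the count-first trick described above, not the actual code from the post. `client` is assumed to expose a `search` method shaped like the official Python client's; `count_matches`, `choose_strategy`, and `aggregate_in_code` are hypothetical names, and counting documents per field value stands in for whatever the "complex aggregations" really are. The 30000 threshold is the one quoted in the post; what happens at exactly 30000 is an assumption here.

```python
from collections import Counter

CLIENT_SIDE_LIMIT = 30000  # threshold quoted in the post


def count_matches(client, index, query):
    """Run the query with size=0 so only the total hit count comes back."""
    resp = client.search(index=index, body={"size": 0, "query": query})
    total = resp["hits"]["total"]
    # Older versions (such as 2.4) return a plain integer; ES 7+ returns
    # an object like {"value": n, "relation": "eq"}.
    return total["value"] if isinstance(total, dict) else total


def choose_strategy(total_hits, limit=CLIENT_SIDE_LIMIT):
    """Above the limit let ES aggregate; otherwise aggregate in our code."""
    return "es_aggregations" if total_hits > limit else "aggregate_in_code"


def aggregate_in_code(hits, field):
    """Stand-in client-side aggregation: count documents per field value."""
    return dict(Counter(hit["_source"][field] for hit in hits))
```

On ES 7 and later the reported total is capped at 10000 by default unless `track_total_hits` is enabled, so the comparison against 30000 would need that setting there.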

system · January 10, 2019, 1:49pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
