ES|QL fails with 'Data too large' message for a small dataset

GlebCA · March 19, 2025, 1:27pm

I am running the following ES|QL query against 192M documents:

from filebeat-audit-* 
| where user.name is not null AND source.ip is not null and host.name is not null 
| where event.outcome == "success" and cidr_match(source.ip,"2.56.0.0/16") 
| stats cnt = count_distinct(source.ip) by user.name 
| keep cnt, user.name 
| where cnt > 1
| limit 10000

and it returns an error:
[esql] > Unexpected error from Elasticsearch: circuit_breaking_exception - [request] Data too large, data for [<reused_arrays>] would be [21654658976/20.1gb], which is larger than the limit of [18038862643/16.7gb]

But if I remove stats, it returns only 305 documents (those where source.ip matches 2.56.0.0/16). Why can't Elastic run stats on that?

from filebeat-audit-* 
| where user.name is not null AND source.ip is not null and host.name is not null 
| where event.outcome == "success" and cidr_match(source.ip,"2.56.0.0/16") 
| limit 10000

GlebCA · March 19, 2025, 2:49pm

Found a workaround: add an intermediate stats step.

from filebeat-audit-* 
| where user.name is not null and source.ip is not null and host.name is not null and event.outcome=="success" and cidr_match(source.ip,"2.56.0.0/16") 
| STATS SUM(1) BY user.name,source.ip 
| STATS cnt = count_distinct (source.ip) by user.name
| keep cnt, user.name
| WHERE cnt > 1 
| limit 10000
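
A minimal Python sketch of why the intermediate stats step should not change the result. It assumes exact counting (plain sets and dicts, not whatever ES|QL does internally) and uses made-up sample rows, so it only illustrates the logic and says nothing about memory use. Grouping by (user.name, source.ip) first leaves one row per unique pair, so counting those rows per user gives the same number as counting distinct IPs directly.

```python
from collections import defaultdict

# Hypothetical sample rows (user.name, source.ip); not taken from the real index.
rows = [
    ("alice", "2.56.1.10"),
    ("alice", "2.56.1.10"),
    ("alice", "2.56.7.3"),
    ("bob", "2.56.9.9"),
    ("bob", "2.56.9.9"),
]


def direct_distinct(rows):
    """One step: count distinct source.ip values per user.name."""
    ips_by_user = defaultdict(set)
    for user, ip in rows:
        ips_by_user[user].add(ip)
    return {user: len(ips) for user, ips in ips_by_user.items()}


def two_stage(rows):
    """Two steps: group by (user.name, source.ip), then count pairs per user."""
    pairs = defaultdict(int)
    for user, ip in rows:
        pairs[(user, ip)] += 1      # like STATS SUM(1) BY user.name, source.ip
    counts = defaultdict(int)
    for user, _ip in pairs:
        counts[user] += 1           # every remaining row is a unique pair
    return dict(counts)


def more_than_one_ip(counts):
    """Keep only users seen from more than one IP, like WHERE cnt > 1."""
    return {user: cnt for user, cnt in counts.items() if cnt > 1}
```

Both functions return the same per-user counts for the sample rows, and `more_than_one_ip` keeps only the user seen from more than one address.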

RainTown · March 19, 2025, 10:03pm

It's good that you found a workaround, but the original question is still interesting.

Would you/others consider it a bug? In your case the original ES|QL query failed, but with fewer than 192M documents it might have "worked" while still temporarily (and arguably unnecessarily) consuming a lot of resources.

stephenb · March 20, 2025, 3:56am

What version?

GlebCA · March 20, 2025, 2:29pm

Elastic 8.15.5

stephenb · March 20, 2025, 5:24pm

Could you try 8.17.3?

There may have been some memory issues with ES|QL in that version, but I cannot readily find them.
