Rally: 429 error while dumping data from an existing cluster

When I try to create a track from data in an existing cluster, I get the following error:

Extracting documents for index [compress_ratio_ac...     1650000/50000001 docs [3.3% done][ERROR] Cannot create-track. TransportError(429, 'circuit_breaking_exception', '[parent] Data too large, data for [<http_request>] would be [1702314100/1.5gb], which is larger than the limit of [1690094796/1.5gb], real usage: [1702313936/1.5gb], new bytes reserved: [164/164b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=154112090/146.9mb, accounting=0/0b]').

How can I solve this problem if I don't set the ES config indices.fielddata.cache.size? Can Rally do something to avoid this situation?

Hi,

Extracting all data from a cluster puts a heavier burden on it than usual operation. Looking at the circuit breaker exception, I guess that you have limited the heap size to ~2GB. I suggest you temporarily allocate more heap memory to Elasticsearch while extracting data.
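To make the numbers in the exception easier to read, here is a minimal sketch that parses the byte counts from the message and estimates the heap size behind the limit. It assumes the parent breaker is left at its default of 95% of the heap (the default when real memory tracking is enabled); the constant and function names are made up for illustration, and the estimate is only as good as that assumption.

```python
import re

# Relevant part of the error text from the create-track failure above.
MESSAGE = (
    "[parent] Data too large, data for [<http_request>] would be [1702314100/1.5gb], "
    "which is larger than the limit of [1690094796/1.5gb], real usage: [1702313936/1.5gb], "
    "new bytes reserved: [164/164b]"
)

# Assumption: indices.breaker.total.limit is at its default of 95% of the heap.
ASSUMED_PARENT_LIMIT_FRACTION = 0.95


def parse_breaker_message(message):
    """Extract the byte counts from a parent circuit breaker message."""
    patterns = {
        "would_be": r"would be \[(\d+)/",
        "limit": r"limit of \[(\d+)/",
        "real_usage": r"real usage: \[(\d+)/",
        "new_bytes": r"new bytes reserved: \[(\d+)/",
    }
    return {name: int(re.search(p, message).group(1)) for name, p in patterns.items()}


values = parse_breaker_message(MESSAGE)

# How far the request would have pushed memory usage past the breaker limit.
overshoot = values["would_be"] - values["limit"]

# Heap size implied by the limit, under the 95% assumption above.
estimated_heap = values["limit"] / ASSUMED_PARENT_LIMIT_FRACTION

print(f"over the limit by {overshoot} bytes")
print(f"estimated heap: {estimated_heap / 1024**3:.2f} GiB")
```

Note that the request itself only reserved 164 bytes; the heap was already above the limit before it arrived, which fits the picture of a cluster that is short on heap in general rather than one oversized request.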

Daniel

Thanks Daniel,
maybe the cluster was also under write pressure while I was extracting data from it, which left the JVM short of memory. I'll try again later. But there is no way to solve this by configuring Rally, right? Perhaps only configuring ES can avoid it if I have limited JVM memory.

Dzp

Hi,

No, it's unfortunately not configurable. Internally, Rally uses the Python client's scan helper, which fetches documents in batches of size 1000.
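For illustration, this is roughly what a scan-based extraction looks like with the Python client. It is a standalone sketch, not Rally's actual code: `extract_documents`, `batch_size` and the `scan_fn` hook are hypothetical names, and the `match_all` query is an assumption. Only `elasticsearch.helpers.scan` and its `size` parameter come from the client library.

```python
import json


def extract_documents(client, index, out_path, batch_size=1000, scan_fn=None):
    """Write every document's _source from `index` to `out_path`, one JSON object per line.

    `scan_fn` is a hypothetical hook so the function can be tried without a cluster;
    by default the Python client's scan helper is used.
    """
    if scan_fn is None:
        # Requires the elasticsearch package to be installed.
        from elasticsearch.helpers import scan as scan_fn

    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        # `size` is the number of hits fetched per scroll request (per shard).
        hits = scan_fn(
            client,
            index=index,
            query={"query": {"match_all": {}}},
            size=batch_size,
        )
        for hit in hits:
            out.write(json.dumps(hit["_source"]) + "\n")
            count += 1
    return count
```

With a real cluster, this would be called with an `Elasticsearch` client instance, for example `extract_documents(client, "my-index", "documents.json")`, where the index and file names are placeholders. In a script of your own, a smaller `batch_size` reduces the size of each response, but it does not help if the heap is already close to the breaker limit for other reasons.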

Daniel

Thanks again, Daniel.

That's very helpful, glad to talk with you.