APM Python: request body too large

Kibana version: 7.17

Elasticsearch version: 7.17

APM Server version: 7.15

APM Agent language and version: Python, v6.3.3

Hi,

A few times a day the application throws an exception that can't be inspected in Kibana. The APM Server logs show an error complaining that the request body was too large:

Mai 11 04:25:03 ****** apm-server[9995]: {"log.level":"error","@timestamp":"2022-05-11T04:25:03.736Z","log.logger":"request","log.origin":{"file.name":"middleware/log_middleware.go","file.line":60},"message":"request body too large","service.name":"apm-server","url.original":"/intake/v2/events","http.request.method":"POST","user_agent.original":"elasticapm-python/6.3.3","source.address":"*******","http.request.body.bytes":5771,"http.request.id":"4e865219-1484-4f1f-9a44-9807e96102ef","event.duration":1180667,"http.response.status_code":400,"error.message":"event exceeded the permitted size.","ecs.version":"1.6.0"}

The stated size of 5771 bytes does not seem large at all (all of these errors report a value above 5000). I have already increased apm-server.max_event_size to 600000, which had no effect. Either I am changing the wrong property, or the event was much larger than the size stated in the logs.

Hope someone can help!
Stephan

@stk1

The logged value of http.request.body.bytes is reported per https://cs.opensource.google/go/go/+/refs/tags/go1.18.2:src/net/http/request.go;l=192-199, which means that we're merely writing the size of the HTTP request body as it is set in the request header (by the Python agent).

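Most of the Elastic APM agents use some sort of compression on the request body (the Python agent uses GZIP compression). Compressed bodies tend to be dramatically smaller than their uncompressed form, so the 5771 bytes in the log is most likely the compressed size rather than the size of the decoded events.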
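To give a feel for how much smaller a compressed body can be, here is a minimal sketch using only the Python standard library. The payload is made up (200 near-identical stack frames standing in for a large error event); it is not what the agent actually sends, and real ratios will vary.

```python
import gzip
import json

# Hypothetical payload: many similar stack frames, as a large error event might contain.
frames = [
    {
        "filename": "app/views.py",
        "function": "handle_request",
        "lineno": i,
        "context_line": "    result = process(request)",
    }
    for i in range(200)
]
raw = json.dumps({"error": {"exception": {"stacktrace": frames}}}).encode("utf-8")

# gzip-compress the serialized event, as a stand-in for what an agent does before sending.
compressed = gzip.compress(raw)

print(f"uncompressed: {len(raw)} bytes")
print(f"gzip:         {len(compressed)} bytes")
```

Repetitive JSON like stack traces compresses very well, which is why a few kilobytes on the wire can expand to a much larger event once decoded.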

Elastic APM agents also send events in batches, which means that the stated body size is for the whole batch, not a single event. A batch is composed of a metadata object plus the N events that are sent (spans, transactions, metrics, errors).
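If you can capture an uncompressed request body, a rough way to find the offending event is to measure each event on its own instead of the whole batch. The sketch below assumes the body is newline-delimited JSON with one event per line and the event type as the top-level key; `oversized_events` and the sample batch are hypothetical, and 600000 is simply the limit mentioned in this thread.

```python
import json

MAX_EVENT_SIZE = 600_000  # value from this thread, treated here as a per-event limit


def oversized_events(ndjson_body: bytes, limit: int = MAX_EVENT_SIZE):
    """Return (line_index, event_type, size_in_bytes) for each line above the limit."""
    hits = []
    for index, line in enumerate(ndjson_body.splitlines()):
        if len(line) > limit:
            # Assumption: each line is a JSON object whose first key names the event type.
            event_type = next(iter(json.loads(line)), "unknown")
            hits.append((index, event_type, len(line)))
    return hits


# Hypothetical batch: a metadata line, a small transaction, and one very large error.
batch = b"\n".join(
    json.dumps(event).encode("utf-8")
    for event in [
        {"metadata": {"service": {"name": "demo"}}},
        {"transaction": {"name": "GET /"}},
        {"error": {"culprit": "x" * 700_000}},
    ]
)

for index, event_type, size in oversized_events(batch):
    print(f"line {index}: {event_type} event is {size} bytes")
```

Only the error line is reported here, even though the batch as a whole is what shows up in the request log.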

Since we don't seem to be logging the event type that caused this, and the limit affects how many bytes are decoded from the incoming request into memory, it's really difficult to provide guidance on the maximum size that should be allowed. From what I can see in our code, the configured size is respected across the codebase. You can experiment with doubling that 600K to 1200K; the events that are triggering this are not small.

Another option you may consider is reducing the stack trace depth that your applications collect, since deep stack traces tend to make events quite large. By default it's set to 50 in the Python agent (see Configuration | APM Python Agent Reference [6.x] | Elastic). However, you may lose some debugging ability since the stack traces will be truncated.
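As a rough sketch, lowering the depth in a Django-style settings module could look like the following. The `STACK_TRACE_LIMIT` key is my reading of the agent's configuration reference, and the service name and the value 20 are placeholders, so please verify against the docs for your agent version.

```python
# Django-style settings for the Elastic APM Python agent.
# Assumption: STACK_TRACE_LIMIT is the setting that caps collected stack frames;
# check the configuration reference for your agent version.
ELASTIC_APM = {
    "SERVICE_NAME": "my-service",  # hypothetical service name
    "STACK_TRACE_LIMIT": 20,       # placeholder value, lower than the default of 50
}
```

Start with a moderate reduction and check whether the remaining frames are still enough to debug the errors you care about.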

Ok, thanks a lot for the answer, I will try and see how increasing the limit to 1200K works!