Hi, we are currently running AWS OpenSearch and are hitting some hard limits on http.max_content_length. This is a non-configurable 100mb limit. We are using Logstash to process events from Filebeat and hitting an endless loop of 413 errors. I understand from this thread that I can configure the _bulk size; however, this will impact the input and filter parts of the pipeline. I worry that this may cause processing delays, persistent queue (PQ) size growth, and eventually Filebeat backoff.
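For context, the batch-size knob referred to above is pipeline-wide in Logstash, which is why it affects the input and filter stages as well as the output. A minimal sketch of the relevant setting (the value shown is the Logstash default; the right number for any given deployment depends on average event size):

```yaml
# logstash.yml
# pipeline.batch.size is the number of events each worker collects before
# running filters and outputs. Lowering it shrinks each _bulk request body,
# but it applies to the whole pipeline, not just the Elasticsearch output.
pipeline.batch.size: 125
```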
If we were to move to Elastic Cloud on AWS, would we be able to configure http.max_content_length to avoid the 413 errors we are seeing? Would the Elasticsearch output be intelligent enough to break up the batches so we don't hit this endless 413 loop?
I know the actual solution here is to not send huge payloads to Elastic/OpenSearch but we cannot change this due to the business need at present.
OpenSearch/OpenDistro are AWS-run products and differ from the original Elasticsearch and Kibana products that Elastic builds and maintains. You may need to contact them directly for further assistance.
(This is an automated response from your friendly Elastic bot. Please report this post if you have any suggestions or concerns.)
Neither OpenSearch nor Elasticsearch is really designed to cope with individual docs that exceed 100MiB, so raising http.max_content_length would just lead to other problems, and hence isn't something you can do. Typically this only matters if you're storing enormous amounts of unsearchable binary data (image-heavy documents or videos), and in that case there's no real need to store it directly in the search engine. Instead, store it in a separate blob store and index only a link to the blob in your search engine.
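To illustrate the blob-store pattern, here is a minimal sketch. The bucket name, URL scheme, and field names are hypothetical examples; the point is that the document sent to the search engine carries only searchable metadata plus a pointer, so its size is tiny regardless of how large the underlying binary is:

```python
import json

def make_search_doc(doc_id: str, title: str, blob_url: str) -> dict:
    """Build the small, searchable document that gets indexed.

    The large binary itself is uploaded to a blob store (S3, GCS, ...)
    out of band; only a pointer to it is indexed here.
    """
    return {
        "doc_id": doc_id,
        "title": title,
        "blob_url": blob_url,  # link back to the full object in the blob store
    }

# Hypothetical example: a multi-hundred-MiB video lives in the blob store,
# while the search engine only sees this pointer document.
doc = make_search_doc(
    "vid-42",
    "Quarterly all-hands recording",
    "s3://example-bucket/videos/vid-42.mp4",
)

payload = json.dumps(doc).encode("utf-8")
print(len(payload))  # a few hundred bytes, far under any bulk/content-length limit
```

At query time you search the indexed metadata as usual, and your application follows `blob_url` to fetch the binary from the blob store when it's actually needed.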
Regarding whether Logstash can split large binary objects across multiple documents, you'll need to ask the Logstash folks about that. I suggest opening a separate topic in the Logstash forum.