ElasticSearch Java Client 8.1.2 - Entity content is too large for the configured buffer limit

Hi everyone - I'm trying to index a large amount of data into my Elasticsearch 8.1 Docker container. I've already changed the setting http.max_content_length in the Elasticsearch.yml file so I no longer encounter the 413 error (Bulk entity too large) and provided enough resources where I don't encounter the 429 error (Too many requests). However I'm unable to move forward when I encounter this error when I pass an ArrayList of POJOs into the ElasticsearchClient's bulk() function: entity content is too long [257579808] for the configured buffer limit [104857600]

The previous fix for this issue [in ES 7] required configuring the HttpAsyncResponseConsumerFactory with a new buffer size but the RestClient's performRequest function seems to have changed and that fix appears no longer available.

Is there a way to resolve this issue using the Java Elasticsearch Client 8.1?

Here's my code to create an instance of the ElasticsearchClient..

    private ElasticsearchClient getElasticSearchClient() throws Exception {
        Path caCertificatePath = Paths.get(env.getProperty("ca_crt"));
        CertificateFactory certFactory =
                CertificateFactory.getInstance("X.509");
        Certificate trustedCa;
        InputStream is = Files.newInputStream(caCertificatePath);
        trustedCa = certFactory.generateCertificate(is);
        KeyStore trustStore = KeyStore.getInstance("pkcs12");
        trustStore.load(null, null);
        trustStore.setCertificateEntry("ca", trustedCa);
        SSLContextBuilder sslContextBuilder = SSLContexts.custom()
                .loadTrustMaterial(trustStore, null);
        final SSLContext sslContext = sslContextBuilder.build();

        // Create low-level client
        CredentialsProvider credentialsProvider =
                new BasicCredentialsProvider();
        credentialsProvider.setCredentials(AuthScope.ANY,
                new UsernamePasswordCredentials(env.getProperty("es_user"), env.getProperty("es_password")));

        RestClientBuilder rcBuilder = RestClient.builder(
                        new HttpHost(env.getProperty("es_server"), Integer.parseInt(env.getProperty("es_port")), "https"))
                .setRequestConfigCallback(
                new RestClientBuilder.RequestConfigCallback() {
                    @Override
                    public RequestConfig.Builder customizeRequestConfig(
                            RequestConfig.Builder requestConfigBuilder) {
                        return requestConfigBuilder
                                .setConnectTimeout(10000)
                                .setSocketTimeout(120000);
                    }
                })
                .setHttpClientConfigCallback(new RestClientBuilder.HttpClientConfigCallback() {
                    @Override
                    public HttpAsyncClientBuilder customizeHttpClient(
                            HttpAsyncClientBuilder httpClientBuilder) {
                        return httpClientBuilder.setSSLContext(sslContext)
                                .setDefaultCredentialsProvider(credentialsProvider);
                    }
                });

        RestClient restClient = rcBuilder.build();
        // Create the transport with a Jackson mapper
        ElasticsearchTransport transport = new RestClientTransport(
                restClient, new JacksonJsonpMapper());
        return new ElasticsearchClient(transport);
    }

Here's my code trying to index the data into Elasticsearch:

    private void saveLogDataToES(ArrayList<LogData> data) throws Exception {
        // Connect to ElasticSearch server and obtain access to client API
        ElasticsearchClient client = getElasticSearchClient();
        BulkRequest.Builder br = new BulkRequest.Builder();
        // Convert my POJOs to ElasticSearch-compatible POJOs
        ArrayList<ESLogData> esDataList = convertToESLogData(data);

        for (ESLogData esdata : esDataList) {
            br.operations(op -> op
                .index(idx -> idx
                    .index("my_index")
                    .id(esdata.getId().toString())
                    .document(esdata)
                )
            );
        }

        BulkResponse result = null;
        try {
            result = client.bulk(br.build());
        } catch (Exception ex) {
            log.error(ex.getMessage());
            throw ex;
        }
    }

Why have you increased the request limit? It seems you are indexing log data, so I am surprised you would have log lines that require such large bulk requests. The bulk interface is for indexing multiple documents at once as this reduces overhead and the ideal size is often stated to be a few MB in size, and thus below the default limit. If you are parsing a large log you would extract lines into documemnts and send these to Elasticsearch in multiple requests.

Hi Christian...thanks for replying! You are correct I'm attempting to index log data. In this instance, I'm attempting to index a 300 MB log file. I was hoping to index everything at once and wasn't aware that the ideal size was just a few MB in size.

I'll follow your advice and try sending multiple bulk requests rather than one large one.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.