ES high-level client bulk operations fail with java.lang.IllegalStateException

ref url in github:

Environment:
CentOS 7, Java 8, Spring Boot, Maven, the ES high-level REST client jar
Elasticsearch 6.7, running in Docker on an OSS service

Background:
This is a data-handler node that runs scheduled data sync jobs: it loads data from MySQL, does some processing, and then bulk-inserts the results into a target index in ES.

Issue:
I build a single, application-wide singleton ES client when the application starts:

```java
try {
    String template = "%s:%s";
    // Sequential stream: esHttpAddress is a shared list and not thread-safe,
    // so the original .parallel() was dropped here.
    HttpHost[] httpHosts = Arrays.stream(ES_ADDRESSES.split(",")).map(x -> {
        esHttpAddress.add(String.format(template, x, ES_HTTP_PORT));
        return new HttpHost(x, ES_HTTP_PORT, "http");
    }).toArray(HttpHost[]::new);

    RestClientBuilder builder = RestClient.builder(httpHosts)
            .setRequestConfigCallback((RequestConfig.Builder requestConfigBuilder) ->
                    requestConfigBuilder.setConnectTimeout(ES_CONNECT_TIMEOUT)
                            .setSocketTimeout(ES_SOCKET_TIMEOUT)
                            .setConnectionRequestTimeout(ES_CONNECTION_REQUEST_TIMEOUT))
            .setMaxRetryTimeoutMillis(ES_MAX_RETRY_TINEOUT_MILLIS);

    restHighLevelClient = new RestHighLevelClient(builder);

    bulkProcessor = BulkProcessor.builder((request, bulkListener) ->
                    restHighLevelClient.bulkAsync(request, COMMON_OPTIONS, bulkListener),
                getBPListener())
            .setBulkActions(ES_BULK_FLUSH)
            .setBulkSize(new ByteSizeValue(ES_BULK_SIZE, ByteSizeUnit.MB))
            .setFlushInterval(TimeValue.timeValueSeconds(10L))
            .setConcurrentRequests(ES_BULK_CONCURRENT)
            .setBackoffPolicy(BackoffPolicy.constantBackoff(TimeValue.timeValueSeconds(1L), 3))
            .build();
} catch (Exception e) {
    String errMsg = "Error happened when we init the ES client";
    log.error(errMsg, e); // log the stack trace via the logger instead of printStackTrace()
    throw new MainServiceException(ResultEnum.ES_CLIENT_INIT);
}
```
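The `getBPListener()` hook is not shown in the post. A minimal sketch of what such a `BulkProcessor.Listener` might look like (the method signatures match the 6.x client interface; the body is an assumption, not the poster's actual code) is below. Logging in `afterBulk(..., Throwable)` matters here, because a fully failed bulk, e.g. with "I/O reactor status: STOPPED", surfaces only through this callback:

```java
// Sketch only: assumes it lives in the same component that builds the processor,
// and reuses that component's `log` field from the original snippet.
private BulkProcessor.Listener getBPListener() {
    return new BulkProcessor.Listener() {
        @Override
        public void beforeBulk(long executionId, BulkRequest request) {
            log.debug("Bulk {} about to send {} actions", executionId, request.numberOfActions());
        }

        @Override
        public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
            if (response.hasFailures()) {
                // Partial item failures do not throw; they must be checked explicitly.
                log.warn("Bulk {} had failures: {}", executionId, response.buildFailureMessage());
            }
        }

        @Override
        public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
            // The entire request failed, e.g. an IllegalStateException from a stopped I/O reactor.
            log.error("Bulk " + executionId + " failed entirely", failure);
        }
    };
}
```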

Every data-saving operation is then done by:

```java
bulkProcessor.add((DocWriteRequest) request);
```
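For context, a hedged sketch of how such a request might be built and how the processor should be shut down at the end of a job (the index, type, and field names are invented for illustration). Adding requests after the processor has been closed, or after the client's underlying I/O reactor has died, is one way to hit an `IllegalStateException`:

```java
// Placeholder names; not from the original post.
Map<String, Object> doc = new HashMap<>();
doc.put("synced_at", System.currentTimeMillis());
bulkProcessor.add(new IndexRequest("target-index", "_doc", "doc-1").source(doc));

// At the end of the sync job: flush buffered actions and wait for in-flight bulks.
// Do NOT call add() after this point -- the processor is closed.
boolean terminated = bulkProcessor.awaitClose(30L, TimeUnit.SECONDS);
```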

But when I start the data sync jobs, a java.lang.IllegalStateException is thrown, reporting that the I/O reactor status is STOPPED.

Does this mean that the bulkProcessor or restHighLevelClient has an upper limit on the number of requests, and that it closes the connection when too many requests are submitted?

Please format your code, logs or configuration files using the </> icon as explained in this guide, and not the citation button. It will make your post more readable.

Or use markdown style like:

```
CODE
```

This is the icon to use if you are not using markdown format:

There's a live preview panel for exactly this reason.

Lots of people read these forums, and many of them will simply skip over a post that is difficult to read, because it's just too large an investment of their time to try and follow a wall of badly formatted text.
If your goal is to get an answer to your questions, it's in your interest to make it as easy to read and understand as possible.
Please update your post.

Also what are the values of all the constants?

Cheers dadoonet, I just fixed the formatting of my comment.

Sorry for that.

> Also what are the values of all the constants?

The related constants are:

```
ES_CONNECT_TIMEOUT            5000
ES_SOCKET_TIMEOUT             400000
ES_CONNECTION_REQUEST_TIMEOUT 1000
ES_MAX_RETRY_TINEOUT_MILLIS   600000
ES_BULK_SIZE                  5000
ES_BULK_FLUSH                 5000
ES_BULK_CONCURRENT            3
```

The ES ports are the defaults, 9200/9300.

Am I reading this correctly in that you have configured the max bulk size to be 5000 MB, which is far greater than the 100 MB limit imposed by the server by default? The recommended max bulk size is a few MB, so I would recommend changing that (if I read that correctly).


Yeah, you are correct, I set the bulk size to 5000 MB for now, but there is another parameter that also controls when a bulk is submitted:

```java
.setFlushInterval(TimeValue.timeValueSeconds(10L))
```

I thought the flush interval would make the bulk be submitted every 10 seconds, so the data my service adds would never actually reach 5000 MB.

I will decrease the bulk size as well then, thanks bro.
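The interaction the two posters describe can be checked with a little arithmetic: the BulkProcessor flushes on whichever trigger fires first (action count, byte size, or flush interval). A small sketch, assuming a hypothetical ingest rate of 2 MB/s, shows the 10-second interval firing long before a 5000 MB buffer could fill; but a single oversized batch would still be rejected by the server's default 100 MB `http.max_content_length`, so lowering the size cap is still the right move:

```java
public class BulkTriggerMath {
    public static void main(String[] args) {
        double ingestRateMbPerSec = 2.0;   // assumed rate, for illustration only
        double bulkSizeMb = 5000.0;        // the poster's setBulkSize value
        double flushIntervalSec = 10.0;    // the poster's setFlushInterval value

        // Seconds needed to fill the byte-size buffer at this rate.
        double secondsToFillBuffer = bulkSizeMb / ingestRateMbPerSec; // 2500 s
        boolean intervalFiresFirst = flushIntervalSec < secondsToFillBuffer;

        // Size of each batch actually flushed by the interval trigger.
        double batchMb = ingestRateMbPerSec * flushIntervalSec; // 20 MB, under the 100 MB server limit

        System.out.println(intervalFiresFirst + " " + batchMb);
    }
}
```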

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.