Cluster State Yellow: 2 shards initializing with multiple failed attempts: IllegalArgumentException [ReleasableBytesStreamOutput cannot hold more than 2GB of data]

For about a week, we have been seeing the following error, and the cluster state is yellow. Checking _cluster/state shows this:

Elasticsearch version: 7.17 (please let me know if more data is needed)

{"state":"INITIALIZING","primary":false,"node":"SU7gTPbgSOuXoIrYteGYWA","relocating_node":null,"shard":2,"index":"","expected_shard_size_in_bytes":31061140722,"recovery_source":{"type":"PEER"},"allocation_id":{"id":"ufdEJ5FqRMqiN09EKM_s6A"},"unassigned_info":{"reason":"ALLOCATION_FAILED","at":"2023-05-22T08:56:04.059Z","failed_attempts":7,"failed_nodes":
...
IllegalArgumentException[ReleasableBytesStreamOutput cannot hold more than 2GB of data]; ]

Full Stack Trace -

IllegalArgumentException[ReleasableBytesStreamOutput cannot hold more than 2GB of data]; ], markAsStale [true]]
org.elasticsearch.transport.SendRequestTransportException: [S**200000*][xx.xx.xx.xx:9300][indices:data/write/bulk[s][r]]
	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:988) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor.sendWithUser(SecurityServerTransportInterceptor.java:206) ~[?:?]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor.access$300(SecurityServerTransportInterceptor.java:53) ~[?:?]
	at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$1.sendRequest(SecurityServerTransportInterceptor.java:167) ~[?:?]
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:874) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:797) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$ReplicasProxy.performOn(TransportReplicationAction.java:1282) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.support.replication.ReplicationOperation$3.tryAction(ReplicationOperation.java:279) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.support.RetryableAction$1.doRun(RetryableAction.java:99) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:288) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.support.RetryableAction.run(RetryableAction.java:77) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.support.replication.ReplicationOperation.performOnReplica(ReplicationOperation.java:298) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.support.replication.ReplicationOperation.performOnReplicas(ReplicationOperation.java:203) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.support.replication.ReplicationOperation.handlePrimaryResult(ReplicationOperation.java:150) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:136) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:447) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.bulk.TransportShardBulkAction$2.finishRequest(TransportShardBulkAction.java:233) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:196) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:245) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:134) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:74) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:196) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-7.17.5.jar:7.17.5]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_212]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_212]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_212]
Caused by: java.lang.IllegalArgumentException: ReleasableBytesStreamOutput cannot hold more than 2GB of data
	at org.elasticsearch.common.io.stream.BytesStreamOutput.ensureCapacity(BytesStreamOutput.java:175) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.common.io.stream.BytesStreamOutput.writeBytes(BytesStreamOutput.java:85) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.common.io.stream.StreamOutput.write(StreamOutput.java:516) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.common.bytes.BytesArray.writeTo(BytesArray.java:122) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.common.io.stream.StreamOutput.writeBytesReference(StreamOutput.java:189) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.index.IndexRequest.writeBody(IndexRequest.java:747) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.index.IndexRequest.writeThin(IndexRequest.java:728) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.DocWriteRequest.writeDocumentRequestThin(DocWriteRequest.java:266) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.bulk.BulkItemRequest.writeThin(BulkItemRequest.java:115) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.bulk.BulkShardRequest.lambda$writeTo$3(BulkShardRequest.java:72) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.common.io.stream.StreamOutput.writeArray(StreamOutput.java:941) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.bulk.BulkShardRequest.writeTo(BulkShardRequest.java:69) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$ConcreteShardRequest.writeTo(TransportReplicationAction.java:1387) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.action.support.replication.TransportReplicationAction$ConcreteReplicaRequest.writeTo(TransportReplicationAction.java:1459) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.transport.OutboundMessage.serialize(OutboundMessage.java:75) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.transport.OutboundHandler.sendMessage(OutboundHandler.java:180) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.transport.OutboundHandler.sendRequest(OutboundHandler.java:109) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.transport.TcpTransport$NodeChannels.sendRequest(TcpTransport.java:288) ~[elasticsearch-7.17.5.jar:7.17.5]
	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:975) ~[elasticsearch-7.17.5.jar:7.17.5]
	... 28 more

Hi @Sarit_Ghosh,

From the stack trace you have provided, it looks like the returned result set is larger than 2GB, which is more than Elasticsearch can return in a single request. Have you been able to identify the query that triggers this error?

I would recommend having a look at this related thread, which gives some suggestions further down on diagnosing the problem.

Hi Carly. Thanks for looking into this. We often see the cluster state go yellow, and whenever that happens, _cluster/state shows this error. It happens frequently for one specific index, for which the cluster state is yellow most of the time.
Are you suggesting that this could be caused by a query? Could you please suggest a few ways to find that query? Or can we restrict the response size with a setting?
I will go through the link you have given.

Thanks again.

Not quite: this exception is thrown during indexing, not search. The solution is to avoid creating such large bulk requests.
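As an illustration of keeping bulk requests small, here is a minimal Python sketch that splits a stream of documents into NDJSON bulk bodies capped by byte size and document count. The function name `iter_bulk_bodies`, the 10 MB / 1000-document limits, and the `(index_name, document)` input shape are assumptions made for this example, not something taken from the cluster in question; each yielded body would be sent as its own `_bulk` request.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Tuple


def iter_bulk_bodies(
    actions: Iterable[Tuple[str, Dict[str, Any]]],
    max_bytes: int = 10 * 1024 * 1024,  # assumed cap per bulk body
    max_docs: int = 1000,               # assumed cap on documents per body
) -> Iterator[bytes]:
    """Yield NDJSON bulk bodies built from (index_name, document) pairs.

    A body is flushed before adding an entry that would push it past
    max_bytes or max_docs. A single document larger than max_bytes is
    still emitted, alone in its own body.
    """
    entries = []
    size = 0
    for index_name, doc in actions:
        meta = json.dumps({"index": {"_index": index_name}})
        source = json.dumps(doc)
        entry = f"{meta}\n{source}\n".encode("utf-8")
        if entries and (size + len(entry) > max_bytes or len(entries) >= max_docs):
            yield b"".join(entries)
            entries, size = [], 0
        entries.append(entry)
        size += len(entry)
    if entries:
        yield b"".join(entries)


if __name__ == "__main__":
    # Hypothetical documents, only to show the batching behaviour.
    docs = (("my-index", {"id": i, "payload": "x" * 200}) for i in range(50))
    for n, body in enumerate(iter_bulk_bodies(docs, max_bytes=2048)):
        print(f"bulk request {n}: {len(body)} bytes")
```

If the official Python client is in use, its `helpers.bulk` and `helpers.streaming_bulk` functions take `chunk_size` and `max_chunk_bytes` arguments that serve the same purpose, so a hand-written splitter like this may not be needed.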

Thanks for the clarification @DavidTurner!

Thanks for all the responses. I will check the bulk requests being sent to the index.
