We are using BulkProcessor to insert a large amount of data into a big cluster of 50+ data nodes. Our BulkProcessor is configured with 4 concurrent actions, and our servers have 80 vCPUs.
What we observe is that at times the ES cluster responds very slowly, leading to many NodeNotAvailable exceptions when nodes do not respond within 5 seconds.
We have a few questions about this scenario:
a. Suppose I fire a BulkProcessor batch. If I get a NodeNotAvailable exception or a RemoteTransportException, does it mean that any batch destined for that node will fail? If it fails, how will I know? The client does not give me any exception in the afterBulk callback.
b. Suppose I fire a BulkProcessor batch. With the above exception, does the node still ingest the batch data, or is that batch redirected to another node? My understanding is that, since the node is still alive in the cluster, it will still wait for that node to respond and insert on that node only. Please clarify.
Do any of these exceptions lead to data loss or duplicate data when a node starts throwing them while it is still part of the cluster and the cluster state is still green?