I use BulkProcessor to index records, and I see that some records go missing during bulk indexing. There are no exceptions or errors, and the afterBulk callback reports no failures either. The BulkProcessor is a singleton Spring bean, so I don't explicitly call awaitClose or flush. Any help in tracking down the cause would be greatly appreciated.
this.bulkProcessor = BulkProcessor.builder(
        (request, bulkListener) -> esClient.bulkAsync(request, RequestOptions.DEFAULT, bulkListener),
        new BulkProcessor.Listener() {
            @Override
            public void beforeBulk(long executionId, BulkRequest request) {
                logger.debug("going to execute bulk of {} requests", request.numberOfActions());
            }
            @Override
            public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
                if (response.hasFailures()) {
                    logger.error("bulk had failures: {}", response.buildFailureMessage());
                }
            }
            @Override
            public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
                logger.error("bulk request failed", failure);
            }
        }).build();
Ours is a continuously running application that keeps indexing data as events occur. We don't stop the application explicitly unless a deployment is needed. I also call close on the BulkProcessor when the Spring bean is about to be destroyed by the container.
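For reference, the shutdown hook looks roughly like this (a sketch, assuming the processor is held in a `bulkProcessor` field of the bean; the 30-second timeout is illustrative). Note that awaitClose, unlike close, flushes buffered actions and waits for in-flight bulk requests to finish, which matters during redeploys:

```java
import java.util.concurrent.TimeUnit;
import javax.annotation.PreDestroy;
import org.elasticsearch.action.bulk.BulkProcessor;

public class IndexingService {
    private final BulkProcessor bulkProcessor; // built as shown above

    public IndexingService(BulkProcessor bulkProcessor) {
        this.bulkProcessor = bulkProcessor;
    }

    // Invoked by Spring before the bean is destroyed. awaitClose flushes any
    // buffered actions and waits (up to the timeout) for in-flight bulks.
    @PreDestroy
    public void shutdown() throws InterruptedException {
        boolean flushed = bulkProcessor.awaitClose(30, TimeUnit.SECONDS);
        if (!flushed) {
            // Some requests may not have completed within the timeout.
        }
    }
}
```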
One observation: if the application runs on a single node, there are no missing records, but when it runs on multiple nodes, records go missing.
Yes, I am sure. Earlier I used highLevelRestClient.index(), which worked fine in the cluster, but it was too slow, so I moved to BulkProcessor. BulkProcessor is fast, but now I have this problem of missing records.
Do I have to check anything with respect to the write thread pools on the Elasticsearch server? The problem is that I don't see any rejections or errors from either the Elasticsearch server or my application.
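This is how I've been checking for write thread-pool rejections so far, via the cat thread pool API (a sketch; the host and port are assumed to be localhost:9200):

```shell
# Show active, queued, and rejected task counts for the write
# thread pool on every node in the cluster.
curl -s "http://localhost:9200/_cat/thread_pool/write?v&h=node_name,name,active,queue,rejected"
```

The rejected column has stayed at 0 on all nodes whenever I've looked.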