BulkProcessor API not indexing data to ES

Hi,

I am using BulkProcessor to ingest data into Elasticsearch. My index is created, but I cannot see any data ingested into it.
I added logging to the beforeBulk and afterBulk listener methods. The log lines are printed, but the data is not ingested.

This is the code I am using:

bulkProcessor.add(new IndexRequest(INDEX_NAME, type, id)
        .source(sourceJson, XContentType.JSON));

bulkProcessor.awaitClose(5, TimeUnit.SECONDS);

Can you share more code?

This is how I am creating the BulkProcessor:
BulkProcessor
        .builder(ESConnectionFactory.newConnection(),
                new BulkProcessor.Listener() {

                    @Override
                    public void beforeBulk(long executionId, BulkRequest request) {
                        logger.info("Before bulk");
                    }

                    @Override
                    public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
                    }

                    @Override
                    public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
                        logger.info("After bulk");
                    }
                })
        .setBulkActions(1)
        .setFlushInterval(TimeValue.timeValueSeconds(5))
        .setConcurrentRequests(1)
        .setBackoffPolicy(
                BackoffPolicy.exponentialBackoff(TimeValue.timeValueMillis(100), 3))
        .build();

My mistake: the JSON had some problems, so the data was not getting indexed.

One more related question.
Previously I used RestClient, and there I was prepending the following line to each document:
String actionMetaData = String.format("{ \"index\" : { \"_index\" : \"%s\", \"_type\" : \"%s\" } }%n", index, type);

This way I built the JSON for 500+ docs, wrapped it in an entity object, and ingested the data into Elasticsearch through RestClient.performRequest.

This operation was very quick. But now that I use the transport client with bulk processing, it takes much longer. Is there any way to do it as with RestClient, where I can send the whole list of documents in a single request as-is, without iterating over it?
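
For reference, the REST-style bulk body described above can be sketched in plain Java. This is a minimal illustration, not code from the thread: the class name BulkBodyBuilder and its build method are hypothetical, each document is assumed to already be a single-line JSON string, and "\n" is used in place of %n so the line separator does not depend on the platform.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of any Elasticsearch client): builds the
// newline-delimited body that was being sent through RestClient.
class BulkBodyBuilder {

    // Assumes every entry in docs is already serialized as single-line JSON.
    static String build(String index, String type, List<String> docs) {
        StringBuilder body = new StringBuilder();
        for (String doc : docs) {
            // Action metadata line, same shape as the String.format call above.
            body.append(String.format(
                    "{ \"index\" : { \"_index\" : \"%s\", \"_type\" : \"%s\" } }\n",
                    index, type));
            // Source line belonging to that action.
            body.append(doc).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("{\"a\":1}", "{\"a\":2}");
        System.out.print(build("my-index", "doc", docs));
    }
}
```

The returned string is what would go out as the body of one bulk request, which is why 300 documents cost a single round trip in the RestClient version.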

Not sure I understand the whole thing, but a bulk request over REST and a bulk request over the transport client should behave very similarly.

If you see a noticeable difference, with transport slower than REST, then I suspect you are doing something wrong or are not testing against exactly the same Elasticsearch version.

@dadoonet Let me explain the data. I have a JSON file that contains an array of around 300 to 400 elements. I iterate over this array, treat each element as one document, and add it to the bulkProcessor. So it iterates over 300 elements and adds them to the bulkProcessor one by one, which takes time.

That is a single file with 300 elements, but I will have many files like it. Inserting the data from, say, 20 files takes about 5 minutes. Maybe I am doing it the wrong way.

Previously, with RestClient, I added the line mentioned above to each document and inserted the whole array of elements, so 300 docs went out in a single request. That was very quick: inserting the data for 20 files took no more than a minute.

Is there anything similar for the transport client or bulkProcessor where I can send the whole array of 300 elements in one go?

Why did you set setBulkActions(1)? That is going to make it slow, because you will basically send documents one at a time instead of sending a batch of 1000 or 10000 docs at once.
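
To make the cost concrete, here is a toy model of count-based flushing in plain Java. It is not the Elasticsearch BulkProcessor: the class CountingBatcher is hypothetical, it sends nothing, and it only counts how many requests a given batch size would produce, assuming one flush every bulkActions documents plus one final flush on close for any remainder. The time-based flush interval is ignored.

```java
// Toy model only (hypothetical class): counts flushes, sends nothing.
class CountingBatcher {
    private final int bulkActions;
    private int pending = 0;
    private int requestsSent = 0;

    CountingBatcher(int bulkActions) {
        if (bulkActions < 1) {
            throw new IllegalArgumentException("bulkActions must be >= 1");
        }
        this.bulkActions = bulkActions;
    }

    // Queue one document; flush as soon as the batch is full.
    void add() {
        pending++;
        if (pending >= bulkActions) {
            flush();
        }
    }

    // Flush whatever is still queued, as a final partial batch.
    void close() {
        if (pending > 0) {
            flush();
        }
    }

    private void flush() {
        requestsSent++;
        pending = 0;
    }

    // Number of requests needed to send `docs` documents with this batch size.
    static int requestsFor(int docs, int bulkActions) {
        CountingBatcher batcher = new CountingBatcher(bulkActions);
        for (int i = 0; i < docs; i++) {
            batcher.add();
        }
        batcher.close();
        return batcher.requestsSent;
    }

    public static void main(String[] args) {
        System.out.println(requestsFor(300, 1));    // 300 requests
        System.out.println(requestsFor(300, 1000)); // 1 request, sent on close
    }
}
```

Under these assumptions a 300-element file costs 300 round trips with a batch size of 1, and a single round trip with a batch size of 1000, which matches the gap reported between the two approaches.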
