Bulk Ingester flush timing issues with wait_for Refresh Policy

Hello,

I’m working with the bulk ingester in Elasticsearch 8.15.3 and running into timing issues with flush. I configured the bulk ingester with the refresh policy set to wait_for in the global settings, expecting the data to be searchable as soon as flush returns. However, flush doesn’t seem to wait for indexing to complete.

While reviewing the code, I noticed that the close method behaves differently: it calls flush and waits for it to finish, whereas flush on its own doesn’t seem to block until the operation is fully processed. Could anyone confirm whether this behavior is expected, or whether there’s a way to make flush wait synchronously?

Thanks in advance!

We are currently testing two possible approaches. Our primary requirement is that a client's searches always reflect that client's own earlier writes. To achieve this, we are relying on the 'wait_for' refresh option.

Previously, we used the BulkRequest API, but since we want to migrate away from the Elasticsearch 7 Java API, we started exploring the use of BulkIngester. Initially, we created one BulkIngester per request, as a single request can trigger the indexing of multiple documents.

Our first question: does this approach introduce too much overhead?

The second approach we're experimenting with, as suggested by Abadi, is using a single BulkIngester shared among multiple requests. However, we’re facing challenges in ensuring that the writes are available for querying.

Has anyone encountered a similar need? How do you generally handle this type of requirement?
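
For reference, here is a minimal sketch of how per-request waiting might look with a single shared ingester. It assumes that the ingester's listener callbacks receive one context per operation, that each operation is added with its request ID as the context, and that with wait_for a bulk response only arrives once a refresh has made the operations searchable. PendingWrites, register, onBulkSuccess and onBulkFailure are hypothetical names, not part of the Elasticsearch client; the sketch uses only the JDK.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper (not part of the Elasticsearch client): tracks how many
 * bulk operations of each client request are still unacknowledged and completes
 * a future once all of them have been reported back by the bulk listener.
 */
final class PendingWrites {

    private static final class Entry {
        final AtomicInteger remaining;
        final CompletableFuture<Void> done = new CompletableFuture<>();

        Entry(int count) {
            this.remaining = new AtomicInteger(count);
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    /** Call before adding operations; requestId is assumed to be each operation's context. */
    CompletableFuture<Void> register(String requestId, int operationCount) {
        Entry e = new Entry(operationCount);
        if (operationCount <= 0) {
            e.done.complete(null); // nothing to wait for
        } else {
            entries.put(requestId, e);
        }
        return e.done;
    }

    /** Call from the listener's success callback with the contexts of the finished bulk. */
    void onBulkSuccess(List<String> contexts) {
        for (String id : contexts) {
            if (id == null) {
                continue; // operation was added without a context
            }
            Entry e = entries.get(id);
            if (e != null && e.remaining.decrementAndGet() == 0) {
                entries.remove(id);
                e.done.complete(null);
            }
        }
    }

    /** Call from the listener's failure callback; fails every request in that bulk. */
    void onBulkFailure(List<String> contexts, Throwable failure) {
        for (String id : contexts) {
            if (id == null) {
                continue;
            }
            Entry e = entries.remove(id);
            if (e != null) {
                e.done.completeExceptionally(failure);
            }
        }
    }
}
```

The idea would be to call register(requestId, n) before adding the n operations, forward the listener callbacks to onBulkSuccess and onBulkFailure, call flush so the operations are sent promptly, and have the request wait on the returned future (ideally with a timeout) before answering the client. A bulk that succeeds as a whole can still contain per-item errors, which this sketch does not inspect.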