Sequence numbers & indexing bottleneck

We are trying to debug a problem where indexing a single document halts the ingestion pipeline on that document's shard.

Given how sequence numbers are implemented, are documents processed sequentially? I don't think so, since ES has a write thread pool. At what point does indexing fan out to use parallelism within a shard? What I'm trying to understand is this: can document A, which has been assigned seqno=1 and is taking a long time to index, block the indexing of document B with seqno=2? If not, can B be indexed (and become visible in queries) before A? In that case, what happens to seqno=1 if A fails? There has to be some sequential processing going on at some point, right?

No. Once the seqno is assigned, indexing happens in parallel, and indeed documents don't necessarily appear in seqno order.

If A fails, its operation is turned into a no-op so that seqno=1 is still marked as complete.
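
To make the answer concrete, here is a minimal toy model in Python. It is not Elasticsearch code: the class `ToySeqNoTracker`, its methods, and the `checkpoint` field are hypothetical names made up for illustration, and seqnos start at 0 here rather than at 1 as in the question. It only sketches the idea described above: seqno assignment is the sequential step, completion can happen in any order, and a failed operation is still marked complete (as a no-op) so that no gap is left behind.

```python
import threading


class ToySeqNoTracker:
    """Toy per-shard seqno bookkeeping (hypothetical, not Elasticsearch code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_seqno = 0
        self._completed = set()
        # Highest seqno such that it and every lower seqno are complete.
        self.checkpoint = -1

    def assign(self):
        # Assignment is the only strictly sequential step in this model.
        with self._lock:
            seqno = self._next_seqno
            self._next_seqno += 1
            return seqno

    def mark_completed(self, seqno):
        # Called when an operation finishes, whether it indexed a document
        # or was recorded as a no-op after a failure.
        with self._lock:
            self._completed.add(seqno)
            # Advance the checkpoint over any contiguous run of completed seqnos.
            while self.checkpoint + 1 in self._completed:
                self.checkpoint += 1
                self._completed.discard(self.checkpoint)


def demo():
    tracker = ToySeqNoTracker()
    a = tracker.assign()  # document A gets seqno 0
    b = tracker.assign()  # document B gets seqno 1
    log = []

    # B finishes first; A is still in flight, so the checkpoint cannot move.
    tracker.mark_completed(b)
    log.append(("B indexed", tracker.checkpoint))

    # A fails; its seqno is completed as a no-op, which closes the gap.
    tracker.mark_completed(a)
    log.append(("A failed, no-op", tracker.checkpoint))
    return log


if __name__ == "__main__":
    for event, checkpoint in demo():
        print(event, "-> checkpoint", checkpoint)
```

In this sketch B is fully processed while A is still pending, so a slow A does not block B. What A does hold back is only the contiguous "everything up to here is done" marker, and that marker catches up as soon as A either succeeds or is recorded as a no-op.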

