Hi all! I’m curious how document updates (including upserts and partial upserts) work under the hood.
I know that Lucene segments are immutable and that “deleting” a doc is a soft delete by way of a tombstone marker. But how is the prior doc with the same doc ID found so that its tombstone bit can be set? When a document with a given ID is ingested, is an actual query issued to the cluster to find the existing doc with that ID? Or is there an additional data structure of some kind that keeps these lookups fast?
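To make the question concrete, here is a toy Python sketch of my current mental model. Everything in it (`Segment`, `ToyIndex`, the `deleted` set) is a hypothetical name I made up for illustration, not Lucene's actual API or on-disk layout. The lookup is a naive scan over every segment, which is exactly the part I assume is done more cleverly in practice.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy stand-in for an immutable segment (hypothetical, not Lucene's API)."""
    docs: tuple  # (doc_id, source) pairs, never modified after creation
    deleted: set = field(default_factory=set)  # tombstones: positions marked deleted

    def live_docs(self):
        # Only documents without a tombstone are visible to readers.
        return [
            (doc_id, source)
            for pos, (doc_id, source) in enumerate(self.docs)
            if pos not in self.deleted
        ]


class ToyIndex:
    def __init__(self):
        self.segments = []

    def index(self, doc_id, source):
        # Naive lookup: scan every segment for an older doc with this ID
        # and tombstone it. This is the step my question is about.
        for segment in self.segments:
            for pos, (existing_id, _) in enumerate(segment.docs):
                if existing_id == doc_id:
                    segment.deleted.add(pos)
        # The new version goes into a fresh segment; old segments are untouched.
        self.segments.append(Segment(docs=((doc_id, source),)))

    def get(self, doc_id):
        for segment in self.segments:
            for live_id, source in segment.live_docs():
                if live_id == doc_id:
                    return source
        return None
```

In this model, indexing the same ID twice leaves the old version physically present in its segment, just hidden behind the tombstone.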
I’m keen to gain a better understanding! Thanks!