Elasticsearch documents getting deleted

We’ve been experiencing an issue where certain documents in our Media index go missing. What’s unusual is that it’s often the same documents that disappear each time.

Here’s what we’ve observed:

  • We have a process that deletes older media when new media is indexed.

  • Initially, we suspected that deletes might be happening after new documents were indexed, but the behavior is inconsistent.

    • In some cases, documents disappear within seconds of indexing.

    • In other cases, the same documents stay for weeks before disappearing.

  • Because of this, we’re not sure if the issue is tied to our delete logic or something else.

We also tried enabling audit logs to investigate further, but we’re struggling with filtering. Specifically:

  • We delete documents based on an ExpandKey field.

  • We’d like the cluster to log only delete events where ExpandKey starts with a certain prefix, instead of logging all delete operations (which creates a lot of noise).

We’d really appreciate guidance on:

  1. Best practices for tracking down unexpected deletions in Elasticsearch/OpenSearch.

  2. How to configure audit logging (or another mechanism) to capture only the deletes that match certain field criteria.

P.S. I have realized that we are using routing when indexing documents but not when deleting them. I will try to rectify this, but could this be the cause of the disappearing documents?
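To illustrate what I mean, here is a simplified sketch of indexing with a routing value and then deleting without one, assuming the OpenSearch.Client high-level .NET client (the type, field names, and method here are made up for illustration, not our actual code):

    using System.Threading.Tasks;
    using OpenSearch.Client;

    public class MediaDoc
    {
        public string Id { get; set; }
        public string DataSourceId { get; set; }   // e.g. "1371"
    }

    public static class RoutingMismatchExample
    {
        public static async Task RunAsync(IOpenSearchClient client, MediaDoc doc, string expandKey)
        {
            // Index WITH a routing value: the routing value decides which shard the document lives on.
            await client.IndexAsync(doc, i => i
                .Index("media_1_7")
                .Id(doc.Id)
                .Routing(doc.DataSourceId));

            // Delete by query WITHOUT a routing value: the query is fanned out to every shard,
            // so it should still find and delete the documents. It is single-document
            // deletes/gets by id that can silently miss when the routing value is omitted.
            await client.DeleteByQueryAsync<MediaDoc>(d => d
                .Index("media_1_7")
                .Query(q => q.Term("Metadata.ExpandKey.keyword", expandKey)));
        }
    }

So my current understanding is that the routing mismatch would mainly affect deletes and gets by id rather than our _delete_by_query calls, but I would still like to make them consistent.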

OpenSearch/OpenDistro are AWS-run products and differ from the original Elasticsearch and Kibana products that Elastic builds and maintains. You may need to contact them directly for further assistance. See What is OpenSearch and the OpenSearch Dashboard? | Elastic for more details.

(This is an automated response from your friendly Elastic bot. Please report this post if you have any suggestions or concerns :elasticheart: )

Which product are you using? It is quite possible that the answer depends on this.

It sounds like you might be using delete by query to remove old data. It would be useful to have more detail about exactly how your deletion process works.

This will depend a lot on the product and version used, which you have not specified.

That depends on how you delete data.

  1. I am using the managed OpenSearch Service from AWS.

  2. Yes, I am using delete by query.

So for our media index, whenever we receive an update for a piece of media, we first delete the previously existing media documents using an ExpandKey.

Here’s an example:

    POST /media_1_7/_delete_by_query
    {"query":{"query_string":{"query":"Metadata.ExpandKey.keyword: 13711637256"}}}

The ExpandKey is a concatenation of {datasourceID}{listingID}, so {1371}{1637256} becomes 13711637256.

Different data sources can have the same ListingID, but the combination should be unique across the whole system.
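In code the key is built by simple string concatenation, roughly like this (a simplified sketch, not the exact production code):

    // Hypothetical helper showing how the key is put together.
    static string BuildExpandKey(string dataSourceId, string listingId)
        => dataSourceId + listingId;   // "1371" + "1637256" => "13711637256"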

So before indexing the new media, we have code to delete older media like so:

        indexRecords = indexRecords.Where(doc => doc.DoIndex).ToList();

        // run a delete before indexing the documents
        var deleteResponses = await indexRecords
            .Where(x => x.Indexing.DeleteQuery != null)
            .Select(_esRepo.DeleteDocumentAsync)
            .WhenAll()
            .ConfigureAwait(false);

        foreach (var error in deleteResponses.Where(x => x.OriginalException != null))
        {
            // log errors
        }

        // now index all the records
        var bulkResponse = await _esRepo.IndexDocumentsAsync(indexRecords);
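For context, _esRepo.DeleteDocumentAsync boils down to issuing the _delete_by_query shown earlier. Roughly like this, assuming the OpenSearch.Client .NET API (a simplified, hypothetical version, not the real repository code):

    using System.Threading.Tasks;
    using OpenSearch.Client;

    // Hypothetical sketch of what the repository method does; the real code differs.
    public static async Task<DeleteByQueryResponse> DeleteByExpandKeyAsync(
        IOpenSearchClient client, string deleteQueryJson)
    {
        // deleteQueryJson holds the raw query, e.g.
        // {"query_string":{"query":"Metadata.ExpandKey.keyword: 13711637256"}}
        return await client.DeleteByQueryAsync<object>(d => d
            .Index("media_1_7")
            .Query(q => q.Raw(deleteQueryJson)))
            .ConfigureAwait(false);
    }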
And to answer the version question: we are on OpenSearch version OpenSearch_2_11_R20250630.

I hope this answers your questions!

OpenSearch is a different product from Elasticsearch so I would recommend you reach out to the OpenSearch community or AWS support. Their implementation of security and audit logging is completely different from Elasticsearch and I do not know whether there are any special limitations or peculiarities related to their managed service.

Are the two parts fixed length, or is it possible that {1371}{1637256} could exist at the same time as e.g. {137}{11637256}? Both would concatenate to the same ExpandKey, 13711637256.
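To make that concrete, here is a tiny sketch (values taken from your example) showing how two different pairs collapse into the same key unless the parts are fixed length or separated by a delimiter:

    using System;

    class ExpandKeyCollision
    {
        static void Main()
        {
            // Two different (dataSourceId, listingId) pairs...
            var a = "1371" + "1637256";   // datasource 1371, listing 1637256
            var b = "137" + "11637256";   // datasource 137,  listing 11637256
            Console.WriteLine(a == b);    // True: both are "13711637256"

            // ...stay distinct with a separator (or fixed-width, zero-padded parts):
            var a2 = "1371" + "-" + "1637256";   // "1371-1637256"
            var b2 = "137" + "-" + "11637256";   // "137-11637256"
            Console.WriteLine(a2 == b2);         // False
        }
    }

If a pair like that exists, a _delete_by_query on ExpandKey.keyword: 13711637256 would remove both listings' media, which would also explain why it is often the same documents that disappear.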

I think you might’ve caught the problem! I am so shocked I didn’t consider it xD

Thanks a lot! I think the problem has mostly been resolved.