Unable to delete documents from full index

Hi,

I have an application with one built-in Elasticsearch node used for collecting log events into 3 indices. The application is deployed at several sites.

At some point in the future the implementation is going to be changed to data streams with policies, but for the moment the oldest documents are deleted with a delete by query request executed by a Tomcat job.
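For reference, the cleanup is roughly a request of this shape per index (a sketch only; the timestamp field name here is illustrative, the real mapping may use a different field):

POST /rest_log_entry/_delete_by_query?max_docs=100000
{
  "query": {
    "range": {
      "@timestamp": {
        "lt": "now-13M/d"
      }
    }
  }
}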

However, in one specific deployment an index has reached the Lucene limit on the number of documents in an index, and the error

Number of documents in the index can't exceed [2147483519]

is logged by Logstash.

I have tried using a script to delete the oldest documents with delete by query requests, but this does not work when the index is full. The delete by query request is simply ignored and no documents are deleted. For the other indices, which are not full, the script works as expected.

I now want to try to re-index the oldest documents into a new index and then delete them.

My question is whether it is possible to reindex when the index is full?

Best regards
@fgjensen


What version of Elasticsearch?

Please show the output of the cat indices request for that index:

GET _cat/indices/indexname?v

Hello @fgjensen

Looking at the blogs, it seems your version must be 7.3.2. Could you please confirm?

I believe the split API can be used as per the link below. I have tried it, but only on a smaller index:


GET kibana_logs_success/_count #14074

PUT /kibana_logs_success/_block/write

POST /kibana_logs_success/_split/kibana_sample_logs_split
{
  "settings": {
    "index.number_of_shards": 10
  }
}

GET kibana_sample_logs_split/_count

I used 10 primary shards; you can consider a lower number.

Thanks!!

Hi

The version of Elasticsearch is 8.18.

We are maintaining the application but have not changed this part of it yet, except for upgrading to the latest versions of Logstash and Elasticsearch. The application does not use Kibana.

The delete_by_query is similar to the one stated above, except that I delete 100,000 documents per batch. When an index is not full, it takes around 100 seconds per batch.

There is only one Elasticsearch node and one shard per index.

BR Flemming

@fgjensen

Please show the cat indices.... There is a reason I'm asking for that.

GET _cat/indices/indexname?v


Hi @stephenb

Here is the output from _cat. I do not have access to the server on which Elasticsearch runs, so I have to do all the work by exchanging scripts and documents.

The script that performs the delete by query POST also performs a _cat indices request.

health status index                 uuid                   pri rep docs.count docs.deleted store.size pri.store.size dataset.size
yellow open   apclient_log_entry_v1 TL7gZYUfSGG1mzQqwPpcjQ   1   1     520413        64601     182025         182025       182025
yellow open   audit_entry_v1        uwuExeL1TXS5Y7tDliMbbg   1   1    1240849       134926     675169         675169       675169
yellow open   rest_log_entry        cNSInhLPT9KFelhbBHxNmA   1   1 2147483519            0  240451302      240451302    240451302
yellow open   rest_log_entry_v1     hrkgT7-kQIq4UuD7LR578w   1   1          0            0          0              0            0

Thanks for your help.

@fgjensen

Thanks. I was hoping there were still deleted docs in the index; then you could expunge them and the total count would come down. Unfortunately, you've already hit the limit and there are no deleted docs left in the index.
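For completeness, the expunge would have been something along these lines (only useful when docs.deleted is greater than zero, which is not the case for your full index):

POST /rest_log_entry/_forcemerge?only_expunge_deletes=true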

If you just run a query with no delete, does it find documents?

As @Tortoise suggested, I think you're going to need to split the index into more than one shard to work with it. The limit is actually at the shard level, and since you only have one primary shard, that's why you are limited.

If you split it into several shards you should be able to go back to working with it.

You need to think about a longer-term strategy for these indices, using more than one shard, otherwise you're going to keep running into this limit.
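Adapting the earlier example to your index, the split would look roughly like this (the target index name and shard count here are just placeholders, size them to your own growth):

# block writes so the index is read-only for the split
PUT /rest_log_entry/_block/write

# split the single primary shard into, for example, 4
POST /rest_log_entry/_split/rest_log_entry_split
{
  "settings": {
    "index.number_of_shards": 4,
    "index.number_of_replicas": 0
  }
}

# verify the document count matches the source
GET _cat/indices/rest_log_entry_split?v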

Hi @stephenb

The business rule I am trying to satisfy in the short term is to be able to insert new documents into the rest_log_entry index by deleting the oldest documents. No index must contain documents older than 13 months. Since the first two indices are not full, deleting the oldest documents works with both the script and the Tomcat job (once the Tomcat job is deployed to this server).

At the moment, the query the script runs finds the oldest and youngest documents in each index.

Since I cannot delete from the full index, my plan is to reindex the oldest documents into a second index. I'll take a look at splitting the index into several shards, if that works for a full index.

In the longer term I'll change to data streams and index lifecycle policies in order for Elasticsearch to take care of deleting the oldest documents.

Not sure exactly what that solves, since the old docs will still be in your main index. But you seem to have a handle on your problem.

It will. Then you should be able to delete.

Yes this will solve many issues for you....

Good luck, let us know how it goes.

Hi @stephenb

Thanks for a swift answer.

I understand splitting is the shortest way to be able to delete the oldest documents, so I will try that solution.

Thanks to all who have posted to this question.

BR @fgjensen

How much of the data is "older than 13 months" ? 1% 10% 95% ?

Splitting the index will write all the documents out to N new shards.

Re-indexing can select only the newer documents, also writing to a new index with however many shards you choose.

Depending on the numbers, reindexing only what you need might be quicker.

(don't forget to check disk space availability)
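If you go the reindex route, a rough sketch (the new index name and the timestamp field are illustrative):

# create the target with more primary shards and no replicas
PUT /rest_log_entry_new
{
  "settings": {
    "index.number_of_shards": 4,
    "index.number_of_replicas": 0
  }
}

# copy only the documents younger than 13 months
POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "rest_log_entry",
    "query": {
      "range": {
        "@timestamp": { "gte": "now-13M/d" }
      }
    }
  },
  "dest": {
    "index": "rest_log_entry_new"
  }
}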

I am going to take a mad guess and say... 1/13th of the data 🙂

But yes worth checking...

Actually, scratch that, the split API will be quicker in the vast majority of cases, as it should just be hard-linking a bunch of files.

The deleting of 1/13th (or however much there is) of data will take longer, but in that time the (new) index is usable.

The number of documents in each index is growing every month, since the application load is increasing. But that aside, the split documentation says:

Before you can split an index:

* The index must be read-only.
* The cluster health status must be green.

We can handle the read-only requirement, but this is a single-node cluster so it will never turn green.

Set replicas to 0 and the cluster will be green. Having replicas set above 0 on a single node will always result in a yellow cluster.
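Concretely, something like this on the existing indices (replicas can never be assigned on a single node anyway):

PUT /_all/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}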

A single-node cluster can be green. I have one right here on the Mac I'm using right now.

{
  "cluster_name": "elasticsearch",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "active_primary_shards": 316,
  "active_shards": 316,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "unassigned_primary_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100.0
}

Possibly you have indices with number_of_replicas set to 1, and nowhere to put the replicas? Set it to zero?


You can also update the index setting "index.merge.policy.deletes_pct_allowed" to below 20%.
This will automatically manage the retention of deleted docs during merge operations.
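For example (10 here is just an illustrative value below the 20% default; the setting is dynamic):

PUT /rest_log_entry/_settings
{
  "index.merge.policy.deletes_pct_allowed": 10
}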