Repair index after too many open files error

We had a 'too many open files' error on several nodes, and even though we fixed the underlying problem, one shard fell into a sort of limbo.

# Relevant _cluster/allocation/explain output:
node01.example.com = {"in_sync":true,"allocation_id":"xxxx1","store_exception":{"type":"file_system_exception","reason":"/var/lib/elasticsearch/nodes/0/indices/zzzzzzzid/_state: Too many open files"}}
node02.example.com = {"in_sync":false,"allocation_id":"yyyy2"}
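For reference, the output above comes from the allocation explain API for the affected shard; a call roughly like the one below should reproduce it (the index name and shard number here are placeholders, not our real values):

# Sketch -- replace index name and shard number with your own:
POST /_cluster/allocation/explain
{"index": "my-index", "shard": 0, "primary": true}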

The 'usual' POST /_cluster/reroute?retry_failed=true API call does not allocate the shard, and although the actual data can be found (more or less) on both node01 and node02, the two copies appear to differ: the Lucene segment files, translog files, and checkpoints have different file sizes and/or modification times. A sketch of the retry call and a follow-up check is below.
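For completeness, this is the shape of the retry call plus a cat-shards query to confirm the shard is still unassigned (the index name is a placeholder):

# Retry previously failed allocations, then check shard state:
POST /_cluster/reroute?retry_failed=true
GET /_cat/shards/my-index?v&h=index,shard,prirep,state,unassigned.reason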

Is there a way to force a resync without losing data? (Or while losing as little as possible...) The only 'force' option I'm aware of is allocate_stale_primary, sketched below, but that explicitly accepts data loss, so it's not really what I'm after.
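For reference, this is the last-resort reroute command that promotes a stale copy; it requires explicitly accepting data loss. The index name, shard number, and node name below are placeholders for our actual values:

# Last resort -- promotes the stale copy and accepts data loss:
POST /_cluster/reroute
{"commands": [{"allocate_stale_primary": {"index": "my-index", "shard": 0, "node": "node02.example.com", "accept_data_loss": true}}]}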

Restarting node01 resolved the problem. It looks like the allocator was stuck because of the lingering 'too many open files' store exception.
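In case someone else lands here: before (or after) restarting, it may be worth confirming that the raised file descriptor limit actually applied to the Elasticsearch process. A sketch of the checks we would use (standard cat and node-stats endpoints):

# Check current vs. maximum file descriptors per node:
GET /_cat/nodes?v&h=name,file_desc.current,file_desc.max,file_desc.percent
GET /_nodes/stats/process?filter_path=nodes.*.name,nodes.*.process.max_file_descriptors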
