File descriptor leak after upgrade to 8.17.3

Hello,

I've tried searching around the forum for similar issues but couldn't find anything related, so apologies in advance if this has been raised before.

I've completed an upgrade of our Elasticsearch cluster from version 8.10.2 to 8.17.3 and started receiving alerts about an increased number of file descriptors in use.

When looking deeper into the issue, we noticed that nearly all of the open entries are .dvd (Lucene doc values) files that have already been deleted from disk but are still held open by the Elasticsearch process.

When running lsof -a -p 2356841 -d ^mem (the last line below is the total from piping the output through wc -l) we get the following:

COMMAND     PID     USER   FD      TYPE             DEVICE   SIZE/OFF       NODE NAME
java    2349214     1000  cwd       DIR               0,45       4096     539905 /usr/share/elasticsearch
java    2349214     1000  rtd       DIR               0,45       4096     524308 /
java    2349214     1000  txt       REG               0,45      12328     538322 /usr/share/elasticsearch/jdk/bin/java
java    2349214     1000  DEL       REG                9,0             188875029 /usr/share/elasticsearch/data/indices/fu2Qku6HQd-7hidVDZrFXw/2/index/_59vj_22qk_Lucene90_0.dvd
java    2349214     1000  DEL       REG                9,0             166462790 /usr/share/elasticsearch/data/indices/fmt6yzMqQyeAMyAq8PLLLw/1/index/_bjmj_3d4_Lucene90_0.dvd
java    2349214     1000  DEL       REG                9,0             130810117 /usr/share/elasticsearch/data/indices/QOyV3F2ZT4Wrkw2txuSO5A/0/index/_2ecro_6ft_Lucene90_0.dvd
java    2349214     1000  DEL       REG                9,0             130810172 /usr/share/elasticsearch/data/indices/QOyV3F2ZT4Wrkw2txuSO5A/0/index/_2ecro_6fs_Lucene90_0.dvd
java    2349214     1000  DEL       REG                9,0             130809928 /usr/share/elasticsearch/data/indices/QOyV3F2ZT4Wrkw2txuSO5A/0/index/_2ecro_6fr_Lucene90_0.dvd
...
24400

We basically get a very long list of DEL entries (about 24,400 on this node): files that have been deleted on disk but are still held open by the process.
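For anyone who wants to reproduce the check, one way to count just the DEL entries looks something like this (PID taken from the listing above, adjust for your own node):

# Count only the deleted-but-still-open entries (FD column shows "DEL")
lsof -a -p 2349214 -d ^mem | awk '$4 == "DEL"' | wc -l

# Roughly the same view via /proc; this counts mapped regions, so it can be higher
grep -c '(deleted)' /proc/2349214/maps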

Has anyone run into a similar issue? Is there anything I might be overlooking here?

Best regards,
Ricardo Ferreira

UPDATE:

After some hours, the number of open file descriptors eventually started dropping and stabilized.
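For comparison, the numbers Elasticsearch itself reports can be checked with the nodes stats API (a quick sketch, assuming a node reachable on localhost:9200):

# Open vs. maximum file descriptors as seen by each Elasticsearch node
curl -s 'localhost:9200/_nodes/stats/process?filter_path=nodes.*.name,nodes.*.process.open_file_descriptors,nodes.*.process.max_file_descriptors&pretty'

# Same information in tabular form
curl -s 'localhost:9200/_cat/nodes?v&h=name,file_desc.current,file_desc.max,file_desc.percent'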

The image shows the node_filefd_allocated metric from node-exporter, measured on all data nodes of the cluster.
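(If anyone wants to compare, the raw series can also be pulled straight from the Prometheus HTTP API; the address below is just a placeholder for wherever your Prometheus lives:)

# node_filefd_allocated for all scraped nodes
curl -s 'http://prometheus:9090/api/v1/query' --data-urlencode 'query=node_filefd_allocated'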

After the upgrade we see a big increase. A few hours in, we increased fs.file-max and then restarted the nodes one by one; the growing number of open files eventually subsided and stabilized at around 80,000.
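For reference, raising fs.file-max system-wide generally looks something like this (the value here is illustrative, not what we actually used):

# Check the current kernel-wide limit
sysctl fs.file-max

# Raise it for the running kernel (illustrative value)
sysctl -w fs.file-max=2097152

# Persist across reboots and reload
echo 'fs.file-max = 2097152' > /etc/sysctl.d/99-file-max.conf
sysctl --system

# Note: the per-process limit is separate from fs.file-max; check it with
grep 'Max open files' /proc/<pid>/limits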