Ulimits for Elasticsearch instances

Hi,
I'm facing an issue with the configuration of resource limits.
On the host I have 18 Elasticsearch nodes (built with docker-compose).
I have taken the ulimit settings from the Elastic documentation into account.

my settings on docker-compose:

ulimits:
  memlock:
    soft: -1
    hard: -1
  nofile:
    soft: 65535
    hard: 65535
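
One way to confirm which limits a node is really running with is to read /proc/<pid>/limits for the Elasticsearch process. The following is a minimal Python sketch of parsing that file; the helper name `parse_limits` and the sample text are illustrative assumptions, not output from the host in this thread.

```python
import re


def parse_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits


# Illustrative sample only, not taken from the host discussed here.
SAMPLE = """\
Limit                     Soft Limit           Hard Limit           Units
Max open files            65535                65535                files
Max locked memory         unlimited            unlimited            bytes
"""

if __name__ == "__main__":
    parsed = parse_limits(SAMPLE)
    print(parsed["Max open files"])
    print(parsed["Max locked memory"])
```

To check a live process, pass the text of its limits file instead of the sample, for example `parse_limits(open(f"/proc/{pid}/limits").read())`, where `pid` is the process ID of the node.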

my settings on OS:

cat /etc/security/limits.conf

#*               soft    core            0
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#@student        -       maxlogins       4
#* soft memlock unlimited
#* hard memlock unlimited
#* hard rss unlimited
#* hard stack unlimited
#* soft nproc 40960
#* hard nproc 40960
#* - nofile 655360



* soft nofile 1024000
* hard nofile 1024000
* soft memlock unlimited
* hard memlock unlimited
elasticsearch   soft     nofile          65535
elasticsearch   hard     nofile          65535
elasticsearch   soft     memlock         unlimited
elasticsearch   hard     memlock         unlimited
root soft nofile 1024000
root hard nofile 1024000
root soft memlock unlimited

But the main problem is that these Elasticsearch instances have saturated the OS.

lsof | grep elasticsearch | wc -l
10728110

So Elasticsearch needs

elasticsearch   soft     nofile          65535
elasticsearch   hard     nofile          65535

but is that per node? Should this number be multiplied by 18 in my case?
In the end I couldn't even reach that host over SSH.

lsof massively overcounts the number of open file descriptors in multi-threaded processes. Use GET /_nodes/stats/process?filter_path=**.open_file_descriptors (or ls /proc/$ES_PID/fd | wc -l if you don't believe the stats API).
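
The /proc check mentioned above can also be scripted. This is a small Python sketch, assuming a Linux host; the function name `count_open_fds` and its `proc_root` parameter are hypothetical and exist only for illustration. It counts the entries under /proc/<pid>/fd, which is one entry per open descriptor, whereas lsof output can also contain rows that are not descriptors (memory-mapped files, for example).

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Count entries in <proc_root>/<pid>/fd, one per open file descriptor."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


if __name__ == "__main__":
    # Demonstration on this Python process itself (Linux only).
    print(count_open_fds(os.getpid()))
```

Reading another user's /proc/<pid>/fd directory normally requires root or the same user as the process.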

After enabling traffic I couldn't log in to the OS as the elasticsearch user.
Something has saturated the environment.

{
  "nodes" : {
    "BN6IoJI7SOuKOTmeYmjWJw" : {
      "process" : {
        "open_file_descriptors" : 1809
      }
    },
    "eT_c4mMDQo25o2THN2E6wg" : {
      "process" : {
        "open_file_descriptors" : 1796
      }
    },
    "7Ecx7eqAQUyVs0HJFUuI3A" : {
      "process" : {
        "open_file_descriptors" : 1643
      }
    },
    "3tI3P3iNTtWAbrmsBNIgzg" : {
      "process" : {
        "open_file_descriptors" : 1696
      }
    },
    "Er8iHbSLQkS4mcX6uq1mjA" : {
      "process" : {
        "open_file_descriptors" : 1791
      }
    },
    "iKnzU7anRp-9SCMw_6qq6g" : {
      "process" : {
        "open_file_descriptors" : 1812
      }
    },
    "ELZAP-hRTVyVZhO_CEar5w" : {
      "process" : {
        "open_file_descriptors" : 1801
      }
    },
    "I9QvicnmSiundLb2xfdakA" : {
      "process" : {
        "open_file_descriptors" : 1827
      }
    },
    "m8nsz3k9RIWSSdnWI8lPUg" : {
      "process" : {
        "open_file_descriptors" : 1807
      }
    },
    "3MaklgLxTHqy_w9RqOU_Hg" : {
      "process" : {
        "open_file_descriptors" : 1696
      }
    },
    "57jMdmTsQW-xLJY-IJvXTg" : {
      "process" : {
        "open_file_descriptors" : 1781
      }
    },
    "3Pj_uzQ2QjeHXz1HNPaiZw" : {
      "process" : {
        "open_file_descriptors" : 1787
      }
    },
    "9LiaFFeySAu1LqWK9sNCcg" : {
      "process" : {
        "open_file_descriptors" : 1836
      }
    },
    "u-fZoaISRtyD62M-XvoLOQ" : {
      "process" : {
        "open_file_descriptors" : 1812
      }
    },
    "oHE_JroUQZeSCuuDO9IxBQ" : {
      "process" : {
        "open_file_descriptors" : 1795
      }
    },
    "56QNouQoTpmQ-RGQZuLLtA" : {
      "process" : {
        "open_file_descriptors" : 1643
      }
    },
    "K15W_gVpRharFt4p6RzDBw" : {
      "process" : {
        "open_file_descriptors" : 1802
      }
    },
    "wa_e7cBJQHG_nhQdqxOzyA" : {
      "process" : {
        "open_file_descriptors" : 1696
      }
    },
    "wOMZvpx7QkW1IoDB9-hv5A" : {
      "process" : {
        "open_file_descriptors" : 1897
      }
    },
    "L2j7mumCQ8C802fxQUX5QA" : {
      "process" : {
        "open_file_descriptors" : 1696
      }
    },
    "vcjqq1ooTdGIzgeqO6wzGg" : {
      "process" : {
        "open_file_descriptors" : 1560
      }
    },
    "kP75VSSgT2KemScs6VBiUw" : {
      "process" : {
        "open_file_descriptors" : 1560
      }
    },
    "B7CW9g89RfO8QnMUZ1TIOQ" : {
      "process" : {
        "open_file_descriptors" : 1643
      }
    },
    "bXlYdDzOTj6shjCboPFstA" : {
      "process" : {
        "open_file_descriptors" : 1866
      }
    },
    "WtMS3evGTEm0_3s4FhPvZw" : {
      "process" : {
        "open_file_descriptors" : 1812
      }
    },
    "k2JmvFBAQROh3u0zT3yNiA" : {
      "process" : {
        "open_file_descriptors" : 1793
      }
    },
    "E7Hwq6HOT0-yVyD25p7iEA" : {
      "process" : {
        "open_file_descriptors" : 1816
      }
    },
    "_s4AllbHTTC_0suYPh03TQ" : {
      "process" : {
        "open_file_descriptors" : 1798
      }
    },
    "2pdjAuBrTyyFW8F4hrJRAQ" : {
      "process" : {
        "open_file_descriptors" : 1789
      }
    },
    "YO9wrnWKSkiypeVUsL3JuQ" : {
      "process" : {
        "open_file_descriptors" : 1807
      }
    },
    "cM7Cwc2aTLG009dFBR0nWg" : {
      "process" : {
        "open_file_descriptors" : 1778
      }
    },
    "xey4vUs0TnubCrFm9VBwrg" : {
      "process" : {
        "open_file_descriptors" : 1800
      }
    },
    "KioyT72kS9u-Necw5C_Llg" : {
      "process" : {
        "open_file_descriptors" : 1696
      }
    },
    "F6_BfiQCRyavf68-NvVfnA" : {
      "process" : {
        "open_file_descriptors" : 1643
      }
    },
    "6qyshpyBQbuYBD2YaAK_Bw" : {
      "process" : {
        "open_file_descriptors" : 1806
      }
    },
    "ACrIy__JTEqjcA5lXd17tQ" : {
      "process" : {
        "open_file_descriptors" : 1808
      }
    },
    "3G7XLBzaTlWrrx4iOcNqJw" : {
      "process" : {
        "open_file_descriptors" : 1643
      }
    },
    "5LgGtmNpRVmoGRWcc7OvCA" : {
      "process" : {
        "open_file_descriptors" : 1696
      }
    },
    "VGhy6V5LTaWoQpFbrFPz_g" : {
      "process" : {
        "open_file_descriptors" : 1809
      }
    },
    "1X7Ml2PRS7eke-8Zctd16g" : {
      "process" : {
        "open_file_descriptors" : 1770
      }
    },
    "oXmAk4FGQkikBzTHFS4aRw" : {
      "process" : {
        "open_file_descriptors" : 1696
      }
    },
    "nFG7Xy82RaitFbHQus4vQA" : {
      "process" : {
        "open_file_descriptors" : 1815
      }
    },
    "ygKlsWVKRRSJ00WiGUesQw" : {
      "process" : {
        "open_file_descriptors" : 1696
      }
    },
    "vbRvlx1LT5m99Ua6UP4GxA" : {
      "process" : {
        "open_file_descriptors" : 1781
      }
    },
    "oZ0-kSNdR_WA9_y-cfNrSw" : {
      "process" : {
        "open_file_descriptors" : 1560
      }
    },
    "Zqtl-JhEQP2ZVprNh33DcQ" : {
      "process" : {
        "open_file_descriptors" : 1696
      }
    },
    "opSKB-jzQMSEY2SmiYR2eQ" : {
      "process" : {
        "open_file_descriptors" : 1795
      }
    },
    "0kyZE-9EQB2RWi1tsAyRNA" : {
      "process" : {
        "open_file_descriptors" : 1798
      }
    },
    "-LMNosU-Rm2TkOcE6gZetg" : {
      "process" : {
        "open_file_descriptors" : 1643
      }
    },
    "1g395ltZSAif9NWU2i5OeQ" : {
      "process" : {
        "open_file_descriptors" : 1817
      }
    },
    "kzjpEz4tSKChzMNPhM2y5A" : {
      "process" : {
        "open_file_descriptors" : 1785
      }
    },
    "PMJwhWXiTre7rP4zyp2VRw" : {
      "process" : {
        "open_file_descriptors" : 1794
      }
    },
    "vCZH6-pkTbmjUKL7LHEvDw" : {
      "process" : {
        "open_file_descriptors" : 1797
      }
    },
    "Xx0xvsDkRYKfGgc1mz7iXA" : {
      "process" : {
        "open_file_descriptors" : 1785
      }
    }
  }
}
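
A response like the one above is easier to compare against the nofile limit once it is summarized. Here is a minimal Python sketch, assuming the response has been saved as JSON; the function name `summarize_fds` is hypothetical, and the sample contains only the first two nodes from the output above.

```python
import json


def summarize_fds(stats, nofile_limit):
    """Summarize open_file_descriptors across nodes of a _nodes/stats/process response."""
    counts = [
        node["process"]["open_file_descriptors"]
        for node in stats["nodes"].values()
    ]
    return {
        "nodes": len(counts),
        "total": sum(counts),
        "max": max(counts),
        # Distance between the busiest node and the per-process limit.
        "headroom": nofile_limit - max(counts),
    }


# First two nodes from the response above, kept short for illustration.
SAMPLE = json.loads("""
{
  "nodes": {
    "BN6IoJI7SOuKOTmeYmjWJw": {"process": {"open_file_descriptors": 1809}},
    "eT_c4mMDQo25o2THN2E6wg": {"process": {"open_file_descriptors": 1796}}
  }
}
""")

if __name__ == "__main__":
    print(summarize_fds(SAMPLE, nofile_limit=65535))
```

The value 65535 is the nofile limit from the docker-compose settings earlier in the thread.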

@DavidTurner Do you have a sample /etc/security/limits.conf configuration? In theory it should work with my settings. But it seems that even with Docker Engine 20.10.13 something goes wrong, and I don't know exactly how to troubleshoot it.

BTW, in earlier versions of Docker, ulimits didn't propagate correctly into the container
(see Docker Engine release notes | Docker Documentation).

Now, as an experiment, I've decreased the number of nodes (leaving only 1x master, 1x coordinator and 5x data on SSD disks) on each of 3 hosts, and the whole configuration is managed by Docker Swarm. Finally, I enabled traffic through NGINX as a load balancer to the coordinator nodes.

I'm wondering whether xpack.security.transport.ssl.enabled=true has some impact on this behavior. In my case it is not mandatory.

Elasticsearch says it's using fewer than 2000 file descriptors so the problem you're having doesn't look to be related to your file descriptor limits. I also don't think enabling TLS has any impact on the number of file descriptors in use.

Oh wait, you have 18 nodes on a single host?! Why so many? I think fewer, larger nodes will work better. If you're running them all as a single user on a single host then yes, you will need to increase the limits to match.

Yes, there are 18 nodes on a single host. This architecture is intentional and follows from the hardware:
5x 3.5T SSD disk
9x 2.5T HDD disk
so it was divided into:

1x coordinator node on SSD disk, 8 GB memory
2x master nodes on SSD disk (but I can reduce it to one per host), 8 GB memory
5x data hot nodes on SSD disk, 16 GB memory
7x data warm nodes on HDD disk, 8 GB memory
3x data cold nodes on HDD disk, 8 GB memory
-> so it gives 18 nodes per host
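
The layout above can be tallied with a short Python sketch. It assumes each memory figure is per node; the post does not say whether it refers to JVM heap or to the container limit.

```python
# (role, node count, memory per node in GB) as listed in the post.
# Assumption: the memory figures are per node.
LAYOUT = [
    ("coordinator", 1, 8),
    ("master", 2, 8),
    ("data hot", 5, 16),
    ("data warm", 7, 8),
    ("data cold", 3, 8),
]


def totals(layout):
    """Return (total node count, total memory in GB) for the layout."""
    nodes = sum(count for _, count, _ in layout)
    memory_gb = sum(count * mem for _, count, mem in layout)
    return nodes, memory_gb


if __name__ == "__main__":
    nodes, memory_gb = totals(LAYOUT)
    print(f"{nodes} nodes, {memory_gb} GB per host")  # 18 nodes, 184 GB per host
```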

I can also rebuild with fewer nodes, but my thinking was one node per disk :) or something like that...
I also have a large memory reserve, so it could scale.
