I compared syscall traces from an 8.18.0 startup and an 8.19.5 startup. There are, of course, a zillion differences.
One thing I noticed is that very early in the boot process it effectively tries to do log rotation on gc.log.
8.18.0
11:36:09.152514 stat("/var/log/elasticsearch/gc.log", {st_mode=S_IFREG|0644, st_size=3048, ...}) = 0
11:36:09.152560 stat("/var/log/elasticsearch/gc.log", {st_mode=S_IFREG|0644, st_size=3048, ...}) = 0
11:36:09.152604 stat("/var/log/elasticsearch/gc.log.00", {st_mode=S_IFREG|0644, st_size=3056, ...}) = 0
...
8.19.5
11:41:11.775613 stat("gc.log", {st_mode=S_IFREG|0644, st_size=40452, ...}) = 0
11:41:11.775655 stat("gc.log", {st_mode=S_IFREG|0644, st_size=40452, ...}) = 0
11:41:11.775701 stat("gc.log.00", {st_mode=S_IFREG|0644, st_size=3056, ...}) = 0
11:41:11.775740 stat("gc.log.00", {st_mode=S_IFREG|0644, st_size=3056, ...}) = 0
11:41:11.775776 stat("gc.log.00", {st_mode=S_IFREG|0644, st_size=3056, ...}) = 0
...
Note that 8.19.5 doesn’t use the full paths: it has already done a chdir to /var/log/elasticsearch by this point, which is different from 8.18.0.
Suggestion for @buitcj: try pointing the location of the gc.log files to somewhere outside path.logs, e.g. /tmp, in jvm.options and see if it changes anything. Also, can you check whether the user running elasticsearch can chdir into your equivalent of /var/log/elasticsearch? In 8.18.0 it didn’t seem to need to do so, but in 8.19.5 it seems it does.
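For the permissions check, here is a minimal sketch of one way to do it. It assumes you run it as the same user that runs elasticsearch (for example via sudo -u), and the function name check_dir_access is just something I made up for illustration. It only reports whether the directory can be entered and whether files can be created in it; it does not try to reproduce anything Elasticsearch itself does.

```python
import os


def check_dir_access(path):
    """Return (can_chdir, can_create_files) for the user running this script."""
    # Entering a directory needs execute (search) permission on it.
    can_chdir = os.path.isdir(path) and os.access(path, os.X_OK)
    # Creating a file inside it additionally needs write permission.
    can_create = can_chdir and os.access(path, os.W_OK)
    return can_chdir, can_create


# /var/log/elasticsearch is the path from the traces above; substitute your own path.logs.
print(check_dir_access("/var/log/elasticsearch"))
```

Note that os.access checks against the real uid/gid, so the result is only meaningful when the script runs as the elasticsearch user rather than as root.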
Looking at calls involving the .java_pid / .attach_pid:
8.18.0
11:36:24.669061 stat("/proc/3519336/root/tmp/.java_pid3519336", 0x73e7e27fe340) = -1 ENOENT (No such file or directory)
11:36:24.669145 openat(AT_FDCWD, "/proc/3519336/cwd/.attach_pid3519336", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 EACCES (Permission denied)
11:36:24.669365 openat(AT_FDCWD, "/proc/3519336/root/tmp/.attach_pid3519336", O_RDWR|O_CREAT|O_EXCL, 0666) = 53
11:36:24.669725 readlink("/tmp/.attach_pid3519336", 0x73e7e27fbea0, 1023) = -1 EINVAL (Invalid argument)
11:36:24.670807 stat(".attach_pid3519336", 0x73e7b4cfda20) = -1 ENOENT (No such file or directory)
11:36:24.670913 stat("/tmp/.attach_pid3519336", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
11:36:24.672500 unlink("/tmp/.java_pid3519336.tmp") = -1 ENOENT (No such file or directory)
11:36:24.672575 bind(53, {sa_family=AF_UNIX, sun_path="/tmp/.java_pid3519336.tmp"}, 110) = 0
11:36:24.672703 chmod("/tmp/.java_pid3519336.tmp", 0600) = 0
11:36:24.672839 chown("/tmp/.java_pid3519336.tmp", 119, 125) = 0
11:36:24.672888 rename("/tmp/.java_pid3519336.tmp", "/tmp/.java_pid3519336") = 0
8.19.5
11:41:26.318501 stat("/proc/3544717/root/tmp/.java_pid3544717", 0x76c6e07fe320) = -1 ENOENT (No such file or directory)
11:41:26.318594 openat(AT_FDCWD, "/proc/3544717/cwd/.attach_pid3544717", O_RDWR|O_CREAT|O_EXCL, 0666) = 53
----> HERE
11:41:26.319543 readlink("/var/log/elasticsearch/.attach_pid3544717", 0x76c6e07fbe90, 1023) = -1 EINVAL (Invalid argument)
----> HERE
11:41:26.321214 stat(".attach_pid3544717", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
11:41:26.323904 unlink("/tmp/.java_pid3544717.tmp") = -1 ENOENT (No such file or directory)
11:41:26.323972 bind(53, {sa_family=AF_UNIX, sun_path="/tmp/.java_pid3544717.tmp"}, 110) = 0
11:41:26.324089 chmod("/tmp/.java_pid3544717.tmp", 0600) = 0
11:41:26.324210 chown("/tmp/.java_pid3544717.tmp", 119, 125) = 0
11:41:26.324260 rename("/tmp/.java_pid3544717.tmp", "/tmp/.java_pid3544717") = 0
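Note the readlink I highlighted: that path is not in the 8.18.0 trace. There the openat under cwd failed with EACCES, so the .attach_pid file was created in /tmp and the readlink targets /tmp instead.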
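To make the difference between the two traces easier to see, here is a rough simulation of the "try cwd first, then fall back to /tmp" sequence that the openat calls show. This is only a sketch of the behaviour visible in the traces, not the JVM's actual attach code; place_attach_file and its parameters are hypothetical names.

```python
import os


def place_attach_file(pid, cwd, tmp_dir):
    """Create .attach_pid<pid> in cwd if possible, else in tmp_dir.

    Returns the path that was created, or None if both attempts failed.
    """
    name = f".attach_pid{pid}"
    for directory in (cwd, tmp_dir):
        path = os.path.join(directory, name)
        try:
            # Same flags and mode as the openat calls in the traces.
            fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o666)
        except OSError:
            # e.g. EACCES, as seen for the cwd attempt in the 8.18.0 trace
            continue
        os.close(fd)
        return path
    return None
```

With a cwd the user cannot write to, this returns a path under tmp_dir, matching what 8.18.0 did. With a writable cwd such as /var/log/elasticsearch, the file lands there, matching what 8.19.5 did.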
In 8.19.5 I see these chdir calls on startup:
11:41:09.273311 chdir("/tmp/final-flags5522972383749294867") = 0
11:41:10.340103 chdir("/tmp/final-flags17689081138414072901") = 0
11:41:11.762493 chdir("/var/log/elasticsearch") = 0
11:41:53.267101 chdir("/usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin") = 0
In 8.18.0 I just saw this:
11:36:52.759074 chdir("/usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin") = 0
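If anyone wants to pull the same chdir comparison out of their own traces, here is a small sketch. It assumes the strace lines look like the ones pasted above (timestamp first, no PID column); chdir_calls and SAMPLE are illustrative names, and the regex would need adjusting for other strace options.

```python
import re

# Matches lines such as: 11:41:11.762493 chdir("/var/log/elasticsearch") = 0
CHDIR_RE = re.compile(r'^(\S+)\s+chdir\("([^"]*)"\)\s+=\s+(-?\d+)')


def chdir_calls(lines):
    """Return (timestamp, path, return_code) for every chdir line."""
    calls = []
    for line in lines:
        match = CHDIR_RE.match(line)
        if match:
            calls.append((match.group(1), match.group(2), int(match.group(3))))
    return calls


# Two lines copied from the 8.19.5 trace above; only the first is a chdir.
SAMPLE = [
    '11:41:11.762493 chdir("/var/log/elasticsearch") = 0',
    '11:41:11.775613 stat("gc.log", {st_mode=S_IFREG|0644, st_size=40452, ...}) = 0',
]

for timestamp, path, rc in chdir_calls(SAMPLE):
    print(timestamp, path, rc)
```

Running it over the full trace file for each version (for example by passing the open file object as lines) gives two lists that can be compared directly.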