@RainTown Thank you so much! You really made a superhuman effort to help a desperate random forum poster. Yes, -XX:+StartAttachListener does work around the problem. I think that’s a wrap, assuming Elasticsearch isn’t going to revert the commit that changed the working directory to the logs dir. I’ll quickly recap things here for posterity.
Normal attach mechanism
The JVM doesn't have the Attach Listener running.
The attacher (in this case Elasticsearch) [indirectly?] creates either /proc/<pid>/cwd/.attach_pid<pid> or /tmp/.attach_pid<pid>.
The JVM checks whether the .attach_pid<pid> file exists and, if so, starts the Attach Listener thread and creates a .java_pid<pid> socket endpoint to which the attacher can connect.
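For anyone who wants to poke at this on their own node, the file names above can be sketched in Python. This is only my understanding of the naming convention on Linux, not code from the JDK; the helper names are made up for illustration, and I'm assuming the socket lives in /tmp.

```python
from pathlib import PurePosixPath


def attach_trigger_candidates(pid: int) -> list:
    """Hypothetical helper: where the attacher may drop the trigger file, in order."""
    name = f".attach_pid{pid}"
    return [
        PurePosixPath(f"/proc/{pid}/cwd") / name,  # the target JVM's working directory
        PurePosixPath("/tmp") / name,              # the alternative location from the recap
    ]


def attach_socket_path(pid: int) -> PurePosixPath:
    """Hypothetical helper: socket endpoint the JVM exposes (assumed to be in /tmp)."""
    return PurePosixPath("/tmp") / f".java_pid{pid}"


if __name__ == "__main__":
    # 1234 is just an example PID.
    for candidate in attach_trigger_candidates(1234):
        print(candidate)
    print(attach_socket_path(1234))
```

Listing those paths with `ls -ln` while an attach is in progress shows which location was actually used and who owns the file.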
My application’s problem:
When the logs volume is requested through a PVC serviced by the EFS CSI driver, the UID and GID on that volume may not be those of the Elasticsearch user; they may instead be 50000 or even a dynamically generated ID (see gidRangeStart/End). This also affects files created in the volume: they may be assigned user and group 50000 even though the Elasticsearch user created them.
In 8.19, Elasticsearch changed its working directory to the logs dir.
Files created in the logs dir get the UID and GID ownership defined in the storageclass configs, so the .attach_pid<pid> file will have UID = GID = 50000, for example.
The JVM code AttachListener::is_init_trigger() in attachListener_linux.cpp will fail the attachment handshake if the UID of the file (e.g., 50000) does not match the EUID of the process (i.e., the Elasticsearch user).
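To make the failure mode concrete, here is a rough Python sketch of the ownership comparison described above. It is a simplification for illustration, not a port of the HotSpot code, which may apply further conditions; the function name and the `euid` override are my own.

```python
import os


def trigger_would_be_accepted(path, euid=None):
    """Return True if the file's owner UID equals the (effective) UID given.

    Simplified illustration of the check described in the recap; the real
    JVM logic may differ in details.
    """
    if euid is None:
        euid = os.geteuid()  # effective UID of the current process
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return False  # no trigger file, nothing to accept
    return st.st_uid == euid
```

Run as the Elasticsearch user against a file created in the EFS-backed logs dir, this should return False whenever the volume reports the file as owned by 50000 (or whichever ID the storageclass assigns) rather than by the user that actually created it.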
What does -XX:+StartAttachListener do?
It starts the Attach Listener thread at JVM startup and also creates the .java_pid<pid> socket file at startup, so the handshake doesn't need to happen. The problematic .attach_pid<pid> file doesn't need to be created or checked.
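One way to confirm the flag took effect is to look for the socket right after startup. The sketch below is a hypothetical helper, assuming the socket is created in /tmp as seen from inside the JVM's own container or mount namespace; the `tmp_dir` parameter is only there so the location can be overridden.

```python
import os
import stat


def attach_socket_ready(pid, tmp_dir="/tmp"):
    """Return True if <tmp_dir>/.java_pid<pid> exists and is a socket."""
    path = os.path.join(tmp_dir, f".java_pid{pid}")
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return False  # the JVM has not created the endpoint
    return stat.S_ISSOCK(mode)  # a regular file with that name does not count
```

If this returns True for the Elasticsearch PID before anything has tried to attach, the listener was started eagerly and the .attach_pid<pid> handshake should not be needed.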