EntitlementBootstrap failure preventing startup: AttachNotSupportedException: Unable to open socket file

Several of our customer instances are hitting the following issue. Any idea what this is? It’s only happening in particular environments. I am currently trying to narrow down what is special about those environments and how to reproduce this, and I will update this thread as more details come in. I am just hoping that someone has some early thoughts, because this seems to be quite problematic for us.

The problem started around the time of our upgrade to 8.19.3.

[2025-10-01T18:10:10,247][ERROR][org.elasticsearch.bootstrap.Elasticsearch] [<redacted-hostname>] fatal exception while booting Elasticsearch
java.lang.IllegalStateException: Unable to attach entitlement agent [<redacted-path>/elasticsearch/lib/entitlement-agent/elasticsearch-entitlement-agent-8.19.3.jar]
	at org.elasticsearch.entitlement.bootstrap.EntitlementBootstrap.loadAgent(EntitlementBootstrap.java:128) ~[elasticsearch-entitlement-8.19.3.jar:?]
	at org.elasticsearch.entitlement.bootstrap.EntitlementBootstrap.bootstrap(EntitlementBootstrap.java:103) ~[elasticsearch-entitlement-8.19.3.jar:?]
	at org.elasticsearch.bootstrap.Elasticsearch.initPhase2(Elasticsearch.java:250) ~[elasticsearch-8.19.3.jar:?]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:98) ~[elasticsearch-8.19.3.jar:?]
Caused by: com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file /proc/230/root/tmp/.java_pid230: target process 230 doesn't respond within 10500ms or HotSpot VM not loaded
	at sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:104) ~[jdk.attach:?]
	at sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58) ~[jdk.attach:?]
	at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207) ~[jdk.attach:?]
	at org.elasticsearch.entitlement.bootstrap.EntitlementBootstrap.loadAgent(EntitlementBootstrap.java:121) ~[elasticsearch-entitlement-8.19.3.jar:?]
	... 3 more

Never seen this one before, but it suggests something very odd about the setup in this environment. This is the path to a Unix domain socket that Elasticsearch’s JVM creates and then uses to connect back to itself. I don’t think creating it could have failed or else we wouldn’t have got this far, but something is preventing Elasticsearch from connecting to it. It may be permissions or security policy, or strange usage of containers/namespaces, or the lack of a proper /proc filesystem, or some nonstandard JVM options, or indeed a nonstandard entire JDK. Or something else entirely - there isn’t much to go on in this message.
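
For what it’s worth, the operation in the stack trace is roughly equivalent to the following standalone sketch (not Elasticsearch’s actual code; the agent path is a placeholder, and since Java 9 the JVM must be started with -Djdk.attach.allowAttachSelf=true before it is allowed to attach to itself at all):

import com.sun.tools.attach.VirtualMachine;

// Minimal self-attach check: attach to our own PID and load an agent jar,
// which is the step failing in the stack trace above.
public class SelfAttachCheck {
    public static void main(String[] args) throws Exception {
        String ownPid = Long.toString(ProcessHandle.current().pid());
        // attach() triggers the attach handshake: the target JVM (here, ourselves)
        // must create /tmp/.java_pid<pid> and accept a connection on it.
        VirtualMachine vm = VirtualMachine.attach(ownPid);
        try {
            vm.loadAgent("/path/to/agent.jar"); // placeholder path
        } finally {
            vm.detach();
        }
    }
}

Running something like that inside the affected container, as the same user and with the same JVM options, might at least tell you whether the attach mechanism is broken independently of Elasticsearch.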

Agreed, but there isn’t anything else in the log. Is there any additional logging we can enable to shed some light on this?

So far we’ve only seen this on gov instances where I guess things might be locked down or hardened.

I can’t think of any, at least nothing short of strace to look at the actual syscalls involved. You need to provide more details of the environment to identify anything that might be preventing access to this socket.

This suggests to me a configuration in SELinux, AppArmor, or similar tools, or that your "app" is running inside a jail. You should be able to decide on the "nonstandard JVM" possibility fairly easily.

Well, if it started exactly with the upgrade to 8.19.3, then maybe something related changed in ES. You may be able to determine this from logs. But "around the time" of the upgrade is a little ambiguous. And we don't know anything about your environment or upgrade timings, e.g. your "gov instances" might have got an update to VeryImportantSecurityWare X.Y around the same time.

Out of curiosity, you wrote that "our customer instances are hitting the following issue" and "this seems to be quite problematic for us". What are you doing now to work around or mitigate the issue, if anything?

Well, yeah, probably the introduction of Entitlements :slight_smile:

Customers are unable to take in hotfixes / security patches because the system won’t even start up. They are reverting to before the upgrade to mitigate this.

You should be able to decide on the "nonstandard JVM" possibility fairly easily.

Our customers are in a k8s environment and are using the JRE we provide and cannot change it, since their container is, for the most part, read only. So I don’t think the problem is the JVM.

And we don't know anything about your environment

Yes, I’ll let you all know when we’ve narrowed down the environment problem. But for the most part, since this is k8s and we’re using containers (Linux, amd64), the environments are identical to the environments that are working.

We’ve had a customer work around the problem by changing the k8s securityContext to set runAsUser and fsGroup to one very particular user, but we’re still trying to understand why this helps, because the user being set isn’t even the elasticsearch server user, nor does it own any of the files in the container.

Sorry about being vague, but I’ll update you when we have more info.

Current leading theories:

  1. The path.logs dir has a bad permission (though nothing has changed since 8.18, so I don’t know in what way it could be bad).

  2. Maybe it’s related to https://bugs.openjdk.org/browse/JDK-8226919, though I’d have expected both us and you to have already encountered this bug. This seems to be present in OpenJDK versions <= 17.0.16; 17.0.17 is not out yet.

This seems to be present in OpenJDK versions <= 17.0.16; 17.0.17 is not out yet

You’re using 8.19.3, which bundles OpenJDK 24.0.2+12, and that includes the fix for JDK-8226919.

nothing has changed since 8.18

Except that 8.18 didn’t need to be able to connect back to its own attach port. Not that I can see how permissions on a logs directory might affect this.

But given what @buitcj said, it may not be the bundled version, and @buitcj makes specific reference to OpenJDK 17, which has a tick in the compat matrix for 8.19.x, so ...

In passing, you may recall an entitlements error when path.repo was set to /? It would also not have been obvious (to a normal user at least) that this erroneous setting, which had apparently been in place for ages, would cause that specific error.

@buitcj I appreciate it's your systems, and you get to decide what to share, but the more ambiguous things are, the harder it is for others; conversely, the more you share, the more you help us help you. Even "the path.logs dir has a bad permission" isn't particularly clear; the permissions are bad in what specific way? And, I guess, permissions can be changed/fixed; does doing so change anything? Of all the people who post here, @DavidTurner seems to me the most knowledgeable on the internals of the product/code and is rarely stumped, and even he has asked for a bit more detail if you can provide it.

OpenJDK 17, which has a tick in the compat matrix for 8.19.x

Well yeah but that comes with a massive caveat as described in these docs:

Although such a configuration is supported, if you encounter a security issue or other bug in your chosen JVM then Elastic may not be able to help unless the issue is also present in the bundled JVM. Instead, you must seek assistance directly from the supplier of your chosen JVM.

We run some amount of testing with nonstandard JVMs but much less than with the bundled one, and definitely won’t claim compatibility with unusually locked-down environments either. Running a JVM other than the bundled one is a bad idea.

Small update:

Yes, we use OpenJDK 17.0.16, which should be supported. To eliminate it as the cause of the problem, we removed it and used the JDK bundled with Elasticsearch 8.19 (Java 24, it looked like), and we still got the same error. So I think this rules out our second theory about the JDK bug? The bug I linked doesn’t mention Java 24 being affected (though it doesn’t say Java 17 was affected either, despite the fix having been backported to Java 17).


"the path.logs dir has a bad permission" isn't particularly clear, the permissions are bad in what specific way?

OK, I’ll try to explain as much as I think is relevant, but this might end up being more confusing…

working: drwxrwsr-x 3 root <our group> 4096 Oct 7 17:49 logs
not working: drwx------ 3 50017 50017 6144 Oct 8 16:52 logs

So normally we mount a logs volume and Elasticsearch’s path.logs points to it. Sometimes we mount a different logs volume here (for app-specific reasons) whose ownership is different; 50017 is a different user/group.

In 8.18.x, both mounts work and logging is fine, which tells me that Elasticsearch has no trouble writing to these places.

In 8.19.x we get an entitlement error with the 50017 50017 mount, even though the log files are writable.

And I’m not exactly sure how the error message (Unable to open socket file /proc/230/root/tmp/.java_pid230: target process 230 doesn't respond within 10500ms or HotSpot VM not loaded) corresponds with this log dir.

I agree the jdk test seems to rule out the OpenJDK 17 unfixed bug.

To be clear, the entitlements error only shows up in 8.19.x when you have set path.logs to the 700-permission logs directory, owned by "some other user", where the uid does not even map to a username? i.e. IF you were to change path.logs to point to somewhere more "standard" then 8.19.x also works?

IF you were to change path.logs to point to somewhere more "standard" then 8.19.x also works?

Yes. I also did a test where I changed only path.logs to /tmp (i.e., not the normal path that has a volume mounted on it), and then it works.

Is that not mystery solved?

The other thread I recalled, with the dodgy setting of path.repo, was similarly not obvious. I mean, setting path.repo to / is just not a great idea, but it would not be at all obvious to Joe Normal User that setting it to that would cause the entitlements error. Similarly, your pointing of path.logs at a directory with, er, unexpected ownership/permissions ... ?

$ ls -lad /var/log/elasticsearch
drwxr-s---. 2 elasticsearch elasticsearch 12288 Oct  5 10:25 /var/log/elasticsearch

are the OOTB path.logs directory permissions I get on a Red Hat box. Yours, the 50017 ones, are ... a lot different.

EDIT - If this is the root cause, then the fact that this permissions/ownership/entitlements issue surfaces as an error about Unable to open socket file /proc/230/root/tmp/.java_pid230 is not entirely helpful; I'd certainly second that!!

No, not yet “mystery solved”.

Our Kubernetes pod has fsGroup: <our group> specified, so the mounted logs dir (whether it is root <our group> or 50017 50017) should have all files recursively take on the <our group> group ownership, and so our user/group can modify everything under the logs mount. Long story short, our elasticsearch Linux user does have permission on all files under this logs volume.

Again, our Elasticsearch and logging appear to work in 8.18, and in 8.19 the entitlements check was added, which I ASSUME is just “checks”. So it seems weird that the checks are failing even though things have always been working. Are the checks too aggressive, maybe?

I’m also trying to get a better understanding of what the entitlement checks do.
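
So far my rough mental model (and this is only a sketch of the general shape of a runtime-loaded instrumentation agent, based on the standard java.lang.instrument API, not Elasticsearch’s actual entitlement code) is something like:

import java.lang.instrument.Instrumentation;

// Illustrative only: the general shape of an agent delivered via the Attach API.
public class IllustrativeAgent {
    // Called by the JVM once the attach + loadAgent handshake succeeds; the agent
    // jar's MANIFEST.MF must name this class in its Agent-Class attribute.
    public static void agentmain(String agentArgs, Instrumentation inst) {
        // A real entitlement-style agent would presumably register a
        // ClassFileTransformer here and retransform sensitive JDK classes
        // (file and socket operations, etc.) so they consult a policy first.
        System.out.println("agent loaded; retransform supported = "
                + inst.isRetransformClassesSupported());
    }
}

If that’s roughly right, it would also explain why the failure happens before any check ever runs: the agent can’t even be delivered if the attach socket can’t be opened.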

Mmm. This is broadly the same story as the other thread, which I checked again:

Note that that one also led to a fatal error on startup.

In the broadest sense, there's little point in "checking" if you are not going to do something based on the result of the checks. In this case it appears the ownership/permissions are considered an issue serious enough to cause a fatal startup error. Probably @DavidTurner or someone else will have to answer to clear up the doubt. Would it be a bug that this completely prevents startup? IMO yes, but without too much conviction. I do agree the way it fails, like the path.repo case, is not helpful or user-friendly from an admin perspective. And I also agree the set of "checks", and any consequences, should remain as consistent as possible through 8.x to 8.y upgrades, but it feels like that train has already left the station.

Update: I tried both Java 21 and Java 23 and we are still getting the entitlement error, so I think we can ignore this question/concern.

I forgot all about this post I created earlier, where I noticed that the elasticsearch-entitlement jar does not have a version 17 (nor a version 24, FWIW, which may be relevant since I’ve only tried against Java 17 and the pre-packaged Java 24): 8.18 entitlement-search jar compatible with jre 17?

$ pwd
<...>/elasticsearch-8.19.4/lib/elasticsearch-entitlement-8.19.4/META-INF/versions
$ ls
19      20      21      22      23

This is just interesting at the moment, though it would be nice to get a response on that thread.