I upgraded Elasticsearch -- no difference.
However, I seem to have come up with a partial solution.
The issue goes away when I remove the publication of port 9200 on the first service es01:
    ports:
      - 9200:9200
What seems to happen is that including this directive causes docker stack to add an additional network to each of the containers. Elasticsearch then somehow gets confused about which IP address to communicate with. This is shown in the logs above by this line:
p3es_es01.1.t417ccm7zh6h@boron | "Caused by: org.elasticsearch.transport.ConnectTransportException: [es01][10.0.0.35:9300] connect_exception"
The service es01 is not located at 10.0.0.35. Any attempt to connect to port 9300 (or 9200) at that address is rejected.
I can't tell if this is an Elasticsearch issue or a Docker issue.
For me, not publishing port 9200 is fine, because I always intended to confine that traffic to the private overlay network anyway for security. However, this might not be OK for someone else attempting the same setup, so it might be worth someone at Elasticsearch coming up with a more robust solution. It might simply be a configuration option that directs Elasticsearch to use the correct IP address, in which case the fix would just be a documentation update.
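For anyone who does need to keep port 9200 published, here is a rough sketch of what such a configuration option might look like in the stack file. It is untested against this setup. Elasticsearch has network.bind_host and network.publish_host settings, and publish_host accepts an interface name wrapped in underscores. The interface name eth0 is an assumption: the interface attached to the overlay network may be named differently in your containers, so check inside the container first.

```yaml
# Hypothetical fragment of the stack file; only the relevant parts of es01 are shown.
services:
  es01:
    environment:
      # Listen on every interface inside the container.
      - network.bind_host=0.0.0.0
      # Advertise the address of one specific interface to the other nodes.
      # "eth0" is an assumption: use whichever interface sits on the overlay network.
      - network.publish_host=_eth0_
    ports:
      - 9200:9200
```

If the advertised address then matches the one the other nodes can actually reach, the connect_exception above should no longer appear, but I can't confirm that this is the whole story.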
Thanks for the help, and hopefully this e-mail thread will save someone else the trouble.