From my understanding, adding HAProxy on top of Elasticsearch can bring some high availability and load balancing. It also makes it possible to give every user/developer who wants to work with ES a single entry point URL (the HAProxy host endpoint).
However, the downside is that the Elasticsearch driver can no longer resolve/discover the network topology by itself, and cannot connect directly to the data nodes.
Additionally, regarding es-hadoop, you need to set es.nodes.wan.only to true with HAProxy. The documentation says this about it:
es.nodes.wan.only (default : false)
If true, the connector disables discovery and only connects through the declared es.nodes during all operations, including reads and writes. Note that in this mode, performance is highly affected.
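To make the setup concrete, here is a minimal sketch of the es-hadoop settings I have in mind when everything goes through the proxy. The host name haproxy.example.internal and the helper es_hadoop_conf are made up for illustration; es.nodes, es.port and es.nodes.wan.only are the connector settings discussed above.

```python
def es_hadoop_conf(proxy_host, proxy_port=9200):
    """Build es-hadoop settings for a cluster reachable only through a proxy.

    proxy_host is a hypothetical HAProxy endpoint, not a real data node.
    """
    return {
        # Only the proxy endpoint is declared; the data nodes stay hidden.
        "es.nodes": proxy_host,
        "es.port": str(proxy_port),
        # Disables discovery so the connector only talks to es.nodes.
        "es.nodes.wan.only": "true",
    }


# Hypothetical endpoint, used only for illustration.
conf = es_hadoop_conf("haproxy.example.internal")
for key, value in sorted(conf.items()):
    print(f"{key}={value}")
```

With these settings every read and write goes through the single HAProxy endpoint, which is exactly the mode the documentation warns about for performance.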
Could you please clarify whether HAProxy on top of Elasticsearch is a good practice (or not)?