I'd like to set up an Nginx proxy to secure my ES cluster, but I'm stuck combining the Nginx proxy + ES + Spark. I bound network.host
to _local_
to prevent clients from connecting directly, bypassing the proxy.
http.port: 9200
transport.tcp.port: 9300
network.host: _local_
transport.host: _non_loopback:ipv4_
My Nginx configuration is below (nothing different from the ES manual):
server {
    listen 19200;
    server_name localhost;

    location / {
        proxy_pass http://localhost:9200/;
        proxy_http_version 1.1;
        proxy_set_header Connection "Keep-Alive";
        proxy_set_header Proxy-Connection "Keep-Alive";
    }
}
Every node has the same settings, as shown below, and all ES plugins (head/Marvel/Kibana) and my Python script worked well after adding the proxy.
+------------------+ +------------------+ +------------------+
|Node 1 | |Node 2 | |Node n |
+------------------+ +------------------+ +------------------+
| Nginx | | Nginx | . | Nginx |
|(listen on 19200) | |(listen on 19200) | . |(listen on 19200) |
| | | | . | |
| Elasticsearch | | Elasticsearch | | Elasticsearch |
|(listens on 9200) | |(listens on 9200) | |(listens on 9200) |
+------------------+ +------------------+ +------------------+
I've changed my Spark configuration like this (without the Nginx proxy, Spark worked well):
conf.set("spark.driver.allowMultipleContexts", "true")
conf.set("es.index.auto.create", "true")
conf.set("es.nodes.discovery", "true")
conf.set("es.nodes", "es_hostname:19200")
But after setting up the proxy, I'm getting an exception: org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException. I enabled TRACE
logging and realized that Spark was trying to connect directly to 127.0.0.1:9200
or public_ip:9200
, which is why I'm getting the EsHadoopNoNodesLeftException.
After reading [ES Hadoop Configuration], I found es.net.proxy.http.host
and es.net.proxy.http.port
. These settings made Spark work, but the problem is that ONLY the proxy node receives the _bulk
requests, so its incoming traffic is very high.
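For reference, this is a sketch of how I set those options, in the same style as my Spark configuration above (assuming es_hostname is one of the nodes running the Nginx proxy on port 19200):

```
// Sketch: route all es-hadoop HTTP traffic through one Nginx proxy node.
// es_hostname:19200 is an assumption -- substitute your own proxy host/port.
conf.set("es.nodes", "es_hostname:19200")
conf.set("es.net.proxy.http.host", "es_hostname")
conf.set("es.net.proxy.http.port", "19200")
```

With this, every bulk request from every Spark task goes through that single proxy, which is exactly why that one node's incoming traffic is so high.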
Question.
Can I conclude that es.net.proxy.http.(host,port)
is the only way to combine a proxy + ES + Spark? Is there another way to use an Nginx proxy with Spark?
I'm using ES 2.3.2 and ES-Hadoop 2.3.2.
Thanks.