Dear All,
I have a basic query about which ports to open at the firewall level for an ECE cluster that includes an external LB.
Suppose I have a very small setup with an external LB and three each of proxy, director, ES master and kibana_server; the remaining servers are data nodes. The allocators hold these roles, i.e. the allocator1 VM has the ES_master1 container, the allocator2 VM has the ES_master2 container and the allocator3 VM has the ES_master3 container. The proxies and directors are separate instances. The other allocators hold the data nodes, for example. In my scenario all internal ports are already open, so I only need to think about which external ports to open from any source to ECE. I appreciate your help.
For all incoming remote log data sent from any source via beats agents to the ECE cluster:
A) Do we specify "proxy-hostname:ES_port" (proxyserver1:9243, proxyserver2:9243, proxyserver3:9243) in the Elasticsearch output config of all the beats agents?
I assume that if we specify the proxy servers, all incoming log data will be forwarded internally from the proxy servers to the available Elasticsearch cluster. Is my assumption correct?
B) Or do we need to specify "ES_Hostname:9243" directly (es_hostname1:9243, es_hostname2:9243, es_hostname3:9243) in the Elasticsearch output config of all the beats agents?
If we specify the Elasticsearch host and port in the beats config, what is the role of the proxy and LB in the ECE cluster for incoming connections?
C) Or do we need to specify "LB:9243" (LB1:9243, LB2:9243) in the Elasticsearch output of all the beats agents?
Is it correct to specify LB:port in the beats config under the Elasticsearch output? I assume the request will hit the LB, which forwards it to a proxy, which then forwards it internally to Elasticsearch.
Kibana port:
When accessing the Kibana portal from any source, should we hit LB:9343? I assume the LB will forward the request to a proxy and then on to Kibana automatically, with no manual changes required. Is my understanding correct? Basically, firewall ports would be opened at the LB level only:
LB:9343 --> proxy:9343 --> Kibana:9343.
The standard process for sending data to a specific ES cluster (with cluster id xxx) is to have a wildcard DNS entry (*.lb-cname -> lb-cname), where lb-cname is the host of the load balancer (eg which round robins proxyserverN).
Then you would configure beats to hit xxx.lb-cname:9243
(You can instead hit lb-cname:9243 but with one of various headers, eg Host: xxx or X-Found-Cluster: xxx)
If you don't specify xxx anywhere the request will simply fail at the ECE proxy (because it won't know which cluster to send to, it will never broadcast requests)
Kibana also listens on 9243 and the process is identical to hitting ES (ie Kibana has a different id yyy and you'd connect to yyy.lb-cname:9243)
9343 is for the (deprecated) native transport protocol, which most people don't use
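To illustrate the header-based alternative mentioned above, here is a minimal Python sketch that builds (but does not send) a request to the bare load balancer host with the cluster id carried in the X-Found-Cluster header. The function name `request_via_header` and the placeholder values `lb-cname` and `xxx` are illustrative assumptions, not part of ECE itself.

```python
import urllib.request


def request_via_header(lb_cname: str, cluster_id: str, path: str = "/") -> urllib.request.Request:
    """Build a request to the LB host that names the target cluster in a header.

    Hypothetical helper: the request is only constructed, never sent.
    """
    url = f"https://{lb_cname}:9243{path}"
    # Without a cluster id the ECE proxy cannot tell which cluster to route to.
    return urllib.request.Request(url, headers={"X-Found-Cluster": cluster_id})


req = request_via_header("lb-cname", "xxx")
print(req.full_url)
# urllib normalises header capitalisation; HTTP header names are case-insensitive.
print(req.header_items())
```

With the wildcard DNS approach the header is unnecessary, because the cluster id is already part of the hostname.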
Thanks for your reply, Alex. Honestly, I did not understand much of what xxx.lb-cname, yyy.lb-cname and plain lb-cname mean. But I am sure I will be using the same proxies and directors to manage another cluster within that environment.
Where can I see the lb-cname? Is it the LB VIP IP or the LB FQDN?
When sources such as beats or logstash (or any other) with SSL enabled send requests to Elasticsearch for indexing, do they use only 9243, or both 9243 and 9343?
And likewise, with SSL enabled, if anyone wants to access Kibana, is it 9343 or again 9243?
As an example, I have a cluster with id 9c145b1289a37d5e9076a3fac24d5d7f. It also has a Kibana with id 241d53848745a44ac812912e13af4f234 (the only place you can see that is the "Launch Kibana/Copy Kibana Endpoint" elements on the deployment page).
I have a load balancer with DNS name us-east-1.aws.found.io, ie that points to a load balancer that will round-robin to my proxy hosts.
The DNS is set up so that *.us-east-1.aws.found.io resolves to us-east-1.aws.found.io (eg google for wildcard DNS for more details on this if necessary)
Then to make a request to the ES cluster, I would use the host 9c145b1289a37d5e9076a3fac24d5d7f.us-east-1.aws.found.io:9243; and to connect to the Kibana I would use 241d53848745a44ac812912e13af4f234.us-east-1.aws.found.io:9243
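The hostname scheme in this example can be sketched in a few lines of Python. The helper name `deployment_endpoint` is hypothetical, and the https scheme is an assumption based on 9243 being the HTTPS port; the ids and the load balancer name are the ones from the example above.

```python
def deployment_endpoint(resource_id: str, lb_cname: str, port: int = 9243) -> str:
    """Return the endpoint for one ES or Kibana resource behind the ECE proxy.

    The wildcard DNS entry makes <id>.<lb_cname> resolve to the load balancer;
    the proxy then uses the id in the hostname to pick the target.
    """
    return f"https://{resource_id}.{lb_cname}:{port}"


LB_CNAME = "us-east-1.aws.found.io"
ES_ID = "9c145b1289a37d5e9076a3fac24d5d7f"
KIBANA_ID = "241d53848745a44ac812912e13af4f234"

es_url = deployment_endpoint(ES_ID, LB_CNAME)
kibana_url = deployment_endpoint(KIBANA_ID, LB_CNAME)
print(es_url)
print(kibana_url)
```

The first value is what would go into the Elasticsearch output of beats or logstash; the second is what a browser would use for Kibana.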
On the Admin and Cloud UI access side I see 12300, 12343, 12400, 12443. For secured access, which of these ports need to be opened for access from outside?
And what will be the URL for the Admin & Cloud UI, per your example above?
Another thing: on the logstash receiving end for beats, is 5044 the only port that needs to be opened, or are there more?
9c145b1289a37d5e9076a3fac24d5d7f = ES / 241d53848745a44ac812912e13af4f234 = Kibana
Yep, in my example. Of course these are just two random strings, the important point being that there are two of them for each deployment (the deployment, ie the collection of ES/Kibana/APM/etc, has yet another id like the above, but it's only used in the ECE control plane API/UI)
So only 9243 will be used across all beats as well as logstash configs for destination elasticsearch hostnames
Correct
And what about other ports like 9200, 9300, 9343?
9200 is just the HTTP version of 9243
9300/9343 are for native client (which you are probably not using)
On the Admin and Cloud UI access side I see 12300, 12343, 12400, 12443. For secured access, which of these ports need to be opened for access from outside?
In practice you will only need 12400 and 12443, since /api:124xx gets routed to /api:123xx; the 123xx ports are just there for legacy reasons
And what will be the URL for the Admin & Cloud UI, per your example above?
Normally you'd configure a load balancer to round robin to adminconsole-host1:12443, adminconsole-host2:12443 etc.
The choice of DNS name for this load balancer is up to you. Some people re-use the proxy LB host (so eg us-east-1.aws.found.io:12443 would route to the UI/API), some people create a separate DNS entry (eg console.found.io:12443)
Another thing: on the logstash receiving end for beats, is 5044 the only port that needs to be opened, or are there more?
That I don't know, logstash is deployed separately to ES. If that's the port the beats are connecting to, you shouldn't need any other ports - but I'm not a LS set-up expert so don't quote me on that
Does this routing happen automatically at the proxy level?
So in short, these are the ports that need to be opened at the firewall from any source to the ECE cluster:
9200 and 9243 - HTTP and HTTPS (SSL) Elasticsearch & Kibana
12400 and 12443 - HTTP and HTTPS (SSL) for Admin and Cloud UI
In the above URL it states that :
"The first host you install ECE on initially requires the ports for all roles to be open, which includes the ports for the coordinator, allocator, director, and proxy roles. After you have brought up your initial ECE installation, only the ports for the roles that the initial host continues to hold need to remain open. Before installing a runner, make sure that the 20000, 21000, 22000 ports are open for the installation script checks."
Does the above mean the "Inbound Traffic from any Source (internet or Intranet)" or the "Inbound traffic from Internal components of ECE Cluster" ?
If it applies to "Inbound Traffic from any Source (internet or Intranet)", then the ports below need to be opened at the firewall level, correct?
12300 (HTTP), 12343 (HTTPS), 12400 (HTTP), 12443 (HTTPS), 9200 (HTTP), 9243 (HTTPS), 20000 (HTTP), 21000 (HTTP), 22000 (HTTP)
Kindly help me clarify
But still unsure of ports - 20000(HTTP), 21000(HTTP), 22000(HTTP) - whether they need to be opened for inbound from any source or outbound from ECE for the install script checks.
Does this routing happen automatically at the proxy level?
Yes, the UI server, which listens on 124xx, proxies /api to the API
So in short, these are the ports that need to be opened at the firewall from any source to the ECE cluster:
9200 and 9243 - HTTP and HTTPS (SSL) Elasticsearch & Kibana
12400 and 12443 - HTTP and HTTPS (SSL) for Admin and Cloud UI
If it applies to "Inbound Traffic from any Source (internet or Intranet)", then the ports below need to be opened at the firewall level, correct?
12300 (HTTP), 12343 (HTTPS), 12400 (HTTP), 12443 (HTTPS), 9200 (HTTP), 9243 (HTTPS), 20000 (HTTP), 21000 (HTTP), 22000 (HTTP)
More than these, eg line 6 of Inbound traffic from internal components of ECE has 18000-18999/20000-20999
Specifically, 20000/21000/22000 are only required for the install script, and the reason they are required is to sanity check that you haven't blocked off port ranges that are needed (such as the example given above)
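For anyone who wants to verify the firewall rules after opening them, a rough Python sketch of a reachability check follows. It covers only the externally facing ports summarised in this thread (9200/9243 and 12400/12443); the names `EXTERNAL_PORTS`, `port_is_open` and `check_ports` are made up for illustration, and the internal ECE port ranges are deliberately left out.

```python
import socket

# Externally facing ports as summarised in this thread (an assumption, not a
# complete list; internal ECE ranges are not covered here).
EXTERNAL_PORTS = {
    9200: "Elasticsearch/Kibana HTTP",
    9243: "Elasticsearch/Kibana HTTPS",
    12400: "Admin and Cloud UI HTTP",
    12443: "Admin and Cloud UI HTTPS",
}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def check_ports(host: str, ports=EXTERNAL_PORTS) -> dict:
    """Map each port number to whether it is reachable on the given host."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling `check_ports` with the load balancer hostname from a machine outside the firewall gives a quick view of which ports are reachable. A successful TCP connect only shows that the port is open, not that the request is routed to the right cluster.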