How can we use ES-Hadoop in AWS EMR to connect to an external Elasticsearch?

Dear all,

I have AWS EMR, and an ELK stack installed on a separate server. Since Hadoop is already installed in EMR, how can we connect it to Elasticsearch? How much does ES-Hadoop help us in AWS EMR, and what configuration do we need for it?

Generally, when running solutions in a cloud environment, it's best to make sure your clusters are accessible to each other on their private networks. This configuration depends on your cloud provider. Alternatively, ES-Hadoop allows you to set es.nodes.wan.only, which disables node discovery and filtering so that your job sends and receives data only through the cluster's gateway. Note that this approach can hurt performance.

Thanks, James. Yes, my ELK stack is also on AWS EC2 instances in a private network. I need to connect to Elasticsearch and pull the data into EMR for processing. Does ES-Hadoop help with this requirement?

As mentioned above, there are two ways to access your data across cloud deployments:

  1. Configure your deployments to allow access across their private networks. This must be configured in your cloud vendor's platform. ES-Hadoop cannot do this for you.
  2. Set es.nodes.wan.only to true to pull and push data through the gateway for your cloud deployment of Elasticsearch.
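
To make the two options concrete, here is a minimal sketch of the connector settings each one might use. The helper name `build_es_options`, the IP addresses, and the gateway hostname are made up for illustration; only the setting keys (`es.nodes`, `es.port`, `es.nodes.wan.only`) come from ES-Hadoop itself.

```python
def build_es_options(nodes, port=9200, wan_only=False):
    """Return ES-Hadoop connector settings as a dict of strings.

    `nodes` is a list of Elasticsearch hosts reachable from the EMR cluster.
    This helper is hypothetical; it only assembles documented setting keys.
    """
    if not nodes:
        raise ValueError("at least one Elasticsearch host is required")
    return {
        "es.nodes": ",".join(nodes),
        "es.port": str(port),
        # "true" disables node discovery, so all traffic goes through
        # the listed hosts only (the gateway case).
        "es.nodes.wan.only": "true" if wan_only else "false",
    }


# Option 1: EMR and Elasticsearch share a private network (placeholder IPs).
private_opts = build_es_options(["10.0.1.15", "10.0.1.16"])

# Option 2: only a gateway endpoint is reachable (placeholder hostname).
gateway_opts = build_es_options(["es-gateway.example.internal"], wan_only=True)
```

Assuming the ES-Hadoop Spark connector jar is available to your EMR job, a dict like this could be passed to a Spark read, for example `spark.read.format("org.elasticsearch.spark.sql").options(**private_opts).load("my-index")`, where `my-index` is a placeholder index name.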
