[HADOOP] Connection and configuration Elastic-Hadoop

Hi all,

This is my first message on the Elastic community. I'm also new to Elasticsearch, and a little more skilled with the Hadoop-Spark ecosystem.

Now I need to connect Elasticsearch and Logstash with the Hadoop ecosystem, and I still can't get a working configuration.

I have read the official "elasticsearch-hadoop" documentation many times, but I still can't work out exactly where to put my ES configuration.

This is my current setup:

Hadoop: ver 2.10.0
Spark: ver 3.0
Elasticsearch: ver 7.8.1
Logstash: ver 7.8.1

This is my (NOT WORKING) configuration: I downloaded the archive "elasticsearch-hadoop-7.8.1.zip" and added these lines to my user's ".bashrc" file:

export ES_LIB="/absolute/path/to/elasticsearch-hadoop-7.8.1/dist";
export PATH=$PATH:$ES_LIB

Then I tried modifying the "hadoop-core-site.xml" file.
The Hadoop configuration file now lists these properties:

<configuration>
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://127.0.0.1:9000</value>
	</property>
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/opt/hadoop/tmp</value>
		<description>A base for other temporary directories</description>
	</property>
	<property>
		<name>hive.server2.enable.doAs</name>
		<value>false</value>
	</property>
	<property>
		<name>hadoop.proxyuser.hadoop.groups</name>
		<value>*</value>
	</property>
	<property>
		<name>hadoop.proxyuser.hadoop.hosts</name>
		<value>*</value>
	</property>
	<property>
		<name>es.resource</name>
		<value>index/html</value>
	</property>
	<property>
		<name>es.nodes</name>
		<value>localhost</value>
	</property>
	<property>
		<name>es.port</name>
		<value>9200</value>
	</property>
</configuration>

But I'm still not able to connect Hadoop and Elasticsearch.

Could you please tell me what my configuration is missing? Where exactly am I supposed to put the configuration files?

Thank you all so much

My new attempt is using sbt:

I downloaded "sbt" and configured it with a custom build file that looks like this:

name := "EstraiCW"
version := "1.0"
sbtVersion := "1.3.13"
scalaVersion := "2.12.10"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.0.0" % "provided"
Compile / scalaSource := baseDirectory.value / "/home/leonardo/MioArchivio/Progetti-Scala/EstraiCW.scala"
unmanagedJars in Compile += file("/home/leonardo/MioArchivio/Programmi/Elasticsearch-Hadoop/elasticsearch-hadoop-7.8.1/dist/elasticsearch-hadoop-7.8.1.jar")
unmanagedJars in Compile += file("/home/leonardo/MioArchivio/Programmi/Elasticsearch-Hadoop/elasticsearch-hadoop-7.8.1/dist/elasticsearch-hadoop-hive-7.8.1.jar")
unmanagedJars in Compile += file("/home/leonardo/MioArchivio/Programmi/Elasticsearch-Hadoop/elasticsearch-hadoop-7.8.1/dist/elasticsearch-spark-20_2.11-7.8.1.jar")

Now I want to submit a new Spark job with the jar generated by this build.
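
For reference, this is a sketch of the submit command I expect to need, not something I have working. The main class name `EstraiCW`, the `local[*]` master, and the application jar path under `target/` are my assumptions (I believe that is where `sbt package` writes its output by default); the connector jar path is the one from the build file above. As far as I understand the elasticsearch-hadoop documentation, `es.*` settings can be passed to Spark by prefixing them with `spark.`, so I reuse the values from my XML attempt. The command is only printed here so it can be checked first:

```shell
# Connector jar: same path as in the build file above.
ES_SPARK_JAR="/home/leonardo/MioArchivio/Programmi/Elasticsearch-Hadoop/elasticsearch-hadoop-7.8.1/dist/elasticsearch-spark-20_2.11-7.8.1.jar"

# ASSUMPTION: default output location of `sbt package` for this project.
APP_JAR="target/scala-2.12/estraicw_2.12-1.0.jar"

# ASSUMPTION: hypothetical name of the object containing main().
MAIN_CLASS="EstraiCW"

# Build the command as an array so every argument stays a single word.
SUBMIT_CMD=(
  spark-submit
  --class "$MAIN_CLASS"
  --master "local[*]"                    # ASSUMPTION: local test run
  --jars "$ES_SPARK_JAR"                 # ship the connector with the job
  --conf spark.es.nodes=localhost        # same values as the es.* XML properties
  --conf spark.es.port=9200
  --conf spark.es.resource=index/html
  "$APP_JAR"
)

# Print the command for review instead of running it.
printf '%s ' "${SUBMIT_CMD[@]}"; echo
```

Replacing the final `printf` line with `"${SUBMIT_CMD[@]}"` would actually run it. Note that the connector jar name ends in `_2.11`, which refers to the Scala version it was built for, while my build file sets Scala 2.12.10; I do not know whether that mismatch is part of my problem.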

Any ideas? Has anyone ever connected Elastic to Hadoop?
