Unable to push data into ES from Apache Spark


(Gaurav6351) #1

I created a jar file with sbt package and am using that jar to run my code.

bin/spark-submit --jars /home/clogeny/analyser/reference-apps/logs_analyzer/chapter1/scala/target/scala-2.10/spark-logs-analyzer_2.10-1.0.jar --class "com.databricks.apps.logs.chapter1.LogAnalyzer" /home/clogeny/analyser/reference-apps/logs_analyzer/data/apache.access.log
16/01/25 22:06:26 WARN Utils: Your hostname, clogeny-Inspiron-3442 resolves to a loopback address: 127.0.1.1; using 192.168.0.104 instead (on interface wlan0)
16/01/25 22:06:26 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
at com.databricks.apps.logs.chapter1.LogAnalyzer$.main(LogAnalyzer.scala:23)
at com.databricks.apps.logs.chapter1.LogAnalyzer.main(LogAnalyzer.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


When I don't pass the --jars option and run the command as

bin/spark-submit --class "com.databricks.apps.logs.chapter1.LogAnalyzer" /home/clogeny/analyser/reference-apps/logs_analyzer/chapter1/scala/target/scala-2.10/spark-logs-analyzer_2.10-1.0.jar /home/clogeny/analyser/reference-apps/logs_analyzer/data/apache.access.log

it says

Exception in thread "main" java.lang.NoClassDefFoundError: org/elasticsearch/spark/package$
at com.databricks.apps.logs.chapter1.LogAnalyzer$.main(LogAnalyzer.scala:33)
at com.databricks.apps.logs.chapter1.LogAnalyzer.main(LogAnalyzer.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.elasticsearch.spark.package$
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 11 more


(Daniel Mitterdorfer) #2

Hi,

This is not related to Elasticsearch but to your demo program. For the first invocation, the trace says it directly:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
	at com.databricks.apps.logs.chapter1.LogAnalyzer$.main(LogAnalyzer.scala:23)

So your program accesses an array at LogAnalyzer.scala:23, probably the command-line args, but none were provided on the command line.
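A likely reason the array is empty: spark-submit treats the first positional argument as the application jar and passes only what follows it to main. In the first command the jar was supplied through --jars, so the log file path appears to have been consumed as the application jar, leaving args empty. A minimal sketch of guarding against that, assuming the program expects the log path as its first argument (ArgsGuard and logFileArg are hypothetical names, not part of the original LogAnalyzer):

```scala
object ArgsGuard {
  // Returns the first argument if present, otherwise a usage message.
  // Assumption: the log file path is expected as args(0).
  def logFileArg(args: Array[String]): Either[String, String] =
    args.headOption.toRight("Usage: LogAnalyzer <path-to-access-log>")

  def main(args: Array[String]): Unit =
    logFileArg(args) match {
      case Right(path) =>
        // The real program would create its SparkContext and read the file here.
        println(s"Would analyze: $path")
      case Left(usage) =>
        // Fail with a readable message instead of ArrayIndexOutOfBoundsException.
        System.err.println(usage)
        sys.exit(1)
    }
}
```

With a check like this, a wrongly ordered command line would produce the usage message rather than a stack trace.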

The second invocation is missing necessary classes on the classpath (the trace names org.elasticsearch.spark.package$), so it cannot work.
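To confirm whether the connector classes are visible before the job reaches the Elasticsearch call, a small check along these lines may help. It is only a diagnostic sketch: ClasspathCheck and isOnClasspath are hypothetical names, and the class name is the one reported in the trace above.

```scala
import scala.util.Try

object ClasspathCheck {
  // True if the class loader that loaded this object can locate the named class.
  // The class is looked up without being initialized.
  def isOnClasspath(className: String): Boolean =
    Try(Class.forName(className, false, getClass.getClassLoader)).isSuccess

  def main(args: Array[String]): Unit = {
    // Class name taken from the NoClassDefFoundError in the trace.
    val connector = "org.elasticsearch.spark.package$"
    if (isOnClasspath(connector))
      println(s"$connector is available")
    else
      System.err.println(s"$connector is missing from the classpath")
  }
}
```

If the class is reported missing, the usual remedies are to list the connector jar under --jars in addition to the application jar, or to build an assembly jar that bundles the dependency, since sbt package includes only the project's own classes.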

Daniel

