Using Elasticsearch Spark adapter in Jupyter notebooks with Python kernel

michele_crudele · November 27, 2015, 1:04pm

Hi,

In the past I used the Elasticsearch Spark adapter in Jupyter notebooks with the Scala kernel, adding the dependency with %AddJar file:///.../elasticsearch-spark_2.10-2.1.0.BUILD-SNAPSHOT.jar

I need to port my notebooks to Python, using the Python kernel. Is a Python binding available for the adapter? If so, how can I specify the dependency in the notebook? (%AddDeps and %AddJar are not available for the Python kernel.)
I'd be grateful if you could point me to any available documentation or a sample Jupyter notebook that could help me.

Thanks a lot,

Michele

costin · November 27, 2015, 1:23pm

ES-Hadoop/Spark is available only for the JVM; there's no native Python binding for it.
I'm not familiar enough with Python; however, you could work with ES by relying on the Input/OutputFormat, that is, by pulling in the Map/Reduce layer as explained here.
Note this is still standard Spark; in fact, it is Spark that picks up the formats and uses them internally.
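
For illustration, here is a minimal sketch of how the Input/OutputFormat route might look from a Python notebook. It assumes the ES-Hadoop jar is available locally and that Elasticsearch runs on its default host and port. The jar path, the helper names and the "index/type" resource are hypothetical placeholders, not something taken from this thread. Setting PYSPARK_SUBMIT_ARGS before the SparkContext starts is one common way to get a jar onto the classpath when %AddJar is not available.

```python
import os

# Hypothetical jar location; substitute the real path of the ES-Hadoop jar.
ES_HADOOP_JAR = "/path/to/elasticsearch-hadoop.jar"


def pyspark_submit_args(jar_path):
    """Build a PYSPARK_SUBMIT_ARGS value that adds the jar to Spark's classpath."""
    return "--jars {} pyspark-shell".format(jar_path)


def es_read_conf(resource, nodes="localhost", port="9200"):
    """ES-Hadoop settings; Hadoop configuration values are passed as strings."""
    return {"es.resource": resource, "es.nodes": nodes, "es.port": str(port)}


def read_from_es(sc, conf):
    """Read documents as an RDD of (key, document) pairs via the Map/Reduce InputFormat."""
    return sc.newAPIHadoopRDD(
        inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
        keyClass="org.apache.hadoop.io.NullWritable",
        valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
        conf=conf,
    )


# Must be set before the SparkContext (and its JVM) is started.
os.environ["PYSPARK_SUBMIT_ARGS"] = pyspark_submit_args(ES_HADOOP_JAR)
```

In a notebook cell this would be followed by creating the context (from pyspark import SparkContext; sc = SparkContext()) and calling read_from_es(sc, es_read_conf("index/type")). Whether this works as written depends on the Spark and ES-Hadoop versions in use.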

michele_crudele · November 27, 2015, 4:41pm

Thanks Costin, I'll try the Map/Reduce layer.
What are the benefits of using it compared with using the elasticsearch-py Python library directly in my notebooks?

costin · December 8, 2015, 2:04pm

The docs cover this aspect as well.
