ODBC connection to ES

joris_renkens · November 27, 2015, 9:50am

Hi,

I'm trying to get an ODBC connection to ES which pushes down as much as possible to ES. I'm trying to use the spark-sql thrift-server and the es-hadoop spark-sql connection but I'm having a hard time getting it to work.

How crazy is this idea?
Is it possible to get this to work?
Are there better ways to get an ODBC connection to ES?

Thanks,
Joris

costin · December 8, 2015, 1:50pm

There will be a lot of wrapping and multiple layers that each call to ES will have to go through.

ES doesn't offer any ODBC support per se so using something like Spark, would give you this. Does it work? In theory yes, in practice probably especially since thrift-server is involved.

1111 · December 22, 2015, 6:32am

Can you do an ODBC dirver for ES like MongoDB does.

It can me simple one: odbc driver take in query json request, and gives out json -> table
where fields of table are json fields with dots if nested

field1 | field2.nested1 | field2.nested2 | field3 etc

if field in document is empty it replaces with null in table

?

ebuildy · December 29, 2015, 8:22pm

I am using this with Tableau, I had to "flat" my data (remove nested JSON), then with Spark HiveThriftServer2 I can connect from Tableau. It's not crazy and very easy to setup.

But it can be a bit slow, because Spark has to fetch all data from ES via HTTP before processing the SQL query. Be sure that Spark workers are located close to ES nodes, also check "CACHE TABLE" from SparkSQL.

Another way is to export your ES data to a big JSON file, and load this file with Spark. This is the best way to play with, no nested JSON issue, just pleasure

Don't forget to add the elastichsearch-hadoop JAR in the job class path, here how I am running the hive server:

/usr/local/spark/bin/spark-submit  --class                  org.apache.spark.sql.hive.thriftserver.HiveThriftServer2     \
				   --driver-class-path     /opt/jars/*
			           --master                spark://localhost:7077

elasticsearch-hadoop.jar is located in /opt/jars in my install.

Topic		Replies	Views
ODBC + Hive + Elasticsearch doesn't seem to work Elasticsearch es-hadoop	12	1895	July 6, 2017
Hive overwhelming Elasticsearch Elasticsearch es-hadoop	24	1433	May 18, 2021
Multiple ES clusters in SparkSQL Elasticsearch es-hadoop	9	2876	July 6, 2017
Elasticsearch nested objects and parent child relationship Elasticsearch es-hadoop	2	1386	July 6, 2017
Connection error when selecting from ES table in Hive Elasticsearch es-hadoop	4	1304	July 11, 2017

ODBC connection to ES

Related topics