ES-HIVE where predicates not being utilized

Cris_Favero · May 20, 2017, 12:02am

I am trying to query Elasticsearch through Hive, primarily to take advantage of its fast full-text search.

I have created the table

CREATE EXTERNAL TABLE scorecard (
  item_id BIGINT,
  description STRING,
  description_mod1 STRING,
  price_mod1 DOUBLE,
  price_mod2 DOUBLE
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource' = 'fulltextsearch/scorecard',
  'es.nodes' = 'elasticsearch',
  'es.net.http.auth.user' = 'user',
  'es.net.http.auth.pass' = 'pass'
);

For reference, my index mapping was created with the following JSON

{
    "mappings": {
        "scorecard": {
            "properties": {
                "item_id": {
                    "type": "long"
                },
                "description": {
                    "type": "text"
                },
                "description_mod1": {
                    "type": "text"
                },
                "price_mod1": {
                    "type": "double"
                },
                "price_mod2": {
                    "type": "double"
                }
            }
        }
    }
}

I have successfully loaded about a million rows into the index. My issue comes when I try to query rows selectively. A query such as select * from scorecard limit 5 returns immediately, as expected. However, even a simple where query such as select * from scorecard where item_id ='123' or select * from scorecard where description = 'exact text' appears, according to the Hive debug log, to pull the full data set into Hive first, rather than transcribing the query parameters to es-hadoop as I expected.

Am I missing something from the configuration to force the predicates to be transcribed?

james.baiera · May 20, 2017, 6:48pm

ES-Hadoop does not currently support transcribing Hive "WHERE" predicates into query DSL and pushing them down to the server. Right now the only pushdown support we have is in the SparkSQL integration, as it is the only one that exposes adequate API hooks for capturing and converting predicates.
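
One possible workaround, offered as a sketch rather than a confirmed fix for this case: ES-Hadoop has an es.query setting that can be placed in TBLPROPERTIES, so the filter runs on the Elasticsearch side before any rows reach Hive. The table name scorecard_exact_text below is hypothetical, and the match query is only an assumption about the kind of full-text search wanted here. The filter is fixed when the table is defined; it is not derived from the WHERE clause of later queries.

```sql
-- Hypothetical second table over the same index, with a fixed server-side filter.
-- 'es.query' is an ES-Hadoop setting; the table name and the query body are
-- illustrative assumptions, not taken from the original post.
CREATE EXTERNAL TABLE scorecard_exact_text (
  item_id BIGINT,
  description STRING,
  description_mod1 STRING,
  price_mod1 DOUBLE,
  price_mod2 DOUBLE
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource' = 'fulltextsearch/scorecard',
  'es.nodes' = 'elasticsearch',
  'es.net.http.auth.user' = 'user',
  'es.net.http.auth.pass' = 'pass',
  -- Full-text match evaluated by Elasticsearch, not by Hive
  'es.query' = '{ "query": { "match": { "description": "exact text" } } }'
);

-- Hive then only receives the documents that matched the query above.
SELECT * FROM scorecard_exact_text LIMIT 5;
```

The trade-off is that each distinct filter needs its own table definition, so this suits a small number of known queries better than ad hoc WHERE clauses.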

