ES-HIVE where predicates not being utilized

(Cris Favero) #1

I am trying to connect to Elasticsearch through Hive, primarily to take advantage of its fast full-text search.

I have created the table

  item_id BIGINT,
  description STRING,
  description_mod1 STRING,
  price_mod1 DOUBLE,
  price_mod2 DOUBLE
 STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES( 'es.resource' = 'fulltextsearch/scorecard',

For reference, my index mapping was created with the following JSON

    "mappings": {
        "scorecard": {
            "properties": {
                "item_id": {
                    "type": "long"
                },
                "description": {
                    "type": "text"
                },
                "description_mod1": {
                    "type": "text"
                },
                "price_mod1": {
                    "type": "double"
                },
                "price_mod2": {
                    "type": "double"
                }
            }
        }
    }

I have successfully loaded about a million rows into the index. My issue is with selectively querying rows. A query such as select * from scorecard limit 5 returns results immediately, as expected. However, even a simple where query such as select * from scorecard where item_id ='123' or select * from scorecard where description = 'exact text' appears, according to the Hive debug log, to pull the full data set into Hive first, rather than transcribing the query parameters to es-hadoop as I expected.

Am I missing something from the configuration to force the predicates to be transcribed?

(James Baiera) #2

ES-Hadoop does not currently support transcribing Hive "WHERE" predicates into Query DSL and pushing them down to the server. Right now the only pushdown support we have is in the Spark SQL integration, as it is the only one that exposes adequate API hooks for capturing and converting predicates.
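
To make "transcribing a predicate into Query DSL" concrete, here is a minimal sketch in Python of what such a translation would produce for the two queries in the question. It is not ES-Hadoop code: the helper name equality_to_query_dsl and its analyzed flag are hypothetical, and mapping an equality test on a text field to match_phrase is only an assumed closest analogue, since analyzed text fields have no exact-equality query.

```python
import json


def equality_to_query_dsl(field, value, analyzed=False):
    """Translate `field = value` into an Elasticsearch query body.

    Hypothetical helper for illustration only (not part of ES-Hadoop).
    analyzed=False -> `term` query, for numeric or keyword fields
    analyzed=True  -> `match_phrase` query, for analyzed `text` fields
    """
    clause = "match_phrase" if analyzed else "term"
    return {"query": {clause: {field: value}}}


# WHERE item_id = 123  (item_id is mapped as `long`)
item_query = equality_to_query_dsl("item_id", 123)

# WHERE description = 'exact text'  (description is mapped as `text`)
desc_query = equality_to_query_dsl("description", "exact text", analyzed=True)

print(json.dumps(item_query))
print(json.dumps(desc_query))
```

Since the Hive integration does not generate these bodies from a WHERE clause, one possible workaround is to fix the filter when the table is defined: ES-Hadoop documents an es.query setting that accepts a URI query or a Query DSL body, and it can be set in TBLPROPERTIES alongside es.resource. Whether a static, per-table query suits this use case is an assumption to verify.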
