Does ElasticSearch for Hadoop work with highlighting?

drippel · May 19, 2015, 12:38pm

We observed that EsRDD returns only the _source field instead of the whole hit, so the highlight field is missing from the result.

costin · May 20, 2015, 9:09am

Can you explain what it is that you are seeing and what you would like to see? Currently only the source is used since that's the data that can be both read/written from/to Spark.
Metadata can be returned as well - potentially we can expand this to include additional fields as well.
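
As a rough illustration of the metadata option mentioned above: the connector has an `es.read.metadata` setting, with `es.read.metadata.field` naming the field the metadata is placed under. The sketch below only builds such a configuration as a plain dict. The helper name `es_read_conf` and the resource `my-index/my-type` are hypothetical placeholders, and nothing in this thread confirms that the `highlight` section is included in the returned metadata.

```python
def es_read_conf(resource, read_metadata=True):
    """Build a connector configuration dict; all values are passed as strings."""
    return {
        # index/type to read from (placeholder value supplied by the caller)
        "es.resource": resource,
        # ask the connector to return hit metadata alongside _source
        "es.read.metadata": str(read_metadata).lower(),
        # name of the field under which the metadata is exposed
        "es.read.metadata.field": "_metadata",
    }


conf = es_read_conf("my-index/my-type")
for key in sorted(conf):
    print(key, "=", conf[key])
```

A dict like this would typically be handed to the Spark job as its Elasticsearch configuration; how it is passed depends on the API being used.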

drippel · May 20, 2015, 10:09am

Hi Costin,

Thank you for answering (BTW, I spoke with you at Elastic[ON]15 after your lecture).

What we are trying to do here is to run a query with a highlighting request.

When we run the following request in Elastic Head:

{
  "from" : 0,
  "size" : 50,
  "query" : {
    "bool" : {
      "must" : {
        "match" : {
          "_all" : {
            "query" : "asia",
            "type" : "boolean"
          }
        }
      },
      "must_not" : {
        "match" : {
          "_all" : {
            "query" : "goat",
            "type" : "boolean"
          }
        }
      }
    }
  },
  "post_filter" : {
    "term" : {
      "_type" : "Document"
    }
  },
  "highlight" : {
    "pre_tags" : [ "<b>" ],
    "post_tags" : [ "</b>" ],
    "fragment_size" : 0,
    "number_of_fragments" : 0,
    "fields" : {
      "*" : { }
    }
  }
}

We get:

{
  "took" : 37,
  "timed_out" : false,
  "_shards" : {
    "total" : 165,
    "successful" : 165,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.73102,
    "hits" : [ {
      "_index" : "fts-english",
      "_type" : "Document",
      "_id" : "id4",
      "_score" : 0.73102,
      "_source":{"_analyzer":"english","streamId":3,"postDate":"2013-01-30","language":"English","message":"Mongolians migrated from Mid-Asia to the asian shores around 15,000 years ago.","user":"American"},
      "highlight" : {
        "message" : [ "Mongolians migrated from Mid-<b>Asia</b> to the asian shores around 15,000 years ago." ]
      }
    }, {
      "_index" : "fts-english",
      "_type" : "Document",
      "_id" : "id2",
      "_score" : 0.6265886,
      "_source":{"_analyzer":"english","streamId":1,"postDate":"2013-01-30","language":"English","message":"Paleoindians migrated from Asia to what is now the helloworld@gmail.com mainland around 15,000 years ago.","user":"me"},
      "highlight" : {
        "message" : [ "Paleoindians migrated from <b>Asia</b> to what is now the helloworld@gmail.com mainland around 15,000 years ago." ]
      }
    }, {
      "_index" : "fts-english",
      "_type" : "Document",
      "_id" : "id8",
      "_score" : 0.6265886,
      "_source":{"_analyzer":"english","streamId":1,"postDate":"2013-01-30","language":"English","message":"Indians migrated from Asia to North America long time ago. Many years before Columbus.","user":"me"},
      "highlight" : {
        "message" : [ "Indians migrated from <b>Asia</b> to North America long time ago. Many years before Columbus." ]
      }
    } ]
  }
}

This includes the highlight section, in which the matching words from the original source text are wrapped in bold <b> </b> HTML tags.
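
To make the shape of that response concrete: in each hit, "highlight" is a sibling of "_source", not part of it, so anything that keeps only "_source" drops it. Below is a minimal Python sketch of combining the two per hit when the raw search response is available. The function name `merge_highlights` is hypothetical and is not part of es-hadoop; the sample is an abbreviated copy of the first hit above.

```python
def merge_highlights(response):
    """Return one dict per hit: the _source fields plus '_id' and 'highlight'."""
    merged = []
    for hit in response.get("hits", {}).get("hits", []):
        doc = dict(hit.get("_source", {}))  # copy so the input is not mutated
        doc["_id"] = hit.get("_id")
        # "highlight" is absent when no field of the hit was highlighted
        doc["highlight"] = hit.get("highlight", {})
        merged.append(doc)
    return merged


# Abbreviated version of the first hit from the response above
sample = {
    "hits": {
        "hits": [
            {
                "_id": "id4",
                "_source": {
                    "message": "Mongolians migrated from Mid-Asia to the asian shores around 15,000 years ago.",
                    "user": "American",
                },
                "highlight": {
                    "message": [
                        "Mongolians migrated from Mid-<b>Asia</b> to the asian shores around 15,000 years ago."
                    ]
                },
            }
        ]
    }
}

for doc in merge_highlights(sample):
    print(doc["_id"], doc["highlight"]["message"][0])
```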

But when we submit the same query through Elasticsearch for Hadoop, the returned EsRDD contains only the "_source" section and not the "highlight" section for each hit.

We hoped that there would be a configurable option to get the whole result including the highlighting field.

Thank you for your help!

Doron

drippel · August 8, 2016, 8:56am

This issue is still not solved, even in later versions. This is very basic functionality. I wonder why nobody else is complaining: is nobody using es-hadoop with Spark?
