Does Elasticsearch for Hadoop work with highlighting?

Hi Costin,

Thank you for answering (BTW, I spoke with you at Elastic[ON]15 after your lecture :) ).

We are trying to run a query that includes a highlight request.

When we run the following request in Elastic Head:


{
  "from" : 0,
  "size" : 50,
  "query" : {
    "bool" : {
      "must" : {
        "match" : {
          "_all" : {
            "query" : "asia",
            "type" : "boolean"
          }
        }
      },
      "must_not" : {
        "match" : {
          "_all" : {
            "query" : "goat",
            "type" : "boolean"
          }
        }
      }
    }
  },
  "post_filter" : {
    "term" : {
      "_type" : "Document"
    }
  },
  "highlight" : {
    "pre_tags" : [ "<b>" ],
    "post_tags" : [ "</b>" ],
    "fragment_size" : 0,
    "number_of_fragments" : 0,
    "fields" : {
      "*" : { }
    }
  }
}

We get:


{
  "took" : 37,
  "timed_out" : false,
  "_shards" : {
    "total" : 165,
    "successful" : 165,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.73102,
    "hits" : [ {
      "_index" : "fts-english",
      "_type" : "Document",
      "_id" : "id4",
      "_score" : 0.73102,
      "_source":{"_analyzer":"english","streamId":3,"postDate":"2013-01-30","language":"English","message":"Mongolians migrated from Mid-Asia to the asian shores around 15,000 years ago.","user":"American"},
      "highlight" : {
        "message" : [ "Mongolians migrated from Mid-<b>Asia</b> to the asian shores around 15,000 years ago." ]
      }
    }, {
      "_index" : "fts-english",
      "_type" : "Document",
      "_id" : "id2",
      "_score" : 0.6265886,
      "_source":{"_analyzer":"english","streamId":1,"postDate":"2013-01-30","language":"English","message":"Paleoindians migrated from Asia to what is now the helloworld@gmail.com mainland around 15,000 years ago.","user":"me"},
      "highlight" : {
        "message" : [ "Paleoindians migrated from <b>Asia</b> to what is now the helloworld@gmail.com mainland around 15,000 years ago." ]
      }
    }, {
      "_index" : "fts-english",
      "_type" : "Document",
      "_id" : "id8",
      "_score" : 0.6265886,
      "_source":{"_analyzer":"english","streamId":1,"postDate":"2013-01-30","language":"English","message":"Indians migrated from Asia to North America long time ago. Many years before Columbus.","user":"me"},
      "highlight" : {
        "message" : [ "Indians migrated from <b>Asia</b> to North America long time ago. Many years before Columbus." ]
      }
    } ]
  }
}

The response includes a "highlight" section for each hit, in which the matched words from the original source text are wrapped in bold <b> </b> HTML tags.

However, when we submit the same query through Elasticsearch for Hadoop, the returned EsRDD contains only the "_source" section of each hit and not its "highlight" section.

We were hoping there is a configuration option that returns the whole hit, including the "highlight" field.
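
To make the goal concrete, here is a minimal sketch in plain Python (not es-hadoop code) of the per-hit shape we would like to get back. It only parses a trimmed copy of the response above; the helper name `hits_with_highlight` is made up for illustration, and it assumes a hit without highlighting should simply yield an empty map.

```python
import json

# Trimmed copy of the search response shown above (a single hit, id8).
response = json.loads("""
{
  "hits": {
    "total": 1,
    "hits": [ {
      "_index": "fts-english",
      "_type": "Document",
      "_id": "id8",
      "_source": {
        "message": "Indians migrated from Asia to North America long time ago. Many years before Columbus.",
        "user": "me"
      },
      "highlight": {
        "message": [ "Indians migrated from <b>Asia</b> to North America long time ago. Many years before Columbus." ]
      }
    } ]
  }
}
""")


def hits_with_highlight(resp):
    """Hypothetical helper: pair each hit's _source with its highlight section.

    Returns a list of (id, source, highlight) tuples; highlight is an empty
    dict when the hit carries no "highlight" section.
    """
    return [
        (hit["_id"], hit["_source"], hit.get("highlight", {}))
        for hit in resp["hits"]["hits"]
    ]


rows = hits_with_highlight(response)
for doc_id, source, highlight in rows:
    print(doc_id, source["user"], highlight.get("message", []))
```

In other words, for every document in the RDD we would like both the source map and the highlight map, rather than the source map alone.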

Thank you for your help!

Doron