Highlight when query contains fields and text - ES 1.6.0


(Effi) #1

Hello,

I'm using elasticsearch 1.6.0.
Assume the following document is indexed in elasticsearch:

{
    "id" : 2,
    "user" : "jack",
    "message" : "I'm jack",
    "comment" : "my first tweet"
}

When I run this query:

{
  "from" : 0,
  "size" : 5,
  "query" : {
    "query_string" : {
      "query" : "user:jack first",
      "use_dis_max" : true
    }
  },
  "highlight" : {
    "pre_tags" : [ "<span class=\"mark\">" ],
    "post_tags" : [ "</span>" ],
    "order" : "score",
    "encoder" : "html",
    "require_field_match" : false,
    "fields" : {    
      "*" : {}
    }
  }
}

I'm using the query string query.
As you can see, my query combines a field match (user:jack) and free text (first).
As a result, elasticsearch highlights the word jack in both the user field and the message field.
But since I query for user:jack, I want jack highlighted only in the user field, not in other fields.
I can't set require_field_match to true, because then elasticsearch will not highlight 'first'.

What should I do?

Thanks in advance,
Effi


(Nik Everett) #2

The simplest thing is to set require_field_match to true and to set the
default fields parameter on the query_string to "message, comment".


(Effi) #3

Unfortunately that doesn't work :frowning:
It will not highlight the word 'first' (as I wrote in the question:
I can't set require_field_match to true, because then elasticsearch will not highlight 'first').


(Nik Everett) #4

Sure it will - but only if you explicitly search that field instead of the
_all field. Do query_string: {fields: ["comment", "message"], query: "your
query"} and it should work. I've totally done it before but I have to admit
I haven't tested it this morning.
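
For reference, a sketch of the request this suggestion seems to describe, reusing the query and highlight settings from the first post. It is untested here. The assumptions are that fields is given as a JSON array and that user:jack still targets only the user field because it carries an explicit field prefix.

```json
{
  "from": 0,
  "size": 5,
  "query": {
    "query_string": {
      "fields": ["message", "comment"],
      "query": "user:jack first",
      "use_dis_max": true
    }
  },
  "highlight": {
    "pre_tags": ["<span class=\"mark\">"],
    "post_tags": ["</span>"],
    "order": "score",
    "encoder": "html",
    "require_field_match": true,
    "fields": {
      "*": {}
    }
  }
}
```

With this, first is searched in message and comment rather than in _all, so the highlighter has a concrete field to match it against, while jack should only be highlighted in user.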


(Effi) #5

Wooow, it's working. THANKS! :smiley:

I don't understand why I have to specify the fields at the search level (and not in the highlight section) to get the text highlighted.

Another question, let's say my document looks like this:

{ 
  "type": "tweet",
  "fields": {
         "user" : "jack",
         "message" : "I'm jack",
         "comment" : "my first tweet"
         }
}

Can I search on fields.* instead of listing specific fields, as in {fields: ["comment", "message"], query: "your query"}?
I understand that it doesn't work on _all, but I see in the elasticsearch documentation that I can use a wildcard. They give the following example:

{
    "query_string" : {
        "fields" : ["city.*"],
        "query" : "this AND that OR thus",
        "use_dis_max" : true
    }
}

For me it failed with a SearchPhaseExecutionException :worried:
Any suggestions?

Thanks,
Effi
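
One possible fallback while the wildcard form fails, sketched under assumptions rather than tested on 1.6.0: list the full dotted paths of the inner fields explicitly. The paths fields.message, fields.comment and fields.user assume the document layout shown above, where the inner object is literally named fields.

```json
{
  "query": {
    "query_string": {
      "fields": ["fields.message", "fields.comment"],
      "query": "fields.user:jack first",
      "use_dis_max": true
    }
  }
}
```

This does not explain the SearchPhaseExecutionException; it only sidesteps the wildcard.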


(Effi) #6

Another issue arises when I add a non-string field to the query_string fields, for example a field named times of type long:

{
  "from" : 0,
  "size" : 5,
  "query" : {
    "query_string" : {
      "fields" : ["user","message","comment","times"],
      "query" : "first times:>1",
      "use_dis_max" : true
    }
  },
  "highlight" : {
    "pre_tags" : [ "<span class=\"mark\">" ],
    "post_tags" : [ "</span>" ],
    "order" : "score",
    "encoder" : "html",
    "require_field_match" : false,
    "fields" : {    
      "*" : {}
    }
  }
}

The example data looks like this:

{ 
  "type": "tweet",
  "fields": {
         "user" : "jack",
         "message" : "I'm jack",
         "comment" : "my first tweet",
         "times" : "7"
         }
}

This raises a NumberFormatException.
I understand from the elasticsearch docs on "Multi Field" that running the query_string query against multiple fields expands each query term into an OR clause like this:
field1:query_term OR field2:query_term | ...

In this example, times:first will fail, because times is mapped as long.
Does that mean that I can't highlight fields of type long (or any type other than string)?

Thanks,
Effi


(Nik Everett) #7

For the most part that is true, yes. Some of the highlighters make an effort to stringify the non-string data first and then do string highlighting against it, but that is lame and haphazard. It's not something that's particularly well thought out, frankly.
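
On the NumberFormatException itself, a sketch of one way around it, assuming the multi-field expansion described above is what triggers it: leave the long field out of fields and reach it only through its explicit prefix in the query text. Field names are taken from the earlier posts, and this has not been run against 1.6.0.

```json
{
  "from": 0,
  "size": 5,
  "query": {
    "query_string": {
      "fields": ["user", "message", "comment"],
      "query": "first times:>1",
      "use_dis_max": true
    }
  },
  "highlight": {
    "pre_tags": ["<span class=\"mark\">"],
    "post_tags": ["</span>"],
    "order": "score",
    "encoder": "html",
    "require_field_match": false,
    "fields": {
      "*": {}
    }
  }
}
```

Here first is expanded only over the three string fields, so no times:first clause should be generated, while times:>1 still filters on the numeric field. The query_string option lenient, which is documented as ignoring format-based failures such as text sent to a numeric field, may be another route.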

