Highlighting the wrong value


(Oleg Ievtushok) #1

Hi everyone, I have the following scenario, where the wrong value is
highlighted:

#!/bin/bash
curl -XDELETE 'http://localhost:9200/highlight'
curl -XPOST 'http://localhost:9200/highlight'
curl -XPUT -d '{
"doc" : {
"_source" : { "enabled" : false },
"_all" : { "enabled" : true },
"_timestamp" : { "enabled" : true, "store" : "yes" },
"properties" : {
"country" : {
"type" : "string",
"store" : "yes"
},
"file" : {
"type" : "string",
"store" : "yes"
}
}
}
}' http://localhost:9200/highlight/doc/_mapping
curl -XPOST -d '{ "country": "US and Canada", "file": "some keyword here and there" }' http://localhost:9200/highlight/doc
curl -XPOST -d '{ "country": "US and Italy", "file": "you should not highlight Canada" }' http://localhost:9200/highlight/doc
curl -XPOST -d '{ "country": "Brazil and France", "file": "whatever" }' http://localhost:9200/highlight/doc
curl -XPOST http://localhost:9200/highlight/_refresh
curl -XGET http://localhost:9200/highlight/_search?pretty=true -d '{
"query" : {
"bool" : {
"must" : [ {
"field" : {
"country" : "US Canada"
}
}, {
"field" : {
"file" : "keyword highlight"
}
} ]
}
},
"highlight" : {
"fields" : {
"file" : {
}
}
}
}'

I was not expecting "Canada" to be highlighted in the search result:
"highlight" : {
"file" : [ "you should not highlight Canada" ]
}

Is there any way to highlight only the word "highlight" in this case?

--


(David Pilato) #2

IMHO, the highlighting process is applied at the end of the search process.
That said, I suppose that the highlighter checks for terms in the file field, where the terms are the full list of terms (analyzed or not) used in the query. (That's only a supposition, as I have not looked at the highlight code yet.)

I'm curious to hear what the Elasticsearch gurus have to say about this. Is it doable? Should Oleg open an issue for it?

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

(In reply to Oleg Ievtushok's post of 23 August 2012, 17:02.)

--


(phill) #3

http://www.elasticsearch.org/guide/reference/api/search/highlighting.html

"require_field_match can be set to true which will cause a field to
be highlighted only if a query matched that field. false means that
terms are highlighted on all requested fields regardless if the query
matches specifically on them."
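
To make the quoted option concrete, here is a minimal, untested sketch of the search from post #1 with require_field_match switched on. The node address and index name are taken from that post. The ES_URL variable and the search_file_highlights helper are names made up for this illustration, and placing the option directly under "highlight" is an assumption based on the documentation page linked above.

```shell
#!/bin/bash
# Sketch only: the search from post #1 with require_field_match enabled.
# Assumption: same local node and "highlight" index as in post #1.
# ES_URL is a hypothetical override, not something Elasticsearch reads.
ES_URL="${ES_URL:-http://localhost:9200}"

# Request body: the original bool query, plus "require_field_match": true
# in the highlight section.
QUERY='{
  "query" : {
    "bool" : {
      "must" : [
        { "field" : { "country" : "US Canada" } },
        { "field" : { "file" : "keyword highlight" } }
      ]
    }
  },
  "highlight" : {
    "require_field_match" : true,
    "fields" : { "file" : {} }
  }
}'

# Hypothetical helper: sends the body to the _search endpoint of the index.
search_file_highlights() {
  curl -s -XGET "$ES_URL/highlight/_search?pretty=true" -d "$QUERY"
}
```

Calling search_file_highlights after the indexing steps from post #1 should, if the option behaves as the documentation describes, highlight in the file field only the terms that were queried against file, so "Canada" would no longer be marked.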

(In reply to David Pilato's post of 8/23/2012, 10:53 AM.)

--


(David Pilato) #4

Thanks! Good to know!

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

(In reply to P. Hill's post of 29 August 2012, 19:12.)

--


(phill) #5

But don't put me in the league of ES gurus, please. I recalled seeing
that info when I was reading about ES, and recognized it as something
that triggers the behavior down in the Lucene HitHighlighter(s), with
which I have had a love/hate relationship over the last year while
working directly with Lucene.

-Paul

(In reply to David Pilato's post of 8/29/2012, 10:33 AM.)

--

