Strange highlight result, can anyone explain it?


(Ivan Ji) #1

Hi all,

I am using the highlight function in ES 1.0.1 and ran into a very strange
situation:

I want to highlight the fields "group.*", which sit inside an object whose
mapping is:

{"properties": {
    "group": {
        "type": "object",
        "dynamic": false,
        "include_in_all": true,
        "properties": {
            "data": {"type": "string", "index": "analyzed",
                     "analyzer": "name_analyzer",
                     "term_vector": "with_positions_offsets",
                     "fielddata": {"format": "disabled"}},
            "data_2": {"type": "string", "index": "analyzed",
                       "analyzer": "nickname_analyzer",
                       "term_vector": "with_positions_offsets",
                       "fielddata": {"format": "disabled"}},
            ....(skip)
        }
    }
}
}

Each field inside "group" uses a different analyzer.

I queried the word "Adobe" and got the following result:

"highlight": {
    "group.data": [
        "I want it all, and I want it now 106\n\nUsing the Adobe ActionScript 3 SDK for Facebook platform 106\nTime",
        "– obtaining data in pages 126\nTime for action – adding limit and offset to GraphRequest instances 128",
        "requesting data based on date 131\nTime for action – adding since and until to GraphRequest instances 133\nTime"
    ]
}

by using the highlight command:

{'highlight': {'fields': {'group.*': {'fragment_size': 100,
                                      'number_of_fragments': 3}}}}

As you can see, it highlights the word "adding". I cannot understand why this
happens. I am pretty sure the analyzer of the "group.data" field cannot
normalize "Adobe" and "adding" to the same form.
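
One way to check that claim would be to ask the analyze API how the field's analyzer tokenizes each word. The sketch below only builds the request URLs; the host, the index name "my_index", and the helper name analyze_url are assumptions for illustration and do not come from my actual setup.

```python
from urllib.parse import urlencode


def analyze_url(host, index, field, text):
    """Build an _analyze URL that runs `text` through the analyzer of `field`."""
    query = urlencode({"field": field, "text": text})
    return f"{host}/{index}/_analyze?{query}"


# Hypothetical host and index name; "group.data" is the field from the mapping.
urls = [
    analyze_url("http://localhost:9200", "my_index", "group.data", word)
    for word in ("Adobe", "adding")
]

for url in urls:
    print(url)
```

Fetching each URL and comparing the returned tokens would show whether the two words share any token under that analyzer.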

Any explanations? Please help me to understand what happened.

Thanks.

Ivan

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a047e9e5-410a-471d-9b10-cd03b0669197%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Ivan Ji) #2

The query I used is:

{'multi_match': {'fields': ['_all', 'name', 'group.*'],
                 'operator': 'and',
                 'query': 'Adobe'}}

I suspect the problem is caused by the different analyzers of the "group.*"
fields.

Which analyzer is used during highlighting?

Ivan
In reply to Ivan Ji's post of Tuesday, August 5, 2014, 7:58:42 PM UTC+8.



(Ivan Ji) #3

For anyone facing the same problem: unless you set require_field_match to
true, highlighting uses the analyzers of all the fields in the query, not
just the analyzer of the field being highlighted.

FYI.
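
For illustration, here is a minimal sketch of the search body with require_field_match turned on, reusing the query and highlight settings from earlier in this thread. The helper name build_search_body is hypothetical, and putting require_field_match at the top level of the highlight section (rather than per field) is just one way to apply it.

```python
import json


def build_search_body(query_text):
    """Build a search body whose highlighting only uses matches on the same field."""
    return {
        "query": {
            "multi_match": {
                "fields": ["_all", "name", "group.*"],
                "operator": "and",
                "query": query_text,
            }
        },
        "highlight": {
            # Highlight a field only where the query matched that same field.
            "require_field_match": True,
            "fields": {
                "group.*": {"fragment_size": 100, "number_of_fragments": 3}
            },
        },
    }


body = build_search_body("Adobe")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request.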

In reply to Ivan Ji's post of Tuesday, August 5, 2014, 8:08:31 PM UTC+8.


