What analyzer does query_string use for highlighting?

Weiwei_Wang · November 30, 2011, 2:14pm

I have multiple fields to search, each with a different
search_analyzer. When highlighting, I found that the fragments are
not what I expected.

For example, I have two fields, name and phone, and two analyzers
defined in my elasticsearch.json:

"analysis" : {
    "analyzer" : {
        "nGramAnalyzer" : {
            "type" : "custom",
            "tokenizer" : "standard",
            "filter" : ["standard", "lowercase", "englishSnowball", "nGramFilter"]
        },
        "standardAnalyzer" : {
            "type" : "custom",
            "tokenizer" : "standard",
            "filter" : ["standard", "lowercase", "englishSnowball"]
        }
    },
    "filter" : {
        "nGramFilter" : {
            "type" : "nGram",
            "min_gram" : 1,
            "max_gram" : 64
        },
        "edgeNGramFilter" : {
            "type" : "edgeNGram",
            "min_gram" : 1,
            "max_gram" : 64,
            "side" : "front"
        },
        "englishSnowball" : {
            "type" : "snowball",
            "language" : "English"
        }
    }
}

The mappings for the fields are:

"phone" : {
    "type" : "string",
    "index" : "analyzed",
    "index_analyzer" : "nGramAnalyzer",
    "search_analyzer" : "nGramAnalyzer",
    "store" : "yes",
    "term_vector" : "with_positions_offsets"
},
"phone" : {
    "type" : "string",
    "index" : "analyzed",
    "index_analyzer" : "nGramAnalyzer",
    "search_analyzer" : "standardAnalyzer",
    "store" : "yes",
    "term_vector" : "with_positions_offsets"
}

When I run a query_string query like the one below:

curl '10.18.102.101:9201/pim/contact/_search?pretty=true' -d '{
    "from" : 0,
    "size" : 2,
    "query" : {
        "query_string" : {
            "query" : "18600",
            "fields" : ["name^5.0", "phone^5.0"],
            "default_operator" : "or",
            "allow_leading_wildcard" : false,
            "analyze_wildcard" : true
        }
    },
    "filter" : {"bool" : {"must" : {"term" : {"deleted" : 0}}}},
    "explain" : false,
    "fields" : ["name", "phone"],
    "highlight" : {
        "pre_tags" : [""],
        "post_tags" : [""],
        "fields" : {"name" : {}, "phone" : {}}
    }
}'
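
For readability, here is a rough sketch of the same request body written as a Python dict and serialized with the standard json module. It only builds and prints the JSON; nothing is sent to a server, and the variable name body is just an illustration. The empty pre_tags and post_tags strings are kept exactly as they appear in the request above.

```python
import json

# Request body from the curl command, spelled out as a nested dict.
body = {
    "from": 0,
    "size": 2,
    "query": {
        "query_string": {
            "query": "18600",
            "fields": ["name^5.0", "phone^5.0"],
            "default_operator": "or",
            "allow_leading_wildcard": False,
            "analyze_wildcard": True,
        }
    },
    # Top-level filter restricting hits to documents with deleted == 0.
    "filter": {"bool": {"must": {"term": {"deleted": 0}}}},
    "explain": False,
    "fields": ["name", "phone"],
    "highlight": {
        "pre_tags": [""],
        "post_tags": [""],
        "fields": {"name": {}, "phone": {}},
    },
}

# Pretty-print the JSON that would be passed to curl with -d.
print(json.dumps(body, indent=2))
```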

The highlight is shown as:
18600044220

It seems the highlighter uses the nGramAnalyzer for highlighting, but
I expected it to use each field's own search_analyzer.

Can anyone help me with this problem?

Elasticsearch version: 0.18.4

Goog_Jobs · December 1, 2011, 10:54am

Two mappings for the same field? Lucene demands that the index_analyzer
be the same as the search_analyzer. Hope this helps.

medcl_net · December 2, 2011, 10:33am

Because the term positions and offsets are generated and stored during
indexing, not during searching.
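
A toy sketch of that idea in Python, not Lucene's actual implementation: it assumes each indexed n-gram is stored with its own character offsets (real offsets may differ between Lucene versions), and that the highlighter simply looks up the stored offsets of every indexed term that equals a query term. The function names ngrams_with_offsets and highlighted_spans are made up for this illustration.

```python
def ngrams_with_offsets(token, min_gram=1, max_gram=64):
    """Return (gram, start, end) for every n-gram of token (index-time work)."""
    grams = []
    for size in range(min_gram, min(max_gram, len(token)) + 1):
        for start in range(len(token) - size + 1):
            grams.append((token[start:start + size], start, start + size))
    return grams


def highlighted_spans(indexed, query_terms):
    """Merge the stored offsets of every indexed gram that equals a query term."""
    hits = sorted((s, e) for g, s, e in indexed if g in query_terms)
    merged = []
    for s, e in hits:
        if merged and s <= merged[-1][1]:
            # Overlapping or touching span: extend the previous one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


# Offsets are fixed once, when the value is indexed with the n-gram analyzer.
indexed = ngrams_with_offsets("18600044220")

# Query analyzed without n-grams: a single term.
standard_query = {"18600"}
# Query analyzed with n-grams: every sub-gram of "18600" becomes a term.
ngram_query = {g for g, _, _ in ngrams_with_offsets("18600")}

print(highlighted_spans(indexed, standard_query))
print(highlighted_spans(indexed, ngram_query))
```

In this toy model the single-term query maps to one span covering the first five characters, while the n-gram query also picks up every other stored gram such as a lone "0", so the highlighted region grows beyond the text that was typed.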

Weiwei_Wang · December 7, 2011, 5:54am

Thanks, but when I disable term_vector, everything works fine.
