edgeNGram weirdness


(Axsuul) #1

Hi,

I'm having trouble getting an edgeNGram query to behave properly. I have one
record, "blue grass", indexed with an edgeNGram min_gram of 2. The query string
"blv" nevertheless returns "blue grass", although it shouldn't.

curl -X POST http://localhost:9200/test -d '{
  "mappings": {
    "product/fragrance": {
      "properties": {
        "name_query": {
          "index_analyzer": "query_index_analyzer",
          "search_anaylzer": "query_search_analyzer",
          "as": {},
          "type": "string"
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "filter": {
        "query_edgengram": {
          "type": "edgeNGram",
          "min_gram": 2,
          "max_gram": 20,
          "side": "front"
        }
      },
      "analyzer": {
        "query_index_analyzer": {
          "tokenizer": "lowercase",
          "filter": ["asciifolding", "query_edgengram"]
        },
        "query_search_analyzer": {
          "tokenizer": "lowercase",
          "filter": ["asciifolding"]
        }
      }
    }
  }
}'

curl -X POST "http://localhost:9200/test/product%2Ffragrance/1" -d '{
  "name_query": "blue grass"
}'

curl -X GET "http://localhost:9200/test/product%2Ffragrance/_search?load=true&pretty=true" -d '{
  "query": {
    "bool": {
      "must": [{
        "query_string": {
          "query": "blv",
          "fields": ["name_query"],
          "default_operator": "OR"
        }
      }]
    }
  }
}'

For some reason, that search returns a result. Can anyone explain why? What I
want is for "blv" not to match "blue grass", while "bl" should. I've used the
analyze API and can see "blue grass" being broken down into "bl", "blu", "blue",
"gr", "gra", "gras", "grass", and "blv" doesn't match any of those. Thanks.
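
To make the expected behaviour concrete, here is a small Python sketch that imitates the two analyzers from the settings above. It is only an approximation and does not call Elasticsearch: the function names are made up for illustration, the `lowercase` tokenizer is modelled as "split on non-letters and lowercase", and `asciifolding` is left out because it changes nothing for plain ASCII input. The last part shows one possible explanation for the unexpected hit: if the query text were run through the index-time analyzer instead of the search analyzer (for example because the `search_anaylzer` key, as spelled in the mapping, is not recognised; that is an assumption, not something confirmed in this thread), "blv" would itself be split into "bl" and "blv", and "bl" is in the index.

```python
def lowercase_tokenize(text):
    """Rough stand-in for the `lowercase` tokenizer: split on non-letters, lowercase."""
    tokens, current = [], []
    for ch in text:
        if ch.isalpha():
            current.append(ch.lower())
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


def edge_ngrams(token, min_gram=2, max_gram=20):
    """Front-side edge n-grams of one token, mirroring the `query_edgengram` settings."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def index_analyze(text):
    """Hypothetical model of `query_index_analyzer` (asciifolding omitted)."""
    return [gram for tok in lowercase_tokenize(text) for gram in edge_ngrams(tok)]


def search_analyze(text):
    """Hypothetical model of `query_search_analyzer` (asciifolding omitted)."""
    return lowercase_tokenize(text)


def matches(query_terms, indexed_terms):
    """OR semantics: any query term present in the index counts as a hit."""
    return any(term in indexed_terms for term in query_terms)


indexed = set(index_analyze("blue grass"))

# Intended behaviour: the query is only lowercased, so "blv" stays whole.
intended_hit = matches(search_analyze("blv"), indexed)

# Assumed failure mode: the query is analyzed with the index analyzer,
# so "blv" also produces the term "bl".
fallback_hit = matches(index_analyze("blv"), indexed)

print(sorted(indexed))
print("search analyzer applied:", intended_hit)
print("index analyzer applied to the query:", fallback_hit)
```

Running the analyze API against both analyzers with the text "blv" would be the way to check which of the two cases actually applies on a real cluster.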


(system) #2