Hi,
I'm getting strange results when searching an index that has two types and
uses a keyword tokenizer.
With ElasticSearch 0.90.2, this query:
curl -XGET 'localhost:9200/myindex/_search?pretty=1' -d '{"query":{"match":{"name":"carex f"}}}'
returns a result containing only "Carex", which is unexpected. The same query restricted to the taxon type:
curl -XGET 'localhost:9200/myindex/taxon/_search?pretty=1' -d '{"query":{"match":{"name":"carex f"}}}'
returns the expected result, "Carex feta", and not "Carex" alone.
If I do the same thing using ElasticSearch 0.90.1, both queries above
return the expected results. This could be caused by a configuration
difference, but I am using the default configuration on both versions.
So, what is the expected ElasticSearch behavior for the first query?
Could it be related to ES falling back to a default tokenizer (standard)
when the search spans multiple types?
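One way to test that guess would be the analyze API, which shows the tokens an analyzer produces for a given text. This is only a minimal sketch, assuming the index described below exists on localhost:9200; ES_HOST, QUERY_TEXT and the show_tokens helper are names made up for illustration:

```shell
# Hypothetical helper: print the tokens that a named analyzer of myindex
# produces for a piece of text, using the _analyze endpoint.
ES_HOST="${ES_HOST:-localhost:9200}"   # assumed node address
QUERY_TEXT='carex f'

show_tokens() {
  # $1 = analyzer name, $2 = text to analyze
  curl -s -XGET "$ES_HOST/myindex/_analyze?analyzer=$1&pretty=1" -d "$2"
}
```

Running show_tokens name_search "$QUERY_TEXT" and then show_tokens standard "$QUERY_TEXT" should make the difference visible: a keyword tokenizer keeps the whole text as one token, while the standard tokenizer splits it on whitespace.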
Here are the current settings:
curl -XPOST "localhost:9200/myindex" -d '
{
  "settings":{
    "index":{
      "analysis":{
        "filter" : {
          "name_nGram" : {
            "max_gram" : 100,
            "min_gram" : 2,
            "type" : "edge_ngram"
          }
        },
        "analyzer":{
          "name_index" : {
            "filter" : [
              "lowercase","asciifolding","name_nGram"
            ],
            "tokenizer" : "standard"
          },
          "full_name_index" : {
            "filter" : [
              "lowercase","asciifolding"
            ],
            "tokenizer" : "keyword"
          },
          "scientificname_index" : {
            "filter" : [
              "lowercase","asciifolding","name_nGram"
            ],
            "tokenizer" : "keyword"
          },
          "name_search" : {
            "filter" : [
              "lowercase","asciifolding"
            ],
            "tokenizer" : "keyword"
          }
        }
      }
    }
  },
  "mappings" : {
    "taxon" : {
      "properties" : {
        "name" : {
          "type" : "multi_field",
          "fields":{
            "name":{
              "type" : "string",
              "index_analyzer" : "full_name_index",
              "search_analyzer" : "name_search"
            },
            "ngrams":{
              "type" : "string",
              "index_analyzer" : "scientificname_index",
              "search_analyzer" : "name_search"
            }
          }
        },
        "status":{
          "index" : "not_analyzed",
          "type" : "string"
        },
        "namehtml":{
          "index" : "not_analyzed",
          "type" : "string"
        },
        "namehtmlauthor":{
          "index" : "not_analyzed",
          "type" : "string"
        },
        "rankname":{
          "index" : "not_analyzed",
          "type" : "string"
        },
        "parentid":{
          "index" : "not_analyzed",
          "type" : "integer"
        },
        "parentnamehtml":{
          "index" : "not_analyzed",
          "type" : "string"
        }
      }
    },
    "vernacular" : {
      "properties" : {
        "name" : {
          "type" : "multi_field",
          "fields":{
            "name":{
              "type" : "string",
              "index" : "not_analyzed"
            },
            "ngrams":{
              "type" : "string",
              "search_analyzer" : "name_search",
              "index_analyzer" : "name_index"
            }
          }
        },
        "taxonid":{
          "index" : "not_analyzed",
          "type" : "integer"
        },
        "status":{
          "index" : "not_analyzed",
          "type" : "string"
        },
        "language":{
          "index" : "not_analyzed",
          "type" : "string"
        },
        "taxonnamehtml":{
          "index" : "not_analyzed",
          "type" : "string"
        }
      }
    }
  }
}'
Add some data:
curl -XPUT 'http://localhost:9200/myindex/taxon/1' -d '{
"name" : "carex"
}'
curl -XPUT 'http://localhost:9200/myindex/taxon/2' -d '{
"name" : "carex feta"
}'
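To rule out ambiguity in how the plain "name" field is resolved across the two types, the first query could also be aimed at a sub-field by its full path. This is a sketch under assumptions, not a confirmed fix: it targets name.ngrams, which uses the name_search search analyzer in both types, and ES_HOST, QUERY_BODY and search_all_types are illustrative names only:

```shell
# Same search text as above, but aimed explicitly at the edge n-gram sub-field.
ES_HOST="${ES_HOST:-localhost:9200}"   # assumed node address
QUERY_BODY='{"query":{"match":{"name.ngrams":"carex f"}}}'

search_all_types() {
  # Hypothetical helper: search every type of myindex with the body given as $1.
  curl -s -XGET "$ES_HOST/myindex/_search?pretty=1" -d "$1"
}
```

It would be called as search_all_types "$QUERY_BODY". Given the scientificname_index analyzer, the edge n-grams stored for "carex feta" include "carex f", while those stored for "carex" do not, so comparing this result with the plain "name" query may help narrow down where the two versions differ.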