Before opening a new issue, I wanted to ask here whether someone can
confirm the following behavior.
Searching for a term containing a German umlaut, e.g. "Körbe", together
with the "*" wildcard returns zero hits. This happens on 0.20.5 as well as
on 0.90RC1 and RC2. The index has to be created with "_source" disabled.
If the source is stored, the possible bug does not seem to be triggered.
curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{
"query" : {
"query_string" : {
"default_field" : "message",
"query" : "körb*"
}
}
}'
Setting "analyze_wildcard" to true brings up results, but they still look incomplete.
curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{
"query" : {
"query_string" : {
"default_field" : "message",
"analyze_wildcard" : true,
"query" : "körb*"
}
}
}'
The index was built with per-field analyzers, mostly "german" or a
custom one without stopwords. A simple example could look like:
curl -XPOST 'http://localhost:9200/twitter' -d '{
"mappings" : {
"tweet" : {
"_source" : { "enabled" : false },
"properties" : { "message" : { "type" : "string", "analyzer" : "german" } }
}
}
}'
curl -XPOST 'http://localhost:9200/twitter/tweet/' -d '{
"message" : "We are testing with German umlauts. Körbe is a great example."
}'
curl -XPOST 'http://localhost:9200/twitter/tweet/' -d '{
"message" : "We are still testing with German umlauts. Körbe Made in Germany are available for worldwide delivery."
}'
curl -XPOST 'http://localhost:9200/twitter/tweet/' -d '{
"message" : "Here is a third example still with a German umlaut (ä)."
}'
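For anyone reproducing this, the following sketch may help narrow things down. It refreshes the index, asks the Analyze API which terms the "german" analyzer produces for "Körbe", and then runs the wildcard query once with the umlaut and once with a plain "o". My untested guess is that the analyzer folds "ö" to "o" at index time while the wildcard term is not analyzed by default. This is only an assumption and does not explain why "_source" would matter. The names ES_URL, RUN_ES, analyze_url and query_body are made up for this sketch.

```shell
#!/bin/sh
# Assumptions (not from the report above): a node on localhost:9200 and the
# "twitter" index created as shown. ES_URL and RUN_ES are hypothetical names.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the _analyze URL for index $1 and analyzer $2.
analyze_url() {
  printf '%s/%s/_analyze?analyzer=%s' "$ES_URL" "$1" "$2"
}

# Print a query_string body; $1 is the query text, $2 is true or false
# for analyze_wildcard.
query_body() {
  printf '{"query":{"query_string":{"default_field":"message","analyze_wildcard":%s,"query":"%s"}}}' "$2" "$1"
}

# Only talk to the cluster when explicitly asked to (RUN_ES=1).
if [ "${RUN_ES:-0}" = "1" ]; then
  # Make the freshly indexed documents visible to search.
  curl -XPOST "$ES_URL/twitter/_refresh"
  # Show the terms the german analyzer actually writes to the index.
  curl -XGET "$(analyze_url twitter german)" -d 'Körbe'
  # Compare the umlaut prefix with a folded one, without analyze_wildcard.
  for q in 'körb*' 'korb*'; do
    curl -XGET "$ES_URL/twitter/tweet/_search" -d "$(query_body "$q" false)"
  done
fi
```

Run it as RUN_ES=1 sh check.sh (the file name is arbitrary). If "korb*" returns hits while "körb*" does not, that would point at the analysis chain rather than at the wildcard handling itself.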
Am I missing something? Can someone confirm it?
Thanks in advance