Ok, so after fiddling around a bit I seem to have found the reason for this
behaviour:
curl -XGET 'http://127.0.0.1:9200/sanctionlists/_analyze?pretty=1&text=gmbh&analyzer=name_analyzer'
results in:
{
"tokens" : [ ]
}
whereas
curl -XGET 'http://127.0.0.1:9200/sanctionlists/_analyze?pretty=1&text=Gmbh&analyzer=name_analyzer'
results in:
{
"tokens" : [ {
"token" : "g",
"start_offset" : 0,
"end_offset" : 1,
"type" : "word",
"position" : 1
}, {
"token" : "m",
"start_offset" : 1,
"end_offset" : 2,
"type" : "word",
"position" : 2
}, {
"token" : "b",
"start_offset" : 2,
"end_offset" : 3,
"type" : "word",
"position" : 3
}, {
"token" : "h",
"start_offset" : 3,
"end_offset" : 4,
"type" : "word",
"position" : 4
}, {
"token" : "gm",
"start_offset" : 0,
"end_offset" : 2,
"type" : "word",
"position" : 5
}, {
"token" : "mb",
"start_offset" : 1,
"end_offset" : 3,
"type" : "word",
"position" : 6
}, {
"token" : "bh",
"start_offset" : 2,
"end_offset" : 4,
"type" : "word",
"position" : 7
}, {
"token" : "gmb",
"start_offset" : 0,
"end_offset" : 3,
"type" : "word",
"position" : 8
}, {
"token" : "mbh",
"start_offset" : 1,
"end_offset" : 4,
"type" : "word",
"position" : 9
}, {
"token" : "gmbh",
"start_offset" : 0,
"end_offset" : 4,
"type" : "word",
"position" : 10
} ]
}
So the stopword filter does not seem to ignore the case of the stopword, even
though ignore_case is set to true. I put a lowercase filter in FRONT of the
stopword filter, and that did the trick for the examples above.

The next problem was that the fuzzy_like_this query apparently does not
analyze the query text (i.e. split it into tokens). So I tried a match query
on the names field with a fuzziness of 0.5 and the analyzer set to
"name_analyzer", and this actually worked!
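
To illustrate the filter-order point above, here is a toy model in plain Python. It is not Elasticsearch code, only a sketch under assumptions: the stopword list contains just "gmbh", the stopword filter compares case-sensitively (the behaviour observed above), the original chain was stop -> lowercase -> ngram, and the n-gram sizes run from 1 to 4. All function and variable names are made up for the example.

```python
# Toy model of a token filter chain; NOT Elasticsearch code.
STOPWORDS = {"gmbh"}  # assumed stopword list (lowercase entries only)


def stop_filter(tokens):
    """Drop stopwords using a case-sensitive comparison (as observed)."""
    return [t for t in tokens if t not in STOPWORDS]


def lowercase_filter(tokens):
    """Lowercase every token."""
    return [t.lower() for t in tokens]


def ngram_filter(tokens, min_gram=1, max_gram=4):
    """Emit all n-grams per token, shortest first (sizes are assumed)."""
    out = []
    for t in tokens:
        for n in range(min_gram, max_gram + 1):
            out.extend(t[i:i + n] for i in range(len(t) - n + 1))
    return out


def analyze(text, filters):
    """Split on whitespace, then apply each filter in order."""
    tokens = text.split()
    for f in filters:
        tokens = f(tokens)
    return tokens


# Assumed original order: the stop filter sees "Gmbh" before lowercasing.
old_chain = [stop_filter, lowercase_filter, ngram_filter]
# Reordered: lowercase first, so the stop filter always sees "gmbh".
new_chain = [lowercase_filter, stop_filter, ngram_filter]

print(analyze("gmbh", old_chain))  # []
print(analyze("Gmbh", old_chain))  # 10 n-grams, 'g' ... 'gmbh'
print(analyze("Gmbh", new_chain))  # []
```

With the assumed original order, "Gmbh" slips past the stop filter and is n-grammed, matching the _analyze output above; with lowercase first, both spellings are removed.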
I am not sure whether this is a bug or intended behaviour when using FLT
queries. Any thoughts?
Thanks,
Hannes