Hacksign
(Hacksign)
December 13, 2016, 5:05am
1
Suppose there is a document in ES like this:
{'realname':'XZY'}
Note: X, Z and Y are CJK characters, NOT English letters.
To pick out the item above, I wrote the DSL below:
{
"size" : 10,
"query" : {
"wildcard" : {
"realname" : "X*"
}
}
}
This works fine, but if the DSL is like this:
{
"size" : 10,
"query" : {
"wildcard" : {
"realname" : "X*Y"
}
}
}
This cannot find anything.
Is anything wrong? Or did I misunderstand something in this document?
dadoonet
(David Pilato)
December 13, 2016, 6:02am
2
Try the _analyze API to see how your document is actually indexed.
Then remember that the wildcard string is not analyzed, so it's compared as-is against the tokens in that output.
Finally: don't use wildcards!
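To compare a wildcard pattern against what was actually indexed, it can help to reduce the _analyze response to a bare list of tokens. The following is only a sketch: it assumes the response has the usual `tokens` array with `token` and `position` keys, the helper name `tokens_from_analyze` is made up for illustration, and the sample response is hypothetical, with X, Z and Y standing in for CJK characters.

```python
import json


def tokens_from_analyze(response_text):
    """Return the token strings from an _analyze JSON response, ordered by position."""
    body = json.loads(response_text)
    tokens = sorted(body.get("tokens", []), key=lambda t: t["position"])
    return [t["token"] for t in tokens]


# Hypothetical response; X, Z and Y are placeholders for CJK characters.
sample = """
{"tokens": [
  {"token": "X", "start_offset": 0, "end_offset": 1, "type": "<IDEOGRAPHIC>", "position": 0},
  {"token": "Z", "start_offset": 1, "end_offset": 2, "type": "<IDEOGRAPHIC>", "position": 1},
  {"token": "Y", "start_offset": 2, "end_offset": 3, "type": "<IDEOGRAPHIC>", "position": 2}
]}
"""

print(tokens_from_analyze(sample))  # ['X', 'Z', 'Y']
```

Each entry in that list is a separate term in the inverted index, and a wildcard pattern has to match one of them in full.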
Hacksign
(Hacksign)
December 13, 2016, 8:32am
3
Thanks for the reply.
This is the output of _analyze (X, Z and Y still stand for CJK characters):
[root@host ~]# curl http://localhost:9200/dbs/_analyze?pretty -d '{"field":"some_field", "text":"XZY"}'
{
"tokens" : [
{
"token" : "X",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<IDEOGRAPHIC>",
"position" : 0
},
{
"token" : "Z",
"start_offset" : 1,
"end_offset" : 2,
"type" : "<IDEOGRAPHIC>",
"position" : 1
},
{
"token" : "Y",
"start_offset" : 2,
"end_offset" : 3,
"type" : "<IDEOGRAPHIC>",
"position" : 2
}
]
}
What confuses me is that under Elasticsearch 2.3, a wildcard search like this:
{
"size" : 10,
"query" : {
"wildcard" : {
"realname" : "X*Y"
}
}
}
will return results.
But after upgrading ES to 5.0, only queries like the one below return results:
{
"size" : 10,
"query" : {
"wildcard" : {
"realname" : "X*"
}
}
}
If this is a problem related to the mapping and tokenization, why does "X*" hit results while "X*Y" does not?
dadoonet
(David Pilato)
December 13, 2016, 8:54am
4
I don't know how it worked previously in the 2.x series.
Maybe the analyzer you were using was producing [ "XZY" ] instead of [ "X", "Z", "Y" ]?
Hacksign
(Hacksign)
December 16, 2016, 2:44am
5
As the _analyze API returned, the CJK characters are analyzed as ['X', 'Z', 'Y'], not ['XZY'].
This seems to be the default analyzer's behaviour (splitting CJK characters into single-character tokens, one after another).
So I'm still confused about why I cannot get the correct result by providing 'X*Y' to the wildcard query.
dadoonet
(David Pilato)
December 17, 2016, 8:21am
6
So if you have X, Z and Y in the inverted index, X*Y won't match any of those, right?
X*, Y* and Z* will.
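The point can be checked with a small simulation. This is only a sketch: Python's `fnmatchcase` stands in for the real wildcard matcher, `wildcard_hits` is a made-up helper, and the single-term list is an assumption about how a 2.x index might have stored the value, not something confirmed in this thread.

```python
from fnmatch import fnmatchcase


def wildcard_hits(pattern, indexed_terms):
    """Return the indexed terms that the whole pattern matches, one term at a time."""
    return [term for term in indexed_terms if fnmatchcase(term, pattern)]


# X, Z and Y are placeholders for CJK characters.
per_character = ["X", "Z", "Y"]  # what _analyze reported in this thread
single_term = ["XZY"]            # assumption: value indexed as one unanalyzed term

for pattern in ("X*", "X*Y"):
    print(pattern, wildcard_hits(pattern, per_character), wildcard_hits(pattern, single_term))
```

With per-character terms, X* matches the term X, but X*Y needs a single term that starts with X and ends with Y, and no such term exists. With one unanalyzed term, both patterns match.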