Hi,
I'm trying to match documents in a dataset where some records match my search terms exactly and others only approximately, if at all. I'd like to query ES for the exact match and, if no exact match exists, fall back to the "should" clause so I don't come back empty-handed.
I just need the first hit of the dataset.
My original data is in Japanese, so I will reproduce a sample of it here in Roman characters:
Data sample:
> Prefecture | Municipality | Street
> Tokyo | Shinaga wa | ABCDEFHI
> Tokyo | Shina gawa | EFGHJK
> Tokyo | Shinagawa | ABC DEF
> Tokyo | Shinagawa | AB CDEF
> Tokyo | Shinaga wa | ABCDEF
Query:
{
  "query": {
    "bool": {
      "must": [
        { "match": { "prefecture.keyword": "prefecture" } },
        { "fuzzy": { "municipality.keyword": { "value": "shinagawa" } } }
      ],
      "should": [
        { "fuzzy": { "street": { "value": "ABCDEF" } } }
      ]
    }
  },
  "from": 0,
  "size": 1,
  "_source": ["prefecture", "municipality", "street"]
}
Results:
> Prefecture | Municipality | Street
> Tokyo | Shinaga wa | ABCDEFHI
> Tokyo | Shina gawa | EFGHJK
> Tokyo | Shinagawa | ABC DEF
> Tokyo | Shinagawa | AB CDEF
Expected results:
> Prefecture | Municipality | Street
> Tokyo | Shinagawa | ABCDEF
> Tokyo | Shinagawa | ABC DEF
> Tokyo | Shinagawa | AB CDEF
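To make the intended behavior concrete, here is a rough sketch in plain Python (not Elasticsearch) of the matching I have in mind, run against the sample rows above. Everything in it is an assumption made for illustration: the `edit_distance` and `search` helpers are hypothetical names, the comparison is case-insensitive, and "fuzzy" is taken to mean a Levenshtein distance of at most 1, which is not necessarily what ES applies by default.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Sample rows from the post: (prefecture, municipality, street).
ROWS = [
    ("Tokyo", "Shinaga wa", "ABCDEFHI"),
    ("Tokyo", "Shina gawa", "EFGHJK"),
    ("Tokyo", "Shinagawa", "ABC DEF"),
    ("Tokyo", "Shinagawa", "AB CDEF"),
    ("Tokyo", "Shinaga wa", "ABCDEF"),
]

MAX_EDITS = 1  # assumption: how far a "fuzzy" match may be from the search term


def search(prefecture, municipality, street, rows=ROWS, max_edits=MAX_EDITS):
    """Keep rows where every field matches exactly or fuzzily; closest street first."""
    hits = []
    for row in rows:
        pref, muni, st = row
        if pref != prefecture:
            continue  # prefecture must match exactly
        if edit_distance(muni.lower(), municipality.lower()) > max_edits:
            continue  # municipality must be an exact or fuzzy match
        street_distance = edit_distance(st.lower(), street.lower())
        if street_distance > max_edits:
            continue  # street is required too, not merely a scoring bonus
        hits.append((street_distance, row))
    hits.sort(key=lambda hit: hit[0])  # stable sort: exact street matches come first
    return [row for _, row in hits]


if __name__ == "__main__":
    for row in search("Tokyo", "shinagawa", "ABCDEF"):
        print(row)
```

Under these assumptions the sketch keeps three rows, with the exact street match ABCDEF first, followed by ABC DEF and AB CDEF, and it drops ABCDEFHI and EFGHJK. Rows are returned as stored, so the first one carries the municipality spelling "Shinaga wa". Taking the first element of the returned list corresponds to the single hit I am after.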
I end up getting everything, whereas I want only the documents that match, either exactly or via fuzzy matching. How should I write this query?