Using must and falling back to fuzzy if there is no exact match in queries


(Eric Ohtake) #1

Hi,

I'm trying to match documents in a dataset where some records match exactly and others only approximately, if at all. I'd like to query ES for the exact match and, if it doesn't exist, fall back to the "should" clause so I don't come back empty-handed when the exact match is not present.

I just need the first hit of the dataset.

My original data is in Japanese, so I will try to reproduce some of it here in Roman characters:

Data sample:

> Prefecture  Municipality  Street
> Tokyo       Shinaga wa    ABCDEFHI
> Tokyo       Shina gawa    EFGHJK
> Tokyo       Shinagawa     ABC DEF
> Tokyo       Shinagawa     AB CDEF
> Tokyo       Shinaga wa    ABCDEF

Query:

{
  "query": {
    "bool": {
      "must": [
        { "match": { "prefecture.keyword": "prefecture" } },
        { "fuzzy": { "municipality.keyword": { "value": "shinagawa" } } }
      ],
      "should": [
        { "fuzzy": { "street": { "value": "ABCDEF" } } }
      ]
    }
  },
  "from": 0,
  "size": 1,
  "_source": ["prefecture", "municipality", "street"]
}

Results:

> Prefecture  Municipality  Street
> Tokyo       Shinaga wa    ABCDEFHI
> Tokyo       Shina gawa    EFGHJK
> Tokyo       Shinagawa     ABC DEF
> Tokyo       Shinagawa     AB CDEF

Expected results:

> Prefecture  Municipality  Street
> Tokyo       Shinagawa     ABCDEF
> Tokyo       Shinagawa     ABC DEF
> Tokyo       Shinagawa     AB CDEF

I end up getting everything, while I want only the documents that match, either exactly or via fuzzy. How can I write this query properly?


(Hubo3085632) #2

Hi, you need to print the explain output to see the detailed scores. Even though you used a fuzzy query, it is not the most important part of the scoring algorithm.
Here is the algorithm:
https://www.elastic.co/guide/en/elasticsearch/guide/current/practical-scoring-function.html#query-norm
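
As a minimal sketch of that suggestion: the search API accepts an `explain` flag in the request body, and each hit then carries an `_explanation` section with the score breakdown. The query below is the one from the original post with only that flag added. The `size` of 5 is an assumption, chosen here so that several hits can be compared side by side.

```json
{
  "explain": true,
  "query": {
    "bool": {
      "must": [
        { "match": { "prefecture.keyword": "prefecture" } },
        { "fuzzy": { "municipality.keyword": { "value": "shinagawa" } } }
      ],
      "should": [
        { "fuzzy": { "street": { "value": "ABCDEF" } } }
      ]
    }
  },
  "from": 0,
  "size": 5,
  "_source": ["prefecture", "municipality", "street"]
}
```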


(Eric Ohtake) #3

Thanks.
I couldn't accomplish it that way, so I asked for more normalized data, such as a municipality field without those stray spaces, and used a term match on the first two fields. Then I used should/fuzzy on the street. It brings back the results I need.
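
A rough sketch of what such a query might look like, assuming the field names from the original post and a cleaned-up municipality value without spaces. The values `Tokyo`, `Shinagawa` and `ABCDEF` are illustrative, taken from the sample data, and this is not necessarily the exact query that was used.

```json
{
  "query": {
    "bool": {
      "must": [
        { "term": { "prefecture.keyword": "Tokyo" } },
        { "term": { "municipality.keyword": "Shinagawa" } }
      ],
      "should": [
        { "fuzzy": { "street": { "value": "ABCDEF" } } }
      ]
    }
  },
  "from": 0,
  "size": 1,
  "_source": ["prefecture", "municipality", "street"]
}
```

Because `must` clauses are present, the `should` clause is optional by default and only raises the score of documents that match it, so with `"size": 1` the best street match should come back first while a hit is still returned when no street matches. Adding `"minimum_should_match": 1` to the `bool` query would instead make the street match mandatory.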

