Apply Fuzziness on multiple terms in a not_analyzed field

Hello,

I have a not_analyzed field on which I want to run a terms-style query: searching for multiple terms (values) on a single field, with fuzziness.

For example, search for ("donald", "trump", "president") on a not_analyzed field "name" with a fuzziness of 1.

I found that fuzziness cannot be applied to a terms query. The alternative is to use a fuzzy query.

But when I provide multiple values to a fuzzy query, it returns strange results that do not match the input at all.
Here's a sample query:

GET indexname/_search
{
  "query": {
    "nested": {
      "query": {
        "fuzzy": {
          "info.name": {
            "value": ["donald", "trump", "president"],
            "boost": 1.0,
            "fuzziness": 1,
            "prefix_length": 0,
            "max_expansions": 50
          }
        }
      },
      "path": "info"
    }
  }
}

Is there any other way to apply these conditions in a query for multiple terms?

Hello,

Can someone please reply to this?

Can you supply a simple example that illustrates the issue and includes:

  1. The document mapping
  2. An example doc
  3. An example query

These are all factors in any issue you have. It helps if they are stripped down to just the fields that relate to the problem.

Cheers
Mark

Hello Mark,

Please find an example below:

  1. Document mapping:

    PUT indexname
    {
      "mappings": {
        "customers": {
          "_all": {
            "enabled": false
          },
          "properties": {
            "info": {
              "type": "nested",
              "properties": {
                "name": {
                  "type": "string",
                  "index": "not_analyzed",
                  "doc_values": false,
                  "norms": {
                    "enabled": false
                  }
                },
                "type": {
                  "enabled": false
                }
              }
            }
          }
        }
      }
    }

  2. Example doc:

    {
      "hits": {
        "total": 1,
        "max_score": 1,
        "hits": [
          {
            "_index": "indexname",
            "_type": "customers",
            "_id": "101",
            "_source": {
              "info": [
                {
                  "type": "V",
                  "name": "tramp"
                },
                {
                  "type": "a",
                  "name": "donaldo"
                },
                {
                  "type": "b",
                  "name": "presiden"
                }
              ]
            }
          }
        ]
      }
    }

  3. Example query:
    Provided in the above post.

As mentioned, the info.name field is not_analyzed.
While searching, I want to provide multiple values/terms with fuzziness 1.
Ideally, running the above query should return all three values that I have stored, but it doesn't.

I want to know if there is any other query or mechanism, where we can search for multiple terms together on a not_analyzed field with a fuzziness of 1.
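
For reference, the expectation above rests on each stored value being within one edit of its search term. A minimal Python sketch that checks this is shown here. It uses plain Levenshtein distance, which is an assumption made for illustration: Elasticsearch's fuzziness also counts a transposition of two adjacent characters as a single edit by default, but none of these pairs involves a transposition, so the result is the same. The `levenshtein` helper is hypothetical and not part of any Elasticsearch client.

```python
def levenshtein(a, b):
    """Return the number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


# (search term, stored value) pairs taken from the example above
pairs = [("donald", "donaldo"), ("trump", "tramp"), ("president", "presiden")]
distances = {query: levenshtein(query, stored) for query, stored in pairs}
print(distances)  # {'donald': 1, 'trump': 1, 'president': 1}
```

Every pair is exactly one edit apart, so a fuzziness of 1 is enough for each term taken individually.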

Looking at the API docs and using your examples it seems the value parameter to the FuzzyQuery only works with a single value (i.e. not arrays of values).

It's unclear to me from a quick look at the parsing implementation whether it is expected to work with arrays of values or not (I would suspect given the singular variable name "value" that this is not the case).

Looks like we may have a bug in the parser for not raising an error when an array is passed instead of a single string. I've raised an issue: https://github.com/elastic/elasticsearch/issues/23759

Hello Mark,

Thank you for your quick response.

I tried the above example using the match and terms queries as well.

  • The terms query accepts an array, but does not accept a fuzziness parameter.

  • The match query accepts fuzziness, but does not accept an array as the search input.
    For example, the query below is rejected:

    GET indexname/_search
    {
      "query": {
        "nested": {
          "query": {
            "bool": {
              "filter": [
                {
                  "match": {
                    "info.name": {
                      "query": ["donaldo", "tramp", "presiden"],
                      "operator": "OR",
                      "fuzziness": "1",
                      "prefix_length": 0,
                      "max_expansions": 50,
                      "fuzzy_transpositions": true,
                      "lenient": false,
                      "zero_terms_query": "NONE",
                      "boost": 1.0
                    }
                  }
                }
              ]
            }
          },
          "path": "info"
        }
      }
    }

Can you suggest any other way for the above scenario?

You can combine multiple fuzzy queries, each as a separate object with a single value, in the should or must array of a bool query, depending on whether you want to logically OR or AND these expressions.
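
A minimal sketch of that suggestion, written in Python so the clauses can be generated from a list of terms. It only builds the request body as a dict and does not talk to a cluster. The helper name `fuzzy_terms_query` and its parameters are hypothetical. The index layout (nested path "info", field "info.name") is taken from the example mapping earlier in the thread. Giving each term its own nested query in the AND case is an assumption about the intent, since in the example doc each name lives in a different nested object.

```python
import json


def fuzzy_terms_query(path, field, terms, fuzziness=1, require_all=False):
    """Build a search body with one single-value fuzzy clause per term."""

    def fuzzy(term):
        # The fuzzy query takes exactly one value, so each term gets its own clause.
        return {"fuzzy": {field: {"value": term, "fuzziness": fuzziness}}}

    if require_all:
        # AND: the terms may match different nested objects, so each fuzzy
        # clause is wrapped in its own nested query under bool.must.
        clauses = [{"nested": {"path": path, "query": fuzzy(t)}} for t in terms]
        return {"query": {"bool": {"must": clauses}}}

    # OR: a nested object matches when any one of the fuzzy clauses matches it.
    should = [fuzzy(t) for t in terms]
    return {"query": {"nested": {"path": path,
                                 "query": {"bool": {"should": should}}}}}


body = fuzzy_terms_query("info", "info.name", ["donald", "trump", "president"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of GET indexname/_search. Passing require_all=True produces the must variant, for the case where every term has to match somewhere in the document.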
