Fuzzy searches with asciifolding

Jose_Victor_da_Silva · November 6, 2018, 5:43pm

Good afternoon,
I'm trying to use a custom analyzer called test_fuzzy in a fuzzy search, but it does not work when I set the "preserve_original" option to "true" in the "asciifolding" filter.

When I create the custom analyzer with "preserve_original" set to false, the search returns results correctly.

I saw in the Elastic documentation that fuzziness is applied to each term (after analysis). Does anyone know why Elasticsearch cannot find my documents with "preserve_original" set to true, even though it produces more tokens (more options)?

The following is when preserve_original is active (test_fuzzy):

{
  "tokens": [
    {
      "token": "produto",
      "start_offset": 0,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "varzacao",
      "start_offset": 8,
      "end_offset": 16,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "varzação",
      "start_offset": 8,
      "end_offset": 16,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}

The following is when preserve_original is disabled (test_fuzzy):

{
  "tokens": [
    {
      "token": "produto",
      "start_offset": 0,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "varzacao",
      "start_offset": 8,
      "end_offset": 16,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}
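For anyone who wants to reproduce the two token lists above, here is a minimal sketch of an `_analyze` request body in the same PHP array style as the rest of this post. The index name `my_index` is hypothetical (the real index name is not given here), and the sketch assumes the official elasticsearch-php client, where an array like this would typically be passed to `$client->indices()->analyze()`.

```php
<?php
// Sketch only: 'my_index' is a placeholder, not the real index name.
$params = [
    'index' => 'my_index',
    'body' => [
        // Analyzer defined in the index settings (shown further down).
        'analyzer' => 'test_fuzzy',
        // Same text as the search query in this post.
        'text' => 'produto varzação',
    ],
];
```

Running it once with "preserve_original" set to true and once with false should show whether the extra token at position 1 is the only difference between the two setups.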

Here is the query executed:

[
    'match' => [
        'name.fuzzy' => [
            'query' => 'produto varzação',
            'operator' => 'and',
            'boost' => 2,
            'zero_terms_query' => 'all',
            'fuzziness' => 'auto',
        ],
    ],
]

Here is the mapping:

'name' => [
    'type' => 'text',
    'analyzer' => 'standard',
    'fields' => [
        'norm' => [
            'type' => 'keyword',
            'normalizer' => 'keyword_text',
        ],
        'stemmed' => [
            'type' => 'text',
            'analyzer' => 'stemmed',
        ],
        'fuzzy' => [
            'type' => 'text',
            'analyzer' => 'test_fuzzy',
        ],
    ],
],

Here are the analyzer and the filter:

'test_fuzzy' => [
    'tokenizer' => 'standard',
    'filter' => [
        'lowercase',
        'custom_asciifolding',
    ],
],

'filter' => [
    'custom_asciifolding' => [
        'type' => 'asciifolding',
        'preserve_original' => true,
    ],
],
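For context, the two fragments above normally sit under `settings.analysis` in the index-creation body. The following is a minimal sketch of that nesting, not the exact settings used here: the index name `my_index` is hypothetical, and the `stemmed` analyzer and `keyword_text` normalizer referenced by the mapping are left out because they are not shown in this post.

```php
<?php
// Sketch only: 'my_index' is a placeholder; other analyzers/normalizers omitted.
$params = [
    'index' => 'my_index',
    'body' => [
        'settings' => [
            'analysis' => [
                'analyzer' => [
                    'test_fuzzy' => [
                        'type' => 'custom',
                        'tokenizer' => 'standard',
                        'filter' => ['lowercase', 'custom_asciifolding'],
                    ],
                ],
                'filter' => [
                    'custom_asciifolding' => [
                        'type' => 'asciifolding',
                        // Set to false to get the single-token output shown earlier.
                        'preserve_original' => true,
                    ],
                ],
            ],
        ],
    ],
];
```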

system · December 4, 2018, 5:43pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
