Search query doesn't use custom analyzer


(Nikolay Eryomin) #1

I have a problem. Help me, please.
See the queries below.

I created a new index with a custom analyzer:

PUT test
{
    "mappings": {
        "test": {
            "_all": {
               "analyzer": "testAnalyzer"
            },
            "properties": {
                "name": {
                  "type": "string",
                  "analyzer": "testAnalyzer"
                }
            }
        }
    }, 
    "settings": {
        "analysis": {
            "filter": {
                "testsyn": {
                     "type": "synonym",
                     "synonyms": [
                        "test test => test_test"
                     ]
                }
            },
            "analyzer": {
                "testAnalyzer": {
                     "filter": [
                        "lowercase",
                        "testsyn"
                     ],
                     "type": "custom",
                     "tokenizer": "standard"
                }
            }
        }
    }
}

Add data to the index:

PUT test/test/1
{
    "name": "test"
}

PUT test/test/2
{
    "name": "test test"
}

Check the data and the custom analyzer:

POST test/test/_search
{
    "query": {
        "match_all": {}
    }
}

GET test/_analyze?analyzer=testAnalyzer&text=test test

Check query_string. We can see that the custom analyzer is not used:

POST test/test/_search
{
    "explain": true, 
    "query": {
        "query_string": {
           "default_field": "name",
           "query": "test test"
        }
    }
}

Why? What can I do to fix this?
Thanks


Multi words synonyms and query syntax
(Patrick Kik) #2

The standard tokenizer will tokenize the input string "test test" into two tokens, "test" and "test". Neither of those tokens will be handled by the testsyn filter.
I think a quick test could be if you use the not_analyzed tokenizer.


(Nikolay Eryomin) #3

I am sorry. query_string really does use the custom analyzer, but it doesn't apply a synonym made of 2 words.
If we add a synonym with 1 word, we can see that the synonym is used.

If we use "match", we see that the 2-word synonym works correctly.
But why doesn't it work with "query_string"?

POST test/test/_search
{
    "explain": true, 
    "query": {
        "match": {
           "name": "test test"
        }
    }
}

"I think a quick test could be if you use the not_analyzed tokenizer."

What is the "not_analyzed tokenizer"?


(Patrick Kik) #4

Didn't know it by heart, had to look it up: I mean the Keyword Tokenizer.

Your answer may be in here: https://www.elastic.co/guide/en/elasticsearch/guide/current/multi-word-synonyms.html#_synonyms_and_the_query_string_query
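
As that section of the guide describes it, query_string splits the query text on whitespace before analysis, so testAnalyzer receives "test" and "test" one at a time and the two-word synonym rule never matches. The match query hands the whole string to the analyzer in one pass, which is why the synonym works there. One thing that may be worth trying is to quote the phrase inside the query string. The request below is only a sketch against the test index from post #1, and it assumes that your version analyzes a quoted phrase as a single unit:

```
POST test/test/_search
{
    "explain": true,
    "query": {
        "query_string": {
            "default_field": "name",
            "query": "\"test test\""
        }
    }
}
```

If the explain output shows a term like name:test_test, the synonym was applied.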


(Nikolay Eryomin) #5

Thanks!!! That explains a lot.
But how can I use query syntax (OR, AND, NOT, etc.) with multi-word synonyms?
Of course, I can split the request before sending it to ES and use a bool query (for example). But I think it is a typical task, and ES can do it :)
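
For reference, the bool approach mentioned above could look roughly like the request below. It is a minimal sketch, not a confirmed recommendation: each operand becomes its own match clause, so the two-word synonym is analyzed as a whole, while must and must_not stand in for AND and NOT. The value "other" is a made-up second term used only for illustration.

```
POST test/test/_search
{
    "query": {
        "bool": {
            "must": [
                { "match": { "name": "test test" } }
            ],
            "must_not": [
                { "match": { "name": "other" } }
            ]
        }
    }
}
```

Keeping query_string and quoting each multi-word phrase, for example "\"test test\" AND NOT other", may also work, assuming quoted phrases are passed to the analyzer as a unit.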

