How can I achieve consistent behaviour of match and match-phrase-prefix for all types of words, with or without special characters?


(laxmikant) #1

I have created two APIs, equal and like, which search fields in Elasticsearch.

For example: suppose the "queuename" field in Elasticsearch has the values queue, queue1, queue2 and 3queue.

equal api : It returns the documents whose field value is exactly equal to the input field value. (Note: I am using a boolean match query.)
equal api input : queuename = queue

equal api result : 1 document which contains queuename = queue

like api : It returns the documents whose field value is equal to, or a superset of, the input field value. (Note: I am using a match_phrase_prefix query.)
like api input : queuename = queue

like api result : all 4 documents, with queuename queue, queue1, queue2 and 3queue.

Note: In my mapping I am using the default analyzer, i.e. standard.

The above behaviour of my equal api breaks if I use a queuename containing special characters (for example #, $, @, . etc.).

For example: suppose the "queuename" field in Elasticsearch has the values queue, queue#1, queue.2 and 3@queue.

equal api input : queuename = queue

ACTUAL equal api result : all 4 documents, with queuename queue, queue#1, queue.2 and 3@queue

EXPECTED 1 document containing queuename = queue

If I change the analyzer from standard to whitespace, the equal api works, but my like api fails to fetch the expected result (it does not find 3queue or 3@queue for queuename = queue).

I also tried "index": "not_analyzed", but that likewise only makes my equal api work and breaks my like api.

How can I achieve the expected behaviour for both the equal and like apis for all types of words, with or without special characters?


(Nik Everett) #2

Try declaring your queuename like:

"queuename": {
    "type": "string",
    "index": "not_analyzed",
    "fields": {
        "sloppy":   { "type": "string", "analyzer": "standard" }
    }
}

In your like API you search for queuename.sloppy instead of queuename. See if that works.
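
To make the split between the two searches concrete, here is a minimal Python sketch that only builds the two request bodies and prints them. The helper names `equal_query` and `like_query` are hypothetical, and it assumes the equal API keeps using a match query (as in the original post) against the not_analyzed parent field, while the like API points its match_phrase_prefix query at the sub-field.

```python
import json


def equal_query(value):
    # Hypothetical helper: exact lookup against the not_analyzed
    # parent field, which stores the whole value as a single term.
    return {"query": {"match": {"queuename": value}}}


def like_query(value):
    # Hypothetical helper: prefix-style lookup against the
    # sub-field that is analyzed with the standard analyzer.
    return {"query": {"match_phrase_prefix": {"queuename.sloppy": value}}}


if __name__ == "__main__":
    print(json.dumps(equal_query("queue"), indent=2))
    print(json.dumps(like_query("queue"), indent=2))
```

The only difference between the two bodies is the query type and the field name, so the same document can answer both APIs.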

This analyzes the data in two different ways - one for each API. This is a fairly normal thing to do, especially if one API's goal is to only return perfect hits and the other's goal is to return lots and lots of hits that aren't perfect.
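
As a rough illustration of why the two fields behave differently, the sketch below imitates both kinds of analysis in plain Python. It is only an approximation: `rough_standard_terms` is a hypothetical stand-in that lowercases and splits on anything that is not an ASCII letter or digit, whereas the real standard analyzer follows Unicode text segmentation rules and can differ, for instance around punctuation that sits between two letters or two digits.

```python
import re


def keyword_terms(value):
    # not_analyzed: the whole value is indexed as one term, untouched.
    return [value]


def rough_standard_terms(value):
    # Crude approximation of the standard analyzer (an assumption, not
    # the real tokenizer): lowercase, then keep runs of letters/digits.
    return re.findall(r"[a-z0-9]+", value.lower())


if __name__ == "__main__":
    for value in ["queue", "queue#1", "queue.2", "3@queue"]:
        print(value, keyword_terms(value), rough_standard_terms(value))
```

Under this approximation every special-character value contributes a bare "queue" term to the analyzed field, which is why a match on that field finds all of them, while the not_analyzed field keeps each value distinct.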


(laxmikant) #3

Thanks Nik :) It worked!

