Fuzziness in span query losing 1 edit distance

Hi,

I am trying to combine fuzzy matching with phrase search. (This has already been discussed here: Combine Phrases search AND fuzzy matching and here: Elasticsearch Fuzzy Phrases - Stack Overflow.)
I managed to get it working with span queries as suggested in those posts, but I have run into an issue.

For example, if I have this document:

{
  "name" : "Gracious Knight"
}

a query like this matches it correctly:

GET [INDEX]/_search
{
	"query": {
		"span_near": {
			"clauses": [
				{
					"span_multi": {
						"match": {
							"fuzzy": {
								"name": {
									"fuzziness": "2",
									"value": "Gracious"
								}
							}
						}
					}
				},
				{
					"span_multi": {
						"match": {
							"fuzzy": {
								"name": {
									"fuzziness": "2",
									"value": "Knight"
								}
							}
						}
					}
				}
			],
			"slop": 0,
			"in_order": "true"
		}
	}
}

However, the fuzziness edit distance seems to lose one level. With the following in the "fuzzy" clause under the first "span_multi", the query works:

{
	"name": {
		"fuzziness": "2",
		"value": "Grxcious"
	}
}

Meanwhile, this doesn't:

{
	"name": {
		"fuzziness": "2",
		"value": "Grxxious"
	}
}

So with an edit distance of 2, the query matches if 1 character is changed, but if 2 characters are changed no records are returned. If fuzziness is set to 0, even the exact phrase doesn't match. It looks as if 1 edit is already used up in this type of query before my changes are counted, but I couldn't figure out why.

Just wondering if anybody has any idea about this?

Thanks

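OK, found the answer for this. A lowercase tokenizer is applied to the values when indexing, so the indexed terms are all lowercase. The fuzzy query is a term-level query, so the value in the search request is not analyzed and is compared to the indexed terms as typed. In my search queries I was giving the first letter as a capital, and that capital was eating up 1 edit distance. If you give the whole value in lowercase, everything works as expected.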
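
To make the arithmetic concrete, here is a small Python sketch that counts the edits between each query value from this thread and the lowercased indexed term. It uses plain Levenshtein distance as a stand-in; Elasticsearch's fuzzy matching also treats a transposition as a single edit by default, which makes no difference for these particular values. The function name edit_distance is just for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    # prev[j] holds the distance between the first i-1 chars of a and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1  # comparison is case-sensitive
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


indexed_term = "gracious"  # what ends up in the index after lowercasing
for value in ["Gracious", "Grxcious", "Grxxious", "grxxious"]:
    print(value, edit_distance(value, indexed_term))
```

The capital "G" alone costs 1 edit, so "Grxcious" is already at distance 2 and "Grxxious" is at distance 3, which is past a fuzziness of 2. The all-lowercase "grxxious" is back at distance 2 and would be within range.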

Duh!
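
In case it helps anyone else, below is a rough Python sketch of how the request body could be built so the values are always lowercased before they go into the fuzzy clauses. The helper name fuzzy_phrase_query is hypothetical, and it assumes the field's analyzer does nothing more than split on whitespace and lowercase; if your analyzer does stemming or anything else, lowercasing alone won't be enough.

```python
import json


def fuzzy_phrase_query(field: str, phrase: str, fuzziness: str = "2") -> dict:
    """Build a span_near body with one fuzzy span_multi clause per word.

    Assumption: indexed terms are whitespace-separated and lowercased,
    so each word is lowercased here to mirror that.
    """
    clauses = [
        {
            "span_multi": {
                "match": {
                    "fuzzy": {
                        field: {"fuzziness": fuzziness, "value": word.lower()}
                    }
                }
            }
        }
        for word in phrase.split()
    ]
    return {
        "query": {
            "span_near": {"clauses": clauses, "slop": 0, "in_order": True}
        }
    }


# Same shape as the query in the original post, but with lowercased values.
body = fuzzy_phrase_query("name", "Grxxious Knight")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of GET [INDEX]/_search.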
