I have a search set up to do fuzzy matching on the tokens that the standard
analyzer returns.
It works well on single words and returns accurate results. For example, for a
document whose title field is 'Cherry Cheesecake', searching for either
'Cherry' or 'Cheesecake' returns that document.
Searching for 'Cherry Cheesecake', however, returns 0 results.
I'm really at a loss as to how this can happen, as I would have thought the
full title is a closer match than either word on its own.
The index settings are as follows:
{
  "analysis": {
    "analyzer": {
      "custom_fulltext": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": ["stop", "asciifolding", "snowball", "lowercase",
                   "custom_synonyms", "custom_stop"]
      }
    },
    "filter": {
      "custom_synonyms": {
        "type": "synonym",
        "ignore_case": "true",
        "synonyms": [
          "i-pod, i pod => ipod",
          "definately, definitly, definetly => definitely"
        ]
      },
      "custom_stop": {
        "type": "stop",
        "stopwords": ["a", "an", "and", "are", "as", "at", "be", "but", "by",
                      "into", "is", "it", "of", "on", "or", "such", "that", "the", "their",
                      "there", "these", "they", "this", "to", "was", "will"]
      }
    }
  }
}
The mapping is as follows:
{
  "recipes": {
    "properties": {
      "title": {
        "type": "string",
        "index": "analyzed",
        "analyzer": "custom_fulltext"
      },
      "url": {
        "type": "string",
        "index": "no",
        "include_in_all": false
      },
      "introduction": {
        "type": "string",
        "index": "no",
        "include_in_all": false
      },
      "ingredients": {
        "type": "string",
        "index": "analyzed"
      },
      "moods": {
        "type": "string",
        "index": "not_analyzed",
        "include_in_all": false
      },
      "occasions": {
        "type": "string",
        "index": "not_analyzed",
        "include_in_all": false
      },
      "is_vegetarian": {
        "type": "string",
        "store": "yes",
        "index": "not_analyzed",
        "null_value": "no"
      },
      "is_gluten_free": {
        "type": "string",
        "store": "yes",
        "index": "not_analyzed",
        "null_value": "no"
      },
      "is_express": {
        "type": "string",
        "store": "yes",
        "index": "not_analyzed",
        "null_value": "no"
      },
      "is_baking": {
        "type": "string",
        "store": "yes",
        "index": "not_analyzed",
        "null_value": "no"
      },
      "is_premium": {
        "type": "string",
        "store": "yes",
        "index": "not_analyzed",
        "null_value": "no"
      },
      "ordering_field": {
        "type": "integer",
        "store": "yes",
        "index": "not_analyzed",
        "null_value": "0"
      }
    }
  }
}
And the search is as follows:
{
  "explain": true,
  "sort": [{
    "ordering_field": "desc"
  }, "_score"],
  "size": 15,
  "from": 0,
  "query": {
    "bool": {
      "should": [{
        "bool": {
          "should": [{
            "fuzzy": {
              "recipes.title": {
                "value": "cherry cheesecake",
                "min_similarity": 0.75,
                "boost": 5
              }
            }
          }, {
            "fuzzy": {
              "recipes.ingredients": {
                "value": "cherry cheesecake",
                "min_similarity": 0.75
              }
            }
          }]
        }
      }]
    }
  },
  "facets": {
    "filtered_sections": {
      "terms": {
        "field": "_type"
      }
    },
    "is_express": {
      "terms": {
        "field": "recipes.is_express"
      }
    },
    "is_vegetarian": {
      "terms": {
        "field": "recipes.is_vegetarian"
      }
    },
    "mood": {
      "terms": {
        "field": "recipes.moods"
      }
    },
    "occasion": {
      "terms": {
        "field": "recipes.occasions"
      }
    }
  },
  "filter": [{
    "terms": {
      "_type": ["recipes"]
    }
  }]
}
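In case it helps anyone reproduce this, here is a rough Python sketch that
builds a stripped-down version of the search body for each of the three
searches described above. It is only an illustration: the function name
build_fuzzy_query is made up for this message, the sort, facets and type
filter are left out, and the nested bool is collapsed into a single one. It
only prints the JSON bodies, so they still need to be sent to the _search
endpoint by whatever client you use.

```python
import json


def build_fuzzy_query(text, min_similarity=0.75, title_boost=5):
    """Build a reduced search body with the two fuzzy clauses for `text`.

    Hypothetical helper for reproducing the problem; it mirrors the fuzzy
    clauses of the search body above and omits sort, facets and filter.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    {"fuzzy": {"recipes.title": {
                        "value": text,
                        "min_similarity": min_similarity,
                        "boost": title_boost,
                    }}},
                    {"fuzzy": {"recipes.ingredients": {
                        "value": text,
                        "min_similarity": min_similarity,
                    }}},
                ]
            }
        }
    }


# The three searches from this message: the first two return the recipe,
# the third returns nothing.
SEARCHES = ["cherry", "cheesecake", "cherry cheesecake"]

if __name__ == "__main__":
    for text in SEARCHES:
        print(json.dumps(build_fuzzy_query(text), indent=2))
```

The only thing that changes between the three bodies is the "value" string,
so whatever causes the 0 results should come down to how that string is
handled.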
I've been tearing my hair out trying to get search to return any kind of
accurate results, and this is the first time I have had good results, only
to find out that it doesn't even work for a perfect match.
Any help anyone has would be greatly appreciated.
John.