In Elasticsearch I have an index whose documents contain a text field "name" holding first name(s) and a surname, e.g.:
Rebecca Donovan
Julian Fred Drake
Miley Angela Saunders
...
and a query containing free text, e.g.:
Mr. Julian second name Fred surname Drake living in Boston
I need to find only those documents for which every word in the "name" field also occurs in the query text. The word comparison must be fuzzy, and the maximum edit distance should be specified by the client in the query.
Is this even possible with Elasticsearch?
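To make the requirement concrete, the intended matching rule can be sketched in plain Python, outside Elasticsearch. This is only an illustration of the semantics, not an Elasticsearch feature: the function names `levenshtein` and `name_is_covered` are hypothetical, words are split on whitespace and compared case-sensitively (mirroring the whitespace analyzer), and the distance is plain Levenshtein, whereas Elasticsearch by default also counts a transposition of two adjacent characters as a single edit.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def name_is_covered(name: str, query_text: str, max_edits: int) -> bool:
    """True if EVERY word of `name` is within `max_edits` of SOME query word."""
    query_words = query_text.split()
    return all(
        any(levenshtein(word, q) <= max_edits for q in query_words)
        for word in name.split()
    )


if __name__ == "__main__":
    text = "Mr. Julian second name Fred surname Drake living in Boston"
    for name in ["Rebecca Donovan", "Julian Fred Drake", "Miley Angela Saunders"]:
        print(name, name_is_covered(name, text, max_edits=1))
```

The key point is the direction of the check: it iterates over the document's words and looks each one up in the query, not the other way round.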
I built a workaround, but it has some disadvantages. Here is an example query for this approach, assuming the "name" field of a document contains at most 3 words:
curl -H "Content-Type: application/json" -XPOST '127.0.0.1:9200/person-index/_search?pretty&size=10' -d '
{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": {
              "match": {
                "name": {
                  "query": "Mr. Julian second name Fred surname Drake living in Boston",
                  "minimum_should_match": 1,
                  "fuzziness": 1
                }
              }
            },
            "filter": { "term": { "name.length": 1 } }
          }
        },
        {
          "bool": {
            "must": {
              "match": {
                "name": {
                  "query": "Mr. Julian second name Fred surname Drake living in Boston",
                  "minimum_should_match": 2,
                  "fuzziness": 1
                }
              }
            },
            "filter": { "term": { "name.length": 2 } }
          }
        },
        {
          "bool": {
            "must": {
              "match": {
                "name": {
                  "query": "Mr. Julian second name Fred surname Drake living in Boston",
                  "minimum_should_match": 3,
                  "fuzziness": 1
                }
              }
            },
            "filter": { "term": { "name.length": 3 } }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
'
The index mappings:
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "whitespace",
        "fields": {
          "length": {
            "type": "token_count",
            "analyzer": "whitespace",
            "store": true
          }
        }
      }
    }
  }
}
The query is long, and its size depends on the maximum number of words in the "name" field of the indexed documents.
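The length problem can at least be hidden by generating the request body instead of writing it by hand. The following is a minimal sketch under assumptions: `build_workaround_query` is a hypothetical helper name, it only builds the JSON body (sending it to `person-index/_search` is left out), and it reproduces the workaround exactly as shown above, including its shortcomings.

```python
import json


def build_workaround_query(text: str, fuzziness: int, max_words: int) -> dict:
    """Build one bool clause per possible word count of the "name" field."""
    should = []
    for word_count in range(1, max_words + 1):
        should.append({
            "bool": {
                "must": {
                    "match": {
                        "name": {
                            "query": text,
                            # require as many matching query terms as the name has words
                            "minimum_should_match": word_count,
                            "fuzziness": fuzziness,
                        }
                    }
                },
                # only consider documents whose name has exactly this many words
                "filter": {"term": {"name.length": word_count}},
            }
        })
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


if __name__ == "__main__":
    body = build_workaround_query(
        "Mr. Julian second name Fred surname Drake living in Boston",
        fuzziness=1,
        max_words=3,
    )
    print(json.dumps(body, indent=2))
```

This keeps the client code short, but the generated query still grows with `max_words`, so it is a convenience rather than a real fix.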
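Additionally, with this approach the following query text matches the second record in the list above ("Julian Fred Drake"), although it should not: "Fred" is missing from the text, but "Julian" and "Jullan" both fuzzily match the document term "Julian", so three query terms still match: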
Mr. Julian second name Jullan surname Drake living in Boston
Is it possible to meet these requirements in a simpler and better way?