Search on fields with Multi field mapping and Ngram Analyzer


(sunshine) #1

Multi field Mapping with Analyzer

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_ngram_analyzer": {
          "tokenizer": "my_ngram_tokenizer",
          "filter": "lowercase"
        }
      },
      "tokenizer": {
        "my_ngram_tokenizer": {
          "type": "nGram",
          "min_gram": "3",
          "max_gram": "50"
        }
      }
    }
  },
  "mappings": {
    "allpy": {
      "properties": {
        "id": {
          "type": "long"
        },
        "FIRSTNAME": {
          "path": "just_name",
          "type": "multi_field",
          "fields": {
            "FIRSTNAME": {
              "store": true,
              "type": "string",
              "analyzer": "my_ngram_analyzer"
            },
            "COMBOFIELDS": {
              "include_in_all": false,
              "store": true,
              "type": "string",
              "analyzer": "my_ngram_analyzer"
            }
          }
        },
        "middlename": {
          "type": "string"
        },
        "LASTNAME": {
          "path": "just_name",
          "type": "multi_field",
          "fields": {
            "COMBOFIELDS": {
              "include_in_all": false,
              "store": true,
              "type": "string",
              "analyzer": "my_ngram_analyzer"
            },
            "LASTNAME": {
              "store": true,
              "type": "string",
              "analyzer": "my_ngram_analyzer"
            }
          }
        }
      }
    }
  }
}

Data indexed

{"id":1,"FIRSTNAME":"John","middlename":"clark","LASTNAME":"Doe"}
{"id":2,"FIRSTNAME":"John","middlename":"clark","LASTNAME":"Dave"}
{"id":3,"FIRSTNAME":"Jane","middlename":"clark","LASTNAME":"Dave"}

A query_string query gives the expected result (only id = 2) for the query below:

Query 1
{
  "query": {
    "query_string": {
      "fields": [
        "FIRSTNAME",
        "LASTNAME"
      ],
      "query": "john dave",
      "default_operator": "AND"
    }
  }
}

But the same search does not work with multi_match on the combined field:

Query 2
{
  "query": {
    "filtered": {
      "query": {
        "multi_match": {
          "query": "john dave",
          "fields": [ "COMBOFIELDS" ],
          "operator": "OR",
          "minimum_should_match": "100%"
        }
      }
    }
  }
}

If I remove the analyzer, it does give me the expected result.

Explanation for Query 1
explanation: +((+FIRSTNAME:joh +FIRSTNAME:john +FIRSTNAME:ohn) | (+LASTNAME:joh +LASTNAME:john +LASTNAME:ohn)) +((+FIRSTNAME:dav +FIRSTNAME:dave +FIRSTNAME:ave) | (+LASTNAME:dav +LASTNAME:dave +LASTNAME:ave))

Explanation for Query 2
explanation: (COMBOFIELDS:joh COMBOFIELDS:john COMBOFIELDS:john COMBOFIELDS:john d COMBOFIELDS:john da COMBOFIELDS:john dav COMBOFIELDS:john dave COMBOFIELDS:ohn COMBOFIELDS:ohn COMBOFIELDS:ohn d COMBOFIELDS:ohn da COMBOFIELDS:ohn dav COMBOFIELDS:ohn dave COMBOFIELDS:hn COMBOFIELDS:hn d COMBOFIELDS:hn da COMBOFIELDS:hn dav COMBOFIELDS:hn dave COMBOFIELDS:n d COMBOFIELDS:n da COMBOFIELDS:n dav COMBOFIELDS:n dave COMBOFIELDS: da COMBOFIELDS: dav COMBOFIELDS: dave COMBOFIELDS:dav COMBOFIELDS:dave COMBOFIELDS:ave)~28
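The difference between the two explanations can be reproduced outside Elasticsearch. The sketch below is only a simplified emulation of an n-gram tokenizer with min_gram 3 and max_gram 50 plus a lowercase filter (the `ngrams` helper is hypothetical, not Elasticsearch code). It assumes, as the explanations suggest, that query_string splits the input on whitespace before analysis, while multi_match hands the whole string, space included, to the analyzer.

```python
def ngrams(text, min_gram=3, max_gram=50):
    """Return every lowercase substring of text with length in [min_gram, max_gram]."""
    text = text.lower()
    grams = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size > len(text):
                break
            grams.append(text[start:start + size])
    return grams

# Query 1 (assumed behaviour): each whitespace-separated word is analyzed alone.
per_word = [ngrams(word) for word in "john dave".split()]

# Query 2 (assumed behaviour): the whole string is analyzed, so grams can span the space.
whole = ngrams("john dave")

# Document 2 stores "John" and "Dave" in COMBOFIELDS as two separate values,
# so no indexed gram contains a space.
indexed = set(ngrams("John")) | set(ngrams("Dave"))
missing = [gram for gram in whole if gram not in indexed]

print(per_word)      # grams per word, as in the Query 1 explanation
print(len(whole))    # number of clauses, as in the "~28" of the Query 2 explanation
print(len(missing))  # grams that document 2 can never match
```

Under these assumptions, "minimum_should_match" : "100%" requires all 28 grams to match, but the grams that cross the space (such as "hn d") are never in the index, so Query 2 cannot match document 2.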

What's the best way to design this query to get the expected results?
Either by using the individual field names, "fields" : [ "FIRSTNAME", "LASTNAME" ], or
by using the multi-field name, "fields" : [ "COMBOFIELDS" ]?

Our requirement is not just for full words, i.e. the search string "john dave" should work the same as "joh dave" or "john dav".
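To make the partial-word requirement concrete, here is a small sketch of the matching behaviour being asked for. It is an emulation in plain Python, not Elasticsearch: `word_ngrams` and `search` are hypothetical helpers. It assumes the search text is split on whitespace before the grams are generated (something the n-gram tokenizer's `token_chars` setting may provide, if the version in use supports it) and that every query gram must be present in the document (AND semantics).

```python
def word_ngrams(text, min_gram=3, max_gram=50):
    """Split on whitespace, then collect the lowercase n-grams of each word."""
    grams = set()
    for word in text.lower().split():
        for start in range(len(word)):
            last_end = min(len(word), start + max_gram)
            for end in range(start + min_gram, last_end + 1):
                grams.add(word[start:end])
    return grams

# The three indexed documents: (FIRSTNAME, LASTNAME) per id.
docs = {1: ("John", "Doe"), 2: ("John", "Dave"), 3: ("Jane", "Dave")}

# Emulated COMBOFIELDS content: grams of both names, never spanning the space.
index = {doc_id: word_ngrams(" ".join(names)) for doc_id, names in docs.items()}

def search(query):
    """Return ids of documents that contain every gram of the query."""
    wanted = word_ngrams(query)
    return [doc_id for doc_id, grams in index.items() if wanted <= grams]

for query in ("john dave", "joh dave", "john dav"):
    print(query, search(query))
```

In this emulation all three search strings select only document 2, which is the behaviour the requirement describes.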

Thanks much

