Fuzziness query and compound names

maryline · July 13, 2017, 9:19am

Hello,
I have a problem with a fuzzy query when searching for compound names.
ES version: 5.4.1
Here is the scenario.
Given a document containing "dupont",
I would like to find it by searching for "dupond martin".

    # Create the template with mapping and analyzer
    PUT _template/test-template
    {
      "template": "test-*",
      "settings": {
        "number_of_shards": 5,
        "number_of_replicas": 2,
        "analysis": {
          "analyzer": {
            "test-analyzer": {
              "type": "custom",
              "tokenizer": "keyword",
              "char_filter": ["mapping_test"],
              "filter": [
                "trim",
                "lowercase",
                "asciifolding",
                "pattern_test",
                "length-test",
                "stop_test"
              ]
            }
          },
          "char_filter": {
            "mapping_test": {
              "type": "pattern_replace",
              "pattern": ",|-|_|'",
              "replacement": " "
            }
          },
          "filter": {
            "length-test": {
              "type": "length",
              "min": 2
            },
            "stop_test": {
              "type": "stop",
              "stopwords": [
                "dit", "da", "de", "del", "des", "du", "di", "dos", "alias",
                "le", "el", "la", "van", "der", "dite", "ou", "st", "ste"
              ]
            },
            "pattern_test": {
              "type": "pattern_capture",
              "preserve_original": true,
              "patterns": [
                "(\\p{Ll}+|\\p{Lu}\\p{Ll}+|\\p{Lu}+)"
              ]
            }
          }
        }
      },
      "mappings": {
        "personne": {
          "date_detection": false,
          "numeric_detection": false,
          "_all": {
            "enabled": false
          },
          "properties": {
            "nom": {
              "type": "text",
              "analyzer": "test-analyzer",
              "norms": false,
              "index_options": "docs",
              "fielddata": true
            }
          }
        }
      }
    }

    # Create the index
    PUT test-nom

    # Add a document
    PUT test-nom/personne/1
    {
      "nom": "dupont"
    }

    # Search for a compound name with a typo in the name: "d" instead of "t"
    GET test-nom/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "nom": {
                  "query": " dupond martin",
                  "analyzer": "test-analyzer",
                  "fuzziness": "2",
                  "operator": "and",
                  "prefix_length": 0,
                  "max_expansions": 1000,
                  "fuzzy_transpositions": false,
                  "lenient": false,
                  "zero_terms_query": "NONE",
                  "boost": 1.0
                }
              }
            }
          ]
        }
      }
    }

    # Check how the analyzer tokenizes the query text
    GET test-nom/_analyze
    {
      "analyzer": "test-analyzer",
      "text": "dupond martin"
    }


The _analyze output shows that "dupond martin" is analyzed into the tokens:
"dupond"
"martin"
"dupond martin"

I don't understand why "dupont" does not match "dupond" with a fuzziness of 2.
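
As a sanity check on the expectation above, the edit distances can be computed by hand. Below is a minimal Python sketch, not Elasticsearch's own implementation: `levenshtein` is an illustrative helper written for this post, and it computes the plain Levenshtein distance (no transpositions), which is the metric assumed here because the query sets `fuzzy_transpositions` to false.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic Levenshtein distance: insertions, deletions, substitutions."""
    # previous[j] = distance between the prefix of a processed so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


indexed = "dupont"
# Tokens reported by _analyze for the query text in this post
for token in ["dupond", "martin", "dupond martin"]:
    print(token, levenshtein(indexed, token))
```

Of the three query tokens, only "dupond" is within two edits of the indexed term "dupont" (a single substitution), so a match on that token alone would be consistent with a fuzziness of 2.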

