Custom search analyzer problem


(ciro) #1

Hi, I'm using ES 5.0 RC1 and this is my use case:

I have two custom analyzers that I want to use only at search time, without setting them on the fields, so:

mappings:

{
	"settings": {
		"number_of_shards": 5,
		"number_of_replicas": 1,
		"index": {
			"refresh_interval": "1s",
			"mapper.dynamic": false,
			"analysis": {
				"filter": {
					"italian_elision": {
						"type": "elision",
						"articles": ["c","l","all","dall","dell","nell","sull","coll","pell","gl","agl","dagl","degl","negl","sugl","un","m","t","s","v","d"]
					},
					"italian_stop": {
						"type": "stop",
						"stopwords": "_italian_"
					},
					"italian_stemmer": {
						"type": "stemmer",
						"language": "italian"
					}
				},
				"analyzer": {
					"an": {
						"type": "standard"
					},
					"it": {
						"tokenizer": "standard",
						"filter":[
							"italian_elision",
							"lowercase",
							"italian_stop",
							"italian_stemmer"
						]
					},
"es":{...},
"en":{...}
				}
			}
		}
	},
	"mappings": {
		"test": {
			"dynamic": "strict",
			"properties": {
				"iso": {
					"type": "keyword"
				},
				"title": {
					"type": "keyword",
					"fields": {
						"it": { "type": "text", "analyzer": "it"}
					}
				},
				"description": {
					"type": "keyword",
					"fields": {
						"an": { "type": "text", "analyzer": "an"}
					}
				}
			}
		}
	}
}

query:

{
  "query": {
    "bool":{"must":[
    {"match": {
      "description":{
        "query":"addetto",
        "analyzer": "it"
    }}},
    {"match": {
      "iso": "IT"
    }}
  ]}},
  "highlight": {
    "fields": {
      "iso": {},
      "title": {}
    }
  }
}

If I run the query I get no results; in the previous release this worked.

If I search on title.it it works, but in my case I want to choose the analyzer dynamically at query time.

I don't know why this no longer works for me. Am I doing something wrong?


(Christoph) #2

Hi,

there's a conceptual problem here. You can change the "search_analyzer" that is used to analyze the query itself, but the index-time analysis that all documents go through cannot be changed dynamically. If you want your query to work with different analyzers, you need to set up something like multi-fields and then switch to whichever variant you need at search time.
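To make the index-time versus search-time distinction concrete, here is a minimal Python sketch. It is a toy model, not Elasticsearch code: `keyword_analyzer`, `standard_like_analyzer`, `index_document` and `search` are hypothetical helpers, and the sample text "Addetto alle vendite" is invented for illustration. Terms are computed once per field when the document is indexed, and the search-side analyzer only changes the query terms.

```python
# Toy model of multi-fields. NOT Elasticsearch code, only the idea.
# Every name below is hypothetical and exists only for this illustration.

def keyword_analyzer(text):
    # A keyword field keeps the whole value as a single term.
    return [text]

def standard_like_analyzer(text):
    # Rough stand-in for the standard analyzer: lowercase, split on whitespace.
    return text.lower().split()

def index_document(doc_id, source, field_analyzers, index):
    # Index-time analysis: terms are computed once per field and stored.
    for field, (source_key, analyzer) in field_analyzers.items():
        for term in analyzer(source[source_key]):
            index.setdefault(field, {}).setdefault(term, set()).add(doc_id)

def search(index, field, query, search_analyzer):
    # Search-time analysis changes the query terms only, never the stored ones.
    hits = set()
    for term in search_analyzer(query):
        hits |= index.get(field, {}).get(term, set())
    return hits

# One source value indexed two ways, like a field with a sub-field.
FIELDS = {
    "description": ("description", keyword_analyzer),
    "description.an": ("description", standard_like_analyzer),
}

index = {}
index_document(1, {"description": "Addetto alle vendite"}, FIELDS, index)

# Same query, same search analyzer; only the target field differs.
keyword_hits = search(index, "description", "addetto", standard_like_analyzer)
multi_field_hits = search(index, "description.an", "addetto", standard_like_analyzer)

print(keyword_hits)      # set()
print(multi_field_hits)  # {1}
```

In this model, picking the analyzer "dynamically" amounts to picking which sub-field to query, because the stored terms of each sub-field were fixed when the document was indexed.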


(Mark Harwood) #3

Your query is targeting the untokenized keyword field "description" (why is this untokenized?)

Your search is using an analyzer that stems addetto to addett, hence no match.

I suspect you want to search a "description.it" field here.
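As a rough illustration of the mismatch, here is a small Python sketch. It is not Elasticsearch code: `toy_italian_stem` is a hypothetical stand-in for the real Italian stemmer that only strips one trailing vowel, which is enough to turn "addetto" into "addett" as described above. The other helper names are assumptions as well.

```python
# Toy illustration, NOT Elasticsearch code. All helpers are hypothetical.

def toy_italian_stem(token):
    # Crude stand-in for the Italian stemmer: drop one trailing vowel.
    if len(token) > 3 and token[-1] in "aeio":
        return token[:-1]
    return token

def it_analyzer(text):
    # Stand-in for the "it" analyzer: lowercase, split, stem.
    return [toy_italian_stem(t) for t in text.lower().split()]

def keyword_terms(text):
    # A keyword field stores the value untouched, as one term.
    return [text]

stored_value = "addetto"

keyword_field_terms = set(keyword_terms(stored_value))  # what "description" holds
it_field_terms = set(it_analyzer(stored_value))         # what "description.it" would hold
query_terms = set(it_analyzer("addetto"))               # query analyzed with "it"

matches_keyword = bool(query_terms & keyword_field_terms)
matches_it = bool(query_terms & it_field_terms)

print(query_terms)      # {'addett'}
print(matches_keyword)  # False
print(matches_it)       # True
```

The query term and the stored term only line up when both sides went through the same analysis, which is why the sub-field is the better target.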


(ciro) #4

Hi, I know what you mean, but I don't want to set the analyzer on the field. I want to define the analyzer on the index and then use it only at search time, as in this query.


(ciro) #5

I need keyword as the default because I run two basic queries against this field, one without an analyzer and one with an analyzer. Yes, I want something like "description.it". If I use another analyzer, such as my "an" (the standard analyzer), I get no results.


(Mark Harwood) #6

Here's my reproduction with some debug hints thrown in:

DELETE test

//Note added description.it field and using fielddata:true on this small index so I can debug contents.
PUT test
{
	"settings": {
		"number_of_shards": 1,
		"number_of_replicas": 0,
		"index": {
			"refresh_interval": "1s",
			"mapper.dynamic": false,
			"analysis": {
				"filter": {
					"italian_elision": {
						"type": "elision",
						"articles": ["c","l","all","dall","dell","nell","sull","coll","pell","gl","agl","dagl","degl","negl","sugl","un","m","t","s","v","d"]
					},
					"italian_stop": {
						"type": "stop",
						"stopwords": "_italian_"
					},
					"italian_stemmer": {
						"type": "stemmer",
						"language": "italian"
					}
				},
				"analyzer": {
					"an": {
						"type": "standard"
					},
					"it": {
						"tokenizer": "standard",
						"filter":[
							"italian_elision",
							"lowercase",
							"italian_stop",
							"italian_stemmer"
						]
					}
				}
			}
		}
	},
	"mappings": {
		"test": {
			"dynamic": "strict",
			"properties": {
				"iso": {
					"type": "keyword"
				},
				"title": {
					"type": "keyword",
					"fields": {
						"it": { "type": "text", "analyzer": "it"}
					}
				},
				"description": {
					"type": "keyword",
					"fields": {
						"an": { "type": "text", "analyzer": "an"},
						"it": { "type": "text", "analyzer": "it", "fielddata": true}                        
					}
				}
			}
		}
	}
}

//Index one doc	
POST test/test
{
	"iso":"IT",
	"description":"addetto"
}

//Show index contents
GET test/_search
{
	"aggs":{
		"terms":{
			"terms":{
			"field":"description.it"}
		}
	}
}

//Run search, note use of the "explain" parameter
GET test/_search
{
   "explain": true, 
  "query": {
	"bool":{"must":[
	{"match": {
	  "description.it":{
		"query":"addetto"
	}}},
	{"match": {
	  "iso": "IT"
	}}
  ]}},
  "highlight": {
	"fields": {
	  "iso": {},
	  "title": {}
	}
  }
}

(ciro) #7

Yes, that's all correct, but I don't want this:

> "description": {
> 					"type": "keyword",
> 					"fields": {
> 						"an": { "type": "text", "analyzer": "an"},
> 						"it": { "type": "text", "analyzer": "it", "fielddata": true}                        
> 					}
> 				}

I have about 25 fields in this index and about 13 analyzers, so I can't set every analyzer on every field at index time. I want to call a single analyzer only at query time. This code is from the documentation:

{
    "query": {
        "match_phrase" : {
            "message" : {
                "query" : "this is a test",
                "analyzer" : "my_analyzer"
            }
        }
    }
}

So I should be able to define the analyzer in the settings and call it when querying, but in this version of ES it doesn't work.


(Mark Harwood) #8

Sorry, I'm unclear as to what used to work and is now broken.
Can you provide the before and after commands to demonstrate the issue?


(ciro) #9

Hmm, right now I only have one server, running the latest version of Elasticsearch. I'll try to put something together for this.

