Unable to get proper SAYT search results with multi-field, multi-word data


(anusha) #1

Hi,
I would like to get proper SAYT (Search As You Type) results, meaning:

  1. When the user types a letter, the results should contain the values that start with that letter (I have not been able to achieve this).
  2. The SAYT results should be in sorted order (I have covered this part).

My intention is to search across multiple fields, and those fields contain multiple words.

  1. Initially I preferred a query_string query, since it can search multiple fields, lets us give priorities (boosts) to those fields, and supports 'AND'/'OR' operators.

  2. But it failed on special characters such as '/', ',', '.', '(' and ')'. To use these characters in a query string we have to escape them, and even then the results were not accurate.

  3. Since match queries accept any kind of character, I decided to use a match query instead, with the following analyzers so that it works in every situation.
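For reference, the escaping mentioned in point 2 can be sketched roughly as below. This is a minimal illustration in plain Python, not my actual code: the helper name `escape_query_string` is hypothetical, and the character set is my reading of the reserved characters for query_string (with `<` and `>` removed because they cannot be escaped). Note that ',' and '.' are not in that set.

```python
import re

# Characters treated as reserved by the query_string syntax (assumption:
# taken from the documented reserved-character list). Each one is prefixed
# with a backslash so the query parser reads it literally.
_RESERVED = re.compile(r'([+\-=!(){}\[\]^"~*?:\\/|&])')


def escape_query_string(text):
    """Return text with query_string reserved characters backslash-escaped."""
    # "<" and ">" cannot be escaped, so they are dropped.
    text = text.replace("<", "").replace(">", "")
    return _RESERVED.sub(r"\\\1", text)


print(escape_query_string("2.0L (V6)/Turbo"))  # prints: 2.0L \(V6\)\/Turbo
```

Escaping only gets the characters past the query parser. Whether they survive as searchable tokens still depends on the analyzer of the field, which may be why escaping alone did not give accurate results.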

Here are my settings:

"analysis": {
  "analyzer": {
    "analyzer_startswith": {
      "type": "custom",
      "filter": "lowercase",
      "tokenizer": "keyword"
    },
    "whitespace_analyzer": {
      "type": "custom",
      "filter": [
        "lowercase",
        "asciifolding"
      ],
      "tokenizer": "whitespace"
    },
    "wordAnalyzer": {
      "type": "custom",
      "filter": [
        "lowercase",
        "asciifolding",
        "nGram_filter"
      ],
      "tokenizer": "whitespace"
    }
  },
  "filter": {
    "nGram_filter": {
      "max_gram": "20",
      "min_gram": "1",
      "type": "nGram",
      "token_chars": [
        "letter",
        "punctuation",
        "symbol",
        "digit"
      ]
    }
  }
}
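To make it easier to see what the wordAnalyzer in these settings puts into the index, here is a rough emulation in plain Python. It is only a sketch: it mimics the whitespace tokenizer, lowercasing and an n-gram filter with min_gram 1 and max_gram 20, leaves out asciifolding, and the function names are made up for illustration.

```python
def ngrams(token, min_gram=1, max_gram=20):
    """Every substring of length min_gram..max_gram, from every start offset."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size > len(token):
                break
            grams.append(token[start:start + size])
    return grams


def index_terms(text):
    """Whitespace-split, lowercase, then n-gram each token (asciifolding omitted)."""
    terms = set()
    for token in text.lower().split():
        terms.update(ngrams(token))
    return terms


print(ngrams("audi"))
# ['a', 'au', 'aud', 'audi', 'u', 'ud', 'udi', 'd', 'di', 'i']

print("a" in index_terms("2000 Honda"))  # True: the 'a' comes from the end of "honda"
```

Because grams are taken from every offset, a single typed letter matches any value that contains it anywhere, not only values that start with it.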

Here are my mappings:

"ymme": {
  "mappings": {
    "ymme_type": {
      "_all": {
        "auto_boost": true,
        "index_analyzer": "wordAnalyzer",
        "search_analyzer": "whitespace_analyzer"
      },
      "properties": {
        "Engine": {
          "type": "string",
          "index": "not_analyzed"
        },
        "EngineCode": {
          "type": "string",
          "include_in_all": false
        },
        "Make": {
          "type": "string",
          "boost": 3,
          "index": "not_analyzed",
          "norms": {
            "enabled": true
          }
        },
        "MakeCode": {
          "type": "string",
          "include_in_all": false
        },
        "Model": {
          "type": "string",
          "boost": 2,
          "index": "not_analyzed",
          "norms": {
            "enabled": true
          }
        },
        "ModelCode": {
          "type": "string",
          "include_in_all": false
        },
        "ShortYear": {
          "type": "string",
          "boost": 4,
          "index": "not_analyzed",
          "norms": {
            "enabled": true
          }
        },
        "Year": {
          "type": "string",
          "boost": 5,
          "index": "not_analyzed",
          "norms": {
            "enabled": true
          },
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        },
        "YearCode": {
          "type": "string",
          "include_in_all": false
        }
      }
    }
  }
}

I then used the following query:

GET ymme/ymme_type/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "_all": {
              "query": "2",
              "operator": "and"
            }
          }
        }
      ]
    }
  }
}

Everything works fine up to this point.
The problem is with matching the start of words.
Take a query such as "query": "2000 Audi":

  1. Here '2000' belongs to Year and 'Audi' belongs to Make. When I type the letter 'A', the results are not limited to values starting with 'a'; the match also happens in the middle of words. I know this is because of my analyzers.
  2. If I use match_phrase_prefix without these analyzers, I am unable to get any data with query: "2000 a". I have to give the terms in indexing order, like query: " "Year" "Model" "ShortYear" "Engine" "Make" ". I have to search in this order; if I miss the order, I am unable to retrieve the data with match_phrase_prefix.
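The order sensitivity in point 2 can be illustrated with a rough model of match_phrase_prefix in plain Python. This is a simplification under stated assumptions: all values sit in one field in the order Year, Model, ShortYear, Engine, Make, the field is split on whitespace and lowercased, and details such as slop and max_expansions are ignored. The sample document and the function name are hypothetical.

```python
def phrase_prefix_match(field_text, query):
    """Rough model: every query word except the last must match a whole token,
    consecutively and in order; the last word only has to be a prefix of the
    token that follows."""
    doc = field_text.lower().split()
    words = query.lower().split()
    if not words:
        return False
    for i in range(len(doc) - len(words) + 1):
        window = doc[i:i + len(words)]
        if window[:-1] == words[:-1] and window[-1].startswith(words[-1]):
            return True
    return False


# Hypothetical document: Year Model ShortYear Engine Make
doc = "2000 TT 00 1.8L Audi"

print(phrase_prefix_match(doc, "2000 t"))    # True: "tt" directly follows "2000"
print(phrase_prefix_match(doc, "1.8l au"))   # True: "audi" directly follows "1.8l"
print(phrase_prefix_match(doc, "2000 a"))    # False: "audi" does not follow "2000"
```

In this model "2000 a" fails only because Make is not adjacent to Year in the stored order, which matches the behaviour I am seeing.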

I have tried all of the analyzers listed in my settings, but none of them meets my requirement.

Can anyone suggest a solution that matches values from the start of each word and searches multiple words across multiple fields, while preserving special characters?


(system) #2