Hi all,
I have data indexed with different analyzers, as shown in the following mapping:
PUT ymme/ymme_type/_mappings
{
"ymme_type": {
"_all": {
"auto_boost": true,
"index_analyzer": "wordAnalyzer",
"search_analyzer": "whitespace_analyzer"
},
"properties": {
"Engine": {
"type": "string",
"index": "not_analyzed"
},
"EngineCode": {
"type": "string",
"include_in_all": false
},
"Make": {
"type": "string",
"boost": 3,
"index": "not_analyzed",
"norms": {
"enabled": true
}
},
"MakeCode": {
"type": "string",
"include_in_all": false
},
"Model": {
"type": "string",
"boost": 2,
"index": "not_analyzed",
"norms": {
"enabled": true
}
},
"ModelCode": {
"type": "string",
"include_in_all": false
},
"ShortYear": {
"type": "string",
"boost": 4,
"index": "not_analyzed",
"norms": {
"enabled": true
}
},
"Year": {
"type": "string",
"boost": 5,
"index": "not_analyzed",
"norms": {
"enabled": true
},
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
},
"YearCode": {
"type": "string",
"include_in_all": false
}
}
}
}
The ShortYear field holds the last two digits of Year, so that typing only the last two digits returns the records for that year.
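A minimal sketch of that derivation, assuming Year is stored as a four-digit string. The helper name `short_year` and the sample document are hypothetical and only illustrate the idea:

```python
def short_year(year: str) -> str:
    """Return the last two characters of a year string, e.g. the '12' in '2012'."""
    return year[-2:]


# Hypothetical document; field names follow the mapping above.
doc = {"Year": "2012", "Make": "CHEVROLET", "Model": "CAMARO"}
doc["ShortYear"] = short_year(doc["Year"])
```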
I search on _all because I need to search across multiple fields, and those fields may contain special characters. Because QueryString does not support special characters, I use a match query instead, and so I set the analyzers on the _all field.
I set the boost values so that certain fields are weighted more heavily than the others.
My analyzers are defined in the settings as follows:
"analysis": {
"analyzer": {
"analyzer_startswith": {
"type": "custom",
"filter": "lowercase",
"tokenizer": "keyword"
},
"whitespace_analyzer": {
"type": "custom",
"filter": [
"lowercase",
"asciifolding"
],
"tokenizer": "whitespace"
},
"wordAnalyzer": {
"type": "custom",
"filter": [
"lowercase",
"asciifolding",
"nGram_filter"
],
"tokenizer": "whitespace"
}
},
"filter": {
"nGram_filter": {
"max_gram": "20",
"min_gram": "1",
"type": "nGram",
"token_chars": [
"letter",
"punctuation",
"symbol",
"digit"
]
}
}
}
I am using an n-gram filter because the search terms need not be typed in order and may match any of the fields.
The query I am using is shown below:
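To make the effect of min_gram 1 and max_gram 20 concrete, here is a rough Python emulation of what wordAnalyzer would put into the index for one value. It is only a sketch: the function names are hypothetical, asciifolding is left out, and the order in which the real filter emits grams may differ.

```python
def ngrams(token, min_gram=1, max_gram=20):
    """Return every substring of token whose length is in [min_gram, max_gram]."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(token):
                break
            grams.append(token[start:end])
    return grams


def analyze(text):
    """Approximate wordAnalyzer: split on whitespace, lowercase, then n-gram."""
    terms = []
    for token in text.lower().split():
        terms.extend(ngrams(token))
    return terms


year_terms = ngrams("2012")
make_terms = analyze("CHEVROLET CAMARO")
```

Under these assumptions a single value such as "2012" is indexed as many short terms ("2", "20", "201", "2012", "0", "01", and so on), which is what allows partial input to match.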
GET testymme/ymme_type/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"_all":
{
"query": "2012",
"operator": "and"
}
}
}
]
}
}
}
I am also sorting the results using the Java API. My problem is that the results I get are not relevant in the search-as-you-type sense. And when I add fuzziness to the query, the boosting no longer works as intended and I get different data.
My sample data is:
"2012", "AM GENERAL-VPG", "MV-1 V8-281", "4.6L SOHC"
"2012", "CHEVROLET", "CAMARO", "All Engine"
"2012", "CHEVROLET", "CAMARO", "v6-3564 3.6L", "DOHC"
"2012", "CHEVROLET", "CAMARO", "V8-376", "4.6L"
"2012", "LAMBORGHINI", "AVENTADOR", "12-654 6.5L", "DOHC"
"2012", "LAMBORGHINI", "GALLARDO", "10-520", "DOHC"
Each record is a combination of Year, Make, Model, and Engine. I have years from 1962 to 2015, with different makes for each year, models for each make, and engines for each model.
When I search with the above query, I get the 2012 data as expected, but when I use the following query:
"query": {
"bool": {
"must": [
{
"match": {
"_all":
{
"query": "2012",
"operator": "and",
"fuzziness": 1,
"prefix_length": 1
}
}
}
]
}
}
I get documents from other years, not just 2012. I don't know what is happening. Can anyone help me resolve this issue?
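One possible way to see what fuzziness 1 with prefix_length 1 allows is the sketch below. It assumes the _all field contains n-grams of every indexed year, uses plain Levenshtein distance as a simplification of the real fuzzy matching, and all names and the list of indexed terms are hypothetical.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def fuzzy_matches(query, term, fuzziness=1, prefix_length=1):
    """True if term shares the required prefix and is within the edit distance."""
    if term[:prefix_length] != query[:prefix_length]:
        return False
    return edit_distance(query, term) <= fuzziness


# Hypothetical terms that could exist in _all after n-gramming several years.
indexed_terms = ["2012", "2011", "2013", "201", "1962", "camaro"]
hits = [t for t in indexed_terms if fuzzy_matches("2012", t)]
```

In this sketch the terms "2011", "2013", and the gram "201" all count as matches for "2012", so documents from other years would qualify as well.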