Hi,
I'm getting an unexpected match with a query_string query.
This is what the explain API returns:
{
"_index": "unit",
"_type": "unit",
"_id": "6wnIsmUBR6RhLqpGBfHA",
"matched": true,
"explanation": {
"value": 1,
"description": "sum of:",
"details": [
{
"value": 1,
"description": "*:*",
"details": []
}
]
}
}
This is the string that matches:
Technik údržby těžní jámy koordinuje práce při zajišťování báňské údržby a provozu těžní jámy.
_analyze output for this string:
technik, technika, udrzba, tezni, jama, koordinovat, prace, zajistovani, bansky, udrzba, provoz, tezni, jama
It matches with this query:
{
"query": {
"query_string": {
"analyzer": "cz",
"analyze_wildcard": true,
"query": "*daty* OR *data* OR *výpoč* OR *dat * OR *datový* OR *datové* OR *datová*"
}
}
}
_analyze output for the query terms:
datum, datum, vypoc, datum, datovy, datovy, datovy
I don't see anything in these terms that should match the string above.
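One thing that may be worth checking: explain reports `*:*`, and the fourth clause of the query, `*dat *`, contains a space. The sketch below is only a rough approximation (plain whitespace splitting in Python, not the actual query_string parser, and `split_clauses` is a hypothetical helper), but it suggests that clause might be read as `*dat` plus a bare `*`, which could be where the match-all comes from.

```python
# Rough approximation only: split on whitespace and drop boolean operators.
# This is NOT how Elasticsearch parses query_string; it just lists the clauses
# a whitespace-based split would see.
query = "*daty* OR *data* OR *výpoč* OR *dat * OR *datový* OR *datové* OR *datová*"


def split_clauses(q):
    """Return whitespace-separated terms, ignoring boolean operators."""
    operators = {"OR", "AND", "NOT"}
    return [tok for tok in q.split() if tok not in operators]


clauses = split_clauses(query)

# A clause made of nothing but '*' characters has no literal text to match on.
bare_wildcards = [c for c in clauses if c.strip("*") == ""]

print(clauses)
print(bare_wildcards)
```

If that assumption holds, removing the space (`*dat*`) or quoting the phrase would be the first thing to try.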
These are my settings and mappings:
{
"properties": {
"db_id": {
"type": "integer"
},
"characteristics": {
"type": "text",
"analyzer": "cz",
"index": true
}
},
"settings": {
"analysis": {
"analyzer": {
"cz": {
"type": "custom",
"tokenizer": "icu_tokenizer",
"filter": [
"icu_normalizer",
"cz_stop",
"cz_length",
"standard",
"lowercase",
"cz_stop",
"cs_CZ",
"icu_folding",
"cz_unique"
]
}
},
"filter": {
"cz_stop": {
"type": "stop",
"stopwords": [
"právě",
"že",
"_czech_"
],
"ignore_case": true
},
"cs_CZ": {
"type": "hunspell",
"locale": "cs_CZ",
"dedup": true,
"recursion_level": 0
},
"remove_duplicities": {
"type": "unique",
"only_on_same_position": true
},
"cz_length": {
"type": "length",
"min": 2
},
"cz_unique": {
"type": "unique",
"only_on_same_position": true
},
"cz_collation": {
"type": "icu_collation",
"language": "cs"
}
}
}
}
}