Hi,
I have an index with millions of documents from all kinds of media in all kinds of languages.
My mapping defines several language-analyzed sub-fields for both the title and the body of an article; the default field uses the standard analyzer.
The problem is that when I use the Bulgarian-analyzed body field in an MLT (more_like_this) query, I get zero hits.
Here is my query:
GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "more_like_this": {
            "fields": [
              "title.bg",
              "body.bg"
            ],
            "like": [
              {
                "_id": 167594917 // article in Bulgarian about sport event
              }
            ],
            "min_term_freq": 10,
            "max_query_terms": 50
          }
        }
      ]
    }
  }
}
When I use the default, standard-analyzed body field instead, I get millions of results.
I also tested an MLT query for an article in Romanian using title.ro and body.ro, and there I do get results.
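One way to compare what the two body fields feed into the query might be the term vectors API. This is only a minimal sketch: it assumes the standard-analyzed parent field is named body (inferred from body.bg), reuses the document ID from the query above, and sets the filter to mirror the min_term_freq of 10 from the MLT query.

```console
# Sketch: the field name "body" is an assumption inferred from "body.bg".
# The filter keeps only terms occurring at least 10 times in this document.
GET my_index/_termvectors/167594917
{
  "fields": ["body", "body.bg"],
  "positions": false,
  "offsets": false,
  "term_statistics": true,
  "filter": {
    "min_term_freq": 10
  }
}
```

If the response lists terms for body but none for body.bg, that would suggest that no term left after stop-word removal and stemming reaches the required frequency in this article.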
My custom bg analyzer is defined like this:
"bulgarian": {
  "filter": [
    "lowercase",
    "stop_bg",
    "bulgarian_stemmer"
  ],
  "char_filter": [
    "html_strip"
  ],
  "type": "custom",
  "tokenizer": "standard"
}
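For context, here is a sketch of how an analyzer like this usually sits in the index settings. The definitions of stop_bg and bulgarian_stemmer are not shown above, so the ones here are assumptions built on Elasticsearch's built-in stop and stemmer token filters, and my_index_sketch is a hypothetical index name.

```console
# Sketch only: the two filter definitions are assumed for illustration,
# and "my_index_sketch" is a hypothetical index name.
PUT my_index_sketch
{
  "settings": {
    "analysis": {
      "filter": {
        "stop_bg": {
          "type": "stop",
          "stopwords": "_bulgarian_"
        },
        "bulgarian_stemmer": {
          "type": "stemmer",
          "language": "bulgarian"
        }
      },
      "analyzer": {
        "bulgarian": {
          "type": "custom",
          "char_filter": ["html_strip"],
          "tokenizer": "standard",
          "filter": ["lowercase", "stop_bg", "bulgarian_stemmer"]
        }
      }
    }
  }
}
```

With settings along these lines, a request to the _analyze API using "analyzer": "bulgarian" and a sample of the article text should show which tokens actually end up in body.bg.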