Hi everyone! I'm new to Elasticsearch.
I have an index with settings like this:
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "article_analyzer": {
            "filter": [
              "synonym",
              "stop"
            ],
            "tokenizer": "whitespace"
          }
        },
        "filter": {
          "synonym": {
            "type": "synonym",
            "synonyms_path": "/etc/elasticsearch/synonyms.txt"
          },
          "stop": {
            "ignore_case": "true",
            "type": "stop",
            "stopwords_path": "/etc/elasticsearch/stopwords.txt"
          }
        }
      }
    }
  }
}
The synonyms file /etc/elasticsearch/synonyms.txt contains a mapping like this:
hiv => Human immunodeficiency virus
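To make it clearer what this rule does to the query text, here is a rough sketch in plain Python. It only imitates a whitespace tokenizer followed by an explicit "a => b c" synonym rule; it is not Elasticsearch code, the function names `parse_rules` and `analyze` are made up for illustration, and the stop filter is left out because the contents of stopwords.txt are not shown here.

```python
# Simplified imitation of: whitespace tokenizer + explicit synonym rules.
# Assumption: only single "lhs => rhs" rules are handled, with no
# comma-separated alternatives and no stop filter.

def parse_rules(lines):
    """Parse 'lhs => rhs' lines into {tuple(lhs tokens): [rhs tokens]}."""
    rules = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        lhs, rhs = line.split("=>")
        rules[tuple(lhs.split())] = rhs.split()
    return rules


def analyze(text, rules):
    """Split on whitespace (no lowercasing), then rewrite matching tokens."""
    tokens = text.split()
    out = []
    i = 0
    while i < len(tokens):
        for lhs, rhs in rules.items():
            if tuple(tokens[i:i + len(lhs)]) == lhs:
                out.extend(rhs)   # replace the left-hand side entirely
                i += len(lhs)
                break
        else:
            out.append(tokens[i])  # no rule matched, keep the token
            i += 1
    return out


RULES = parse_rules(["hiv => Human immunodeficiency virus"])
print(analyze("hiv", RULES))  # ['Human', 'immunodeficiency', 'virus']
```

In this sketch the comparison is case-sensitive, so "HIV" would pass through unchanged; whether that matches the real analyzer depends on the rest of the filter chain.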
My current query looks like this:
{
  "query": {
    "multi_match": {
      "query": "hiv",
      "analyzer": "article_analyzer",
      "fields": [
        "content",
        "title",
        "slug"
      ]
    }
  }
}
The result looks something like this:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 14.4982815,
    "hits": [
      {
        "_index": "article",
        "_type": "article",
        "_id": "732",
        "_score": 14.4982815,
        "_source": {
          "slug": "some-slug",
          "title": "Some title",
          "content": "<p>Human immunodeficiency virus is lorem ipsum dolor sit amet</p>\n\n<p>some lorem ipsum dolor sit amet Human immunodeficiency virus</p>\n",
          "url": "https://example.tld/some-slug"
        }
      },
      {
        "_index": "article",
        "_type": "article",
        "_id": "704",
        "_score": 13.077797,
        "_source": {
          "slug": "some-some-slug",
          "title": "Some some slug",
          "content": "<p>Lorem ipsum dolor sit amet.</p>\n\n<p>Aliquam id purus mi. Suspendisse vitae aliquet velit.</p>\n\n<p>Human immunodeficiency virus is something.</p>",
          "url": "https://example.tld/some-some-slug"
        }
      }
    ]
  }
}
The result I want is only the document with _id 732, because the synonym matches in its first and second <p> tags.
The document with _id 704 also matches the synonym, but the match only starts in the third <p> tag, so it shouldn't appear in the results.
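To pin down the rule I have in mind, here is a small sketch in plain Python that applies it outside Elasticsearch. It assumes the rule is "the expanded phrase occurs somewhere within the first two <p> elements" and that matching is case-insensitive; both are my own assumptions, and `leading_paragraphs` and `matches_early` are hypothetical helper names, not part of any library.

```python
import re
from html import unescape

# Naive patterns: good enough for flat <p>...</p> content like the samples
# above, not a general HTML parser.
P_TAG = re.compile(r"<p\b[^>]*>(.*?)</p>", re.IGNORECASE | re.DOTALL)
ANY_TAG = re.compile(r"<[^>]+>")


def leading_paragraphs(html, limit=2):
    """Return the plain text of the first `limit` <p> elements."""
    bodies = P_TAG.findall(html)[:limit]
    return [unescape(ANY_TAG.sub("", body)) for body in bodies]


def matches_early(html, phrase, limit=2):
    """True if `phrase` occurs in any of the first `limit` paragraphs."""
    needle = phrase.lower()
    return any(needle in text.lower() for text in leading_paragraphs(html, limit))


doc_732 = ("<p>Human immunodeficiency virus is lorem ipsum dolor sit amet</p>\n\n"
           "<p>some lorem ipsum dolor sit amet Human immunodeficiency virus</p>\n")
doc_704 = ("<p>Lorem ipsum dolor sit amet.</p>\n\n"
           "<p>Aliquam id purus mi. Suspendisse vitae aliquet velit.</p>\n\n"
           "<p>Human immunodeficiency virus is something.</p>")

print(matches_early(doc_732, "Human immunodeficiency virus"))  # True
print(matches_early(doc_704, "Human immunodeficiency virus"))  # False
```

This is only meant to describe the expected outcome for the two sample documents, not how the search itself should be implemented.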
I know there are many ways to solve this problem, but I'd like to know whether it can be solved the Elasticsearch way, without changing the document structure.