Match query with synonym token filter where synonym available on first & second paragraph only

Yulius_Ardian_Febria · May 28, 2020, 10:53am

Hi everyone! I'm new to Elasticsearch.

I have an index with settings like this:

{
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "article_analyzer": {
                        "filter": [
                            "synonym",
                            "stop"
                        ],
                        "tokenizer": "whitespace"
                    }
                },
                "filter": {
                    "synonym": {
                        "type": "synonym",
                        "synonyms_path": "/etc/elasticsearch/synonyms.txt"
                    },
                    "stop": {
                        "ignore_case": "true",
                        "type": "stop",
                        "stopwords_path": "/etc/elasticsearch/stopwords.txt"
                    }
                }
            }
        }
    }
}

The synonym mapping in /etc/elasticsearch/synonyms.txt looks something like this:

hiv => Human immunodeficiency virus

My current query is:

{
    "query": {
        "multi_match": {
            "query": "hiv",
            "analyzer": "article_analyzer",
            "fields": [
                "content",
                "title",
                "slug"
            ]
        }
    }
}

The result looks something like this:

{
    "took": 3,
    "timed_out": false,
    "_shards": {
        "total": 2,
        "successful": 2,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 5,
        "max_score": 14.4982815,
        "hits": [
            {
                "_index": "article",
                "_type": "article",
                "_id": "732",
                "_score": 14.4982815,
                "_source": {
                    "slug": "some-slug",
                    "title": "Some title",
                    "content": "<p>Human immunodeficiency virus is lorem ipsum dolor sit amet</p>\n\n<p>some lorem ipsum dolor sit amet Human immunodeficiency virus</p>\n",
                    "url": "https://example.tld/some-slug"
                }
            },
            {
                "_index": "article",
                "_type": "article",
                "_id": "704",
                "_score": 13.077797,
                "_source": {
                    "slug": "some-some-slug",
                    "title": "Some some slug",
                    "content": "<p>Lorem ipsum dolor sit amet.</p>\n\n<p>Aliquam id purus mi. Suspendisse vitae aliquet velit.</p>\n\n<p>Human immunodeficiency virus is something.</p>",
                    "url": "https://example.tld/some-some-slug"
                }
            }
        ]
    }
}

The only result I want is the document with _id 732, because the synonym matches in its first and second <p> tags.
The document with _id 704 also matches the synonym, but only from its third <p> tag onwards, so it shouldn't appear in the result.

I know there are many ways to solve this problem, but I'd like to know whether it can be solved within Elasticsearch itself, without changing the document structure.

cbuescher · May 29, 2020, 8:14am

If the paragraphs in your input document have some specific meaning (e.g. you want some to match a search but not others) you need to split the document and index it into different fields. That way you can query only fields that you want to match.
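
As a rough illustration of that split (a minimal sketch, not an official recipe): the Python below pulls the <p> blocks out of the HTML before indexing and stores the first two paragraphs separately from the rest. The field names content_lead and content_rest and all three helper functions are made up for this example, and the regex assumes simple, non-nested <p> tags like the ones in the sample documents. Existing documents would need to be reindexed for the new fields to be populated.

```python
import re

# Matches each <p>...</p> block; DOTALL lets a paragraph span several lines.
# Assumption: paragraphs are plain, non-nested <p> tags as in the sample docs.
P_TAG = re.compile(r"<p\b[^>]*>(.*?)</p>", re.IGNORECASE | re.DOTALL)


def split_lead(content, lead_count=2):
    """Return (lead, rest): the first `lead_count` paragraphs and the remainder."""
    paragraphs = [p.strip() for p in P_TAG.findall(content)]
    lead = " ".join(paragraphs[:lead_count])
    rest = " ".join(paragraphs[lead_count:])
    return lead, rest


def to_indexed_doc(source):
    """Copy the article and add the two hypothetical fields before indexing."""
    lead, rest = split_lead(source["content"])
    doc = dict(source)
    doc["content_lead"] = lead  # hypothetical field: paragraphs 1 and 2
    doc["content_rest"] = rest  # hypothetical field: paragraph 3 onwards
    return doc


def lead_only_query(text):
    """The multi_match from the question, with content swapped for content_lead."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "analyzer": "article_analyzer",
                "fields": ["content_lead", "title", "slug"],
            }
        }
    }
```

With this split, the third paragraph of document 704 ends up in content_rest only, so a search over content_lead, title and slug no longer sees the synonym match there, while document 732 still matches through content_lead.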

Yulius_Ardian_Febria · May 29, 2020, 10:57am

Exactly, I had already thought about that.

So the only way to solve this problem is to split the content into different fields.

Hmm, okay, thanks for your reply.

