Exact search getting less precedence than phonetic?

I have an Elasticsearch index and am using the following query:

{
    "_source": [
        "title",
        "content"
    ],
    "size": 15,
    "from": 0,
    "query": {
        "bool": {
            "must": {
                "multi_match": {
                    "query": "{{query}}",
                    "fields": [
                        "title",
                        "content"
                    ],
                    "operator": "or"
                }
            },
            "should": [
                {
                    "multi_match": {
                        "query": "{{query}}",
                        "fields": [
                            "title.standard^16",
                            "content.standard^2"
                        ],
                        "operator": "and"
                    }
                },
                {
                    "match_phrase": {
                        "content.standard": {
                            "query": "{{query}}",
                            "_name": "Phrase on title",
                            "boost": 1000
                        }
                    }
                }
            ]
        }
    },
    "highlight": {
        "fields": {
            "content": {}
        },
        "fragment_size": 100
    }
}

Here is the mapping I set:

{
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "my_analyzer": {
                        "tokenizer": "standard",
                        "filter": [
                            "lowercase",
                            "my_metaphone"
                        ]
                    }
                },
                "filter": {
                    "my_metaphone": {
                        "type": "phonetic",
                        "encoder": "metaphone",
                        "replace": true
                    }
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "term_vector": "with_positions_offsets",
                "analyzer": "my_analyzer",
                "fields": {
                    "standard": {
                        "type": "text"
                    }, 
                    "stemmer": {
                        "type": "text", 
                        "analyzer": "english"  
                    }
                }
            },
            "content": {
                "type": "text",
                "term_vector": "with_positions_offsets",
                "analyzer": "my_analyzer",
                "fields": {
                    "standard": {
                        "type": "text"
                    }, 
                    "stemmer": {
                        "type": "text", 
                        "analyzer": "english"  
                    }
                }
            }
        }
    }
}

Here is my logic with the query:

  1. It gives the highest precedence to a phrase match, if one exists.

  2. If not, it uses the standard analyzer (that is, the text as is) and gives that the next-highest precedence.

  3. If nothing else matches, it falls back to the phonetic analyzer, which gets the lowest precedence.

But there is obviously a fault in this, as it seems to give higher precedence to the phonetic analyzer than to the standard analyzer or the phrase match. For example, if I search for "Person of Indian Origin", the top results highlight "Pursuant" and "pursuing", and very few results contain "person of Indian origin", although I know a large number of them exist. How do I solve this?
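
One way to check whether the phonetic analyzer is collapsing these words into the same token is to send them through the `_analyze` API of the index (`GET /<your-index>/_analyze`, where the index name is a placeholder), using the `my_analyzer` defined in the mapping above. A minimal sketch of the request body, assuming the analysis-phonetic plugin is installed:

```json
{
    "analyzer": "my_analyzer",
    "text": "Person Pursuant pursuing"
}
```

If the returned tokens are identical for all three words, the phonetic fields cannot tell them apart, and highlights requested on `content` (which uses `my_analyzer`) will mark all of them. Adding `"explain": true` to the search request shows how much each clause contributes to a hit's score.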

Here is some sample data to test it out - https://pastebin.com/mzfwz0b3

You can combine multiple searches within a bool query in should clauses.
I have an example here which might help:
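
As a rough sketch of the idea (not the linked example itself), assuming the field names from the mapping in the question; the boost values are illustrative and not tuned:

```json
{
    "query": {
        "bool": {
            "should": [
                {
                    "match_phrase": {
                        "content.standard": {
                            "query": "{{query}}",
                            "boost": 100
                        }
                    }
                },
                {
                    "multi_match": {
                        "query": "{{query}}",
                        "fields": ["title.standard", "content.standard"],
                        "operator": "and",
                        "boost": 10
                    }
                },
                {
                    "multi_match": {
                        "query": "{{query}}",
                        "fields": ["title", "content"],
                        "operator": "or"
                    }
                }
            ],
            "minimum_should_match": 1
        }
    }
}
```

Here the phonetic fields are just one unboosted `should` clause rather than a `must`, so a document can match on any of the three tiers, and the phrase and exact-term clauses add progressively more to the score.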

Hi @dadoonet, I did try this out but did not get the desired results. Having a bunch of should statements and then boosting them did not work well.

Could you provide a full recreation script as described in About the Elasticsearch category? It will help us better understand what you are doing. Please try to keep the example as simple as possible.

A full reproduction script will help readers understand, reproduce and, if needed, fix your problem. It will also most likely help you get a faster answer.
