Exact search

I have a PHP script that performs multiple queries that are mostly identical except for the words being queried. They take the query words from a PHP variable and look like this:

$q = $_GET['q'];
$query = $es->search([
    'index' => 'aksjeregisteret2017',
    'size' => 20,
    'body' => [
        'query' => [
            'bool' => [
                'must' => [
                    'multi_match' => [
                        'query' => $q,
                        'type' => 'phrase_prefix',
                        'fields' => ['message', 'Navn Aksjoner', 'Orgnr', 'Selskap',
                            'Aksjeklasse', 'Fodselar/Orgnr', 'Postnr',
                            'Antall Aksjer', 'Antall Aksjer Selskap'],
                        'minimum_should_match' => '100%'
                    ]
                ]
            ]
        ]
    ]
]);

Now this works fine. It returns the results and I'm able to do whatever I want with it.

The problem is that it also returns documents that do not contain exactly what I'm searching for. For example, when searching for "Hopland AS" I find the documents containing "Hopland AS", but I also find documents that do not contain it and instead contain values such as "Hopland Geir Atle" and "Geir Hopland AS".

What I want is for the query not to return documents that do not contain exactly "Hopland AS" in any of their fields. Is this feasible? Is there any other information I can provide to help find a solution?

I'm using Elasticsearch 6.3.2. The PHP is running on Apache2 with PHP 7.2.

You may rewrite the query as ("Hopland" AND "AS"). ES will then return only the records that contain both of the words "Hopland" and "AS".
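One way to express this with the PHP client is a multi_match query whose operator is set to "and". This is only a sketch: the helper name buildAllWordsQuery and the shortened field list are illustrative and not part of the original script. With the default multi_match type, every word must appear in at least one of the listed fields, but nothing requires the words to be adjacent.

```php
<?php
// Hypothetical helper (not from the original script): builds a search body
// in which every word of $q is required, using multi_match with operator "and".
function buildAllWordsQuery(string $q, array $fields): array
{
    return [
        'query' => [
            'multi_match' => [
                'query'    => $q,
                'fields'   => $fields,
                'operator' => 'and', // all words must match, in any order
            ],
        ],
    ];
}

// Example field list; substitute the fields from the question as needed.
$body = buildAllWordsQuery('Hopland AS', ['Selskap', 'Navn Aksjoner']);

// Passed to the client the same way as in the question:
// $es->search(['index' => 'aksjeregisteret2017', 'size' => 20, 'body' => $body]);
```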

By doing this, would it also return results that contain fields with a value of "Hopland", "AS", and more? So would it return "Hopland Kraft AS"? Because we only want "Hopland AS" and not "Hopland Kraft AS".

You need a match_phrase query with slop 0. The phrase query ensures that "AS" comes directly after "Hopland" in token position, and slop 0 ensures that there are no tokens (words) between them.
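A minimal sketch of that suggestion, keeping the multi-field search from the question: multi_match with type "phrase" runs a match_phrase query against each field and accepts a slop parameter. The helper name buildExactPhraseQuery and the shortened field list are assumptions made for illustration.

```php
<?php
// Hypothetical helper (not from the original script): builds a search body
// that requires the words of $phrase to appear in order and next to each other.
function buildExactPhraseQuery(string $phrase, array $fields): array
{
    return [
        'query' => [
            'multi_match' => [
                'query'  => $phrase,
                'type'   => 'phrase', // match_phrase on each listed field
                'slop'   => 0,        // no other tokens allowed between the words
                'fields' => $fields,
            ],
        ],
    ];
}

// Example field list; substitute the fields from the question as needed.
$phraseBody = buildExactPhraseQuery('Hopland AS', ['Selskap', 'Navn Aksjoner']);

// Passed to the client the same way as in the question:
// $es->search(['index' => 'aksjeregisteret2017', 'size' => 20, 'body' => $phraseBody]);
```

This rules out "Hopland Kraft AS" and "Hopland Geir Atle". Note that a phrase query still matches a longer value that contains the phrase, such as "Geir Hopland AS"; matching the entire field value would probably need a non-analyzed (keyword) field, which depends on the index mapping.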
