Multi match query with non fully analyzed fields

ugolas · August 4, 2016, 7:11am

Hi,

We need to implement a free-text search: the user enters a string, and we return the docs that contain it in any one of several fields.

So I've written a multi match, cross fields query on the required fields

Let's say the search is for "Alex"

and the query is:

        "multi_match": {
          "query": "Alex",
          "type": "cross_fields",
          "fields": [
            "name",
            "company",
            "email"
          ]
        }

Now, the name field is fully analyzed with the standard analyzer, company is not_analyzed (only exact matches should be returned), and email is analyzed with the keyword tokenizer and a lowercase filter (I need exact, case-insensitive matching of emails).
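For reference, a mapping along the following lines would produce the three analysis chains described above. It is only a sketch in Elasticsearch 2.x syntax, written as a Python dict: the type name `contact` and the analyzer name `lowercase_keyword` are made up for illustration and are not taken from the real index.

```python
import json

# Hypothetical index definition (Elasticsearch 2.x syntax).
# "contact" and "lowercase_keyword" are illustrative names only.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "lowercase_keyword": {
                    "type": "custom",
                    "tokenizer": "keyword",   # whole value becomes one token
                    "filter": ["lowercase"],  # exact but case-insensitive
                }
            }
        }
    },
    "mappings": {
        "contact": {
            "properties": {
                # No analyzer given, so the default standard analyzer applies.
                "name": {"type": "string"},
                # Indexed verbatim as a single term.
                "company": {"type": "string", "index": "not_analyzed"},
                "email": {"type": "string", "analyzer": "lowercase_keyword"},
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```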

The weird behaviour shows up when I search for multiple terms and one of them does not exist:

If I search for "Alex Facebook", I expect to get all docs where any of the 3 fields above contains either "Alex" or "Facebook", and if there are no matches for "Facebook", I still expect to get the docs that match "Alex".

But if I search for a value that matches one of the not fully analyzed fields together with another value that does not exist, I get no results.

Example:

Query: "Amazon James"
There is a doc with company = "Amazon", but there is no match for "James", and no results are returned.

Can someone explain this behavior and how I can overcome it?

davidbkemp · August 8, 2016, 9:39am

cross_fields doesn't work all that well when the fields have different analysers.

"If you include fields with a different analysis chain, they will be added to the query in the same way as for best_fields"
https://www.elastic.co/guide/en/elasticsearch/guide/2.x/_cross_fields_queries.html

Given you have specified "not_analyzed" for "company", it will use the full untokenized query for that field and only match companies actually named "Amazon James".

ugolas · August 8, 2016, 10:25am

Hi,

Thanks for the reply. Is there a nice way to achieve what I'm aiming for in one query?

I've thought about splitting the search into separate 'multi_match' queries, one per group of fields that share an analyzer, and combining them under 'should'.

davidbkemp · August 8, 2016, 10:54am

Yes, in a similar situation, I've seen a bool/should query work OK. But a problem you need to address first is that you will have trouble getting a string like "Amazon James" to match "Amazon" on a "not_analyzed" field. You might get away with specifying a query-time analyser consisting of a standard tokenizer and no token filters, but then no queries would match multi-word company names like "Mercedes Benz". There might be some clever things you could do using shingle token filters in the query analyser to work around this, but it depends on what your requirements really are.
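A rough sketch of what such a bool/should query could look like, built as a Python dict. It assumes the built-in `whitespace` analyzer is an acceptable stand-in for "standard tokenizer and no token filters" on `company`, and that a custom analyzer, here given the made-up name `whitespace_lowercase`, has been defined in the index settings for `email`. The multi-word company name limitation described above still applies.

```python
import json

def build_search(text):
    """Build a bool/should query with one clause per analysis chain."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Fully analyzed field: the standard analyzer splits the text.
                    {"match": {"name": {"query": text}}},
                    # not_analyzed field: split the query text on whitespace only,
                    # so "Amazon James" is looked up as "Amazon" OR "James".
                    {"match": {"company": {"query": text,
                                           "analyzer": "whitespace"}}},
                    # keyword + lowercase field: "whitespace_lowercase" is a
                    # hypothetical custom analyzer assumed to exist in the index.
                    {"match": {"email": {"query": text,
                                         "analyzer": "whitespace_lowercase"}}},
                ],
                "minimum_should_match": 1,
            }
        }
    }

print(json.dumps(build_search("Amazon James"), indent=2))
```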

ugolas · August 9, 2016, 10:33am

One last thing: I did some testing, and it seems that using query_string instead of multi_match works for the search "Amazon James" on a not_analyzed field where only Amazon matches.

Why is there a difference in behavior between query_string and multi_match?

davidbkemp · August 9, 2016, 12:11pm

I haven't used query_string much, but I'd be interested in seeing some examples that you have managed to get working.

ugolas · August 9, 2016, 3:52pm

From what I understood, query_string by default splits the whole query into terms on whitespace and applies the OR operator between them. So "Amazon James" is analyzed as two separate terms, "Amazon" and "James". For a company such as Mercedes Benz, the name should be passed to the query in quotes, as "Mercedes Benz", and it will then be treated as a single term.
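As a sketch of that idea, the snippet below builds a query_string body in Python. `to_query_string` is a hypothetical helper, not part of any client library: it wraps values containing whitespace in double quotes so they stay together as one phrase, and leaves single words alone. It assumes the caller already knows which values belong together, and it does not escape the other reserved characters of the query_string syntax, so treat it as an illustration only.

```python
import json

def to_query_string(values):
    """Join search values, quoting any that contain whitespace."""
    parts = []
    for value in values:
        if any(ch.isspace() for ch in value):
            # Escape embedded quotes, then wrap so the value stays one phrase.
            parts.append('"' + value.replace('"', '\\"') + '"')
        else:
            parts.append(value)
    return " ".join(parts)

def build_query_string(values):
    return {
        "query": {
            "query_string": {
                "query": to_query_string(values),
                "fields": ["name", "company", "email"],
                "default_operator": "OR",  # OR is already the default
            }
        }
    }

print(to_query_string(["Amazon", "James"]))         # Amazon James
print(to_query_string(["Mercedes Benz", "James"]))  # "Mercedes Benz" James
print(json.dumps(build_query_string(["Mercedes Benz", "James"]), indent=2))
```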

davidbkemp · August 9, 2016, 10:03pm

Are you specifying fields in the query, or are you leaving it as the default "_all" field? I ask because I suspect that everything in the "_all" field is analysed using the "standard" analyser, so "Benz Mercedes" would also match "Mercedes Benz".

ugolas · August 10, 2016, 5:27am

Specifying a closed list of fields. In this example it would be ["name", "company", "email"].
