How to know which documents in the search results include all of the tokens in the search query (for one particular field)

swafa · June 18, 2020, 1:06pm

We are trying to find out which documents in our search results include all the tokens in our search query. For example, when running the following query:

    GET /test/_search
    {
        "query" : {
          "match" : { "message" : { "query": "the brown fox" } }
        }
    }

We would need a response like this (only the hits array is included with minimal fields for brevity):

    "hits" : [
      {
        "all_tokens_match": true,
        "_source" : {
          "message" : "The Quick Brown Fox"
        }
      },
      {
        "all_tokens_match": false,
        "_source" : {
          "message" : "The Brown Bear"
        }
      }
    ]

How would you recommend we approach this kind of problem in Elasticsearch? We were looking into scripting, but our field is usually indexed as a "text" field and not as a "keyword" field, and there seem to be some limitations with scripting in that case. If you recommend scripting, we would be grateful for a snippet to guide us in the right direction. Thank you in advance for your support.

mayya · June 18, 2020, 9:07pm

Check whether named queries would help with your problem.
With named queries, you need to break your tokens into distinct queries, something like this:

GET /_search
{
    "query": {
        "bool" : {
            "should" : [
                {"match" : { "message" : {"query" : "the", "_name" : "my_query1"} }},
                {"match" : { "message" : {"query" : "brown", "_name" : "my_query2"} }},
                {"match" : { "message" : {"query" : "fox", "_name" : "my_query3"} }}
            ]
        }
    }
}

As a result, you will get the following:

"hits" : [
      {
        "_index" : "my_index",
       ....
        "_source" : {
          "message" : "The Quick Brown Fox"
        },
        "matched_queries" : [
          "my_query1",
          "my_query2",
          "my_query3"
        ]
      },
      {
        "_index" : "my_index",
          ...
        "_source" : {
          "message" : "The Brown Bear"
        },
        "matched_queries" : [
          "my_query1",
          "my_query2"
        ]
      },
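
The per-token approach above could be generated and checked on the client side. The following is only a minimal sketch in Python: the helper names (`build_named_token_queries`, `all_tokens_matched`) and the `token_N` naming scheme are hypothetical, and the token list is assumed to be supplied by the caller. In practice it should reflect what the field's analyzer produces (for example via the `_analyze` API) rather than a plain whitespace split.

```python
def build_named_token_queries(field, tokens):
    """Build a bool/should query with one named match clause per token."""
    names = [f"token_{i}" for i in range(len(tokens))]
    should = [
        {"match": {field: {"query": token, "_name": name}}}
        for token, name in zip(tokens, names)
    ]
    return {"query": {"bool": {"should": should}}}, names


def all_tokens_matched(hit, names):
    """True if every named token clause appears in the hit's matched_queries."""
    return set(names) <= set(hit.get("matched_queries", []))


body, names = build_named_token_queries("message", ["the", "brown", "fox"])

# Hits shaped like the response above (only the relevant key is kept).
fox_hit = {"matched_queries": ["token_0", "token_1", "token_2"]}
bear_hit = {"matched_queries": ["token_0", "token_1"]}

print(all_tokens_matched(fox_hit, names))
print(all_tokens_matched(bear_hit, names))
```

A document counts as matching all tokens only when every generated name shows up in its `matched_queries` list, so the check is a simple subset test.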

swafa · June 23, 2020, 12:37pm

Thank you @mayya for guiding us in the right direction of named queries. It helped us tackle our problem after we combined named queries with the "minimum_should_match" parameter and set "boost": 0 on the extra clause so that it does not affect the score of our main query.

We did it like this:

{
    "query": {
        "bool": {
            "should": [
                {
                    "match": {
                        "message": {
                            "query": "the brown fox",
                            "_name": "main_query"
                        }
                    }
                },
                {
                    "match": {
                        "message": {
                            "query": "the brown fox",
                            "_name": "all_tokens_match",
                            "minimum_should_match": "100%",
                            "boost": 0
                        }
                    }
                }
            ]
        }
    }
}
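
For anyone building this request programmatically, here is a rough sketch in Python of how the same body could be assembled for an arbitrary field and query text. The function name `build_all_tokens_query` is hypothetical; the clause names, `minimum_should_match` and `boost` values are the ones from the query above.

```python
import json


def build_all_tokens_query(field, text):
    """Main match query plus a zero-boost clause that requires every token."""
    main = {"match": {field: {"query": text, "_name": "main_query"}}}
    strict = {
        "match": {
            field: {
                "query": text,
                "_name": "all_tokens_match",
                # Every analyzed token of the query must be present.
                "minimum_should_match": "100%",
                # Zero boost so this clause adds nothing to the score.
                "boost": 0,
            }
        }
    }
    return {"query": {"bool": {"should": [main, strict]}}}


body = build_all_tokens_query("message", "the brown fox")
print(json.dumps(body, indent=2))
```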

So we receive a response like this:

{
    "hits": [
        {
            "_score": 0.99938476,
            "_source": {
                "message": "The Quick Brown Fox"
            },
            "matched_queries": [
                "main_query",
                "all_tokens_match"
            ]
        },
        {
            "_score": 0.38727614,
            "_source": {
                "message": "The Brown Bear"
            },
            "matched_queries": [
                "main_query"
            ]
        }
    ]
}

Documents that match all the tokens in our query now have all_tokens_match included in the matched_queries part of the response.
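
To get back to the response shape from the original question, the flag can be derived on the client side. This is a small sketch in Python, assuming the list of hits has already been extracted from the search response; the function name `annotate_all_tokens_match` is hypothetical.

```python
def annotate_all_tokens_match(hits, marker="all_tokens_match"):
    """Return copies of the hits with a boolean flag derived from matched_queries."""
    return [
        {**hit, "all_tokens_match": marker in hit.get("matched_queries", [])}
        for hit in hits
    ]


# Hits trimmed to the fields shown in the response above.
hits = [
    {
        "_source": {"message": "The Quick Brown Fox"},
        "matched_queries": ["main_query", "all_tokens_match"],
    },
    {
        "_source": {"message": "The Brown Bear"},
        "matched_queries": ["main_query"],
    },
]

annotated = annotate_all_tokens_match(hits)
for hit in annotated:
    print(hit["_source"]["message"], hit["all_tokens_match"])
```

Hits without a `matched_queries` key are treated as not matching all tokens.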

