Exact match on fields of type "text" (beginning and end "anchored")

ypid-geberit · June 24, 2020, 10:30am

Hi

Is there a way to do an exact match on fields of type "text"? By that I mean ensuring that all tokens are matched in the order specified and that matching documents contain no additional tokens (no partial match). I know the "keyword" type could help here, but it is not wanted/available in this case. I already searched a bit and found things like https://stackoverflow.com/questions/30517904/elasticsearch-exact-matches-on-analyzed-fields and the topic "Exact match search on text field". There must be some trick to do this. Am I missing something?

Consider this example. Only two results should be returned, but the query below returns all three documents. A solution with Query DSL is also fine.

PUT ypid-exact-match-of-text-test
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      }
    }
  }
}
POST ypid-exact-match-of-text-test/_doc
{
  "text": "You Know, for Search"
}
POST ypid-exact-match-of-text-test/_doc
{
  "text": "Elastic Stack"
}
POST ypid-exact-match-of-text-test/_doc
{
  "text": "You Know, for Search (Elastic Stack)"
}
GET ypid-exact-match-of-text-test/_search?filter_path=hits.hits._source.text
{
  "query": {
    "query_string": {
      "query": """text:("You Know, for Search" OR "Elastic Stack")"""
    }
  }
}

I also tried min_score as a workaround, but the default scoring is not suitable for that. Even with Painless this does not seem trivial, since a Painless script would also benefit from the keyword type.

jessepeixoto · June 28, 2020, 9:03pm

Hi @ypid-geberit,

I think a solution for your use case would be a normalizer, which is a kind of analyzer for keyword fields that produces a single token. https://www.elastic.co/guide/en/elasticsearch/reference/current/normalizer.html
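A minimal sketch of what the normalizer approach could look like, assuming a keyword sub-field is acceptable after all. The index name, the sub-field name `exact` and the normalizer name `lowercase_normalizer` are made up for illustration. Only the `lowercase` filter is used, so punctuation and whitespace still have to match exactly.

```console
# Hypothetical index: keep "text" analyzed and add a normalized keyword sub-field
PUT ypid-exact-match-normalizer-test
{
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "text",
        "fields": {
          "exact": {
            "type": "keyword",
            "normalizer": "lowercase_normalizer"
          }
        }
      }
    }
  }
}
# After indexing the same three documents, query the sub-field with term queries
GET ypid-exact-match-normalizer-test/_search?filter_path=hits.hits._source.text
{
  "query": {
    "bool": {
      "should": [
        { "term": { "text.exact": "You Know, for Search" } },
        { "term": { "text.exact": "Elastic Stack" } }
      ]
    }
  }
}
```

The normalizer is applied both at index time and to the term query input, so the casing of the search values should not matter here.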

If you really don't want to use keyword for your use case, the book Relevant Search by Doug Turnbull and John Berryman proposes an interesting solution to this problem: sentinel tokens. You add tokens at the boundaries of the text, both at index time and at search time.

PUT ypid-exact-match-of-text-test
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      }
    }
  }
}

POST ypid-exact-match-of-text-test/_doc
{
  "text": "SENTINEL_BEGIN You Know, for Search SENTINEL_END"
}

POST ypid-exact-match-of-text-test/_doc
{
  "text": "SENTINEL_BEGIN Elastic Stack SENTINEL_END"
}

POST ypid-exact-match-of-text-test/_doc
{
  "text": "SENTINEL_BEGIN You Know, for Search (Elastic Stack) SENTINEL_END"
}

The search would also include the sentinel tokens. To match the phrase exactly, you would use match_phrase queries combined in a bool query, as in the example below.

GET ypid-exact-match-of-text-test/_search?filter_path=hits.hits._source.text
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "text": "SENTINEL_BEGIN You Know, for Search SENTINEL_END"
          }
        },
        {
          "match_phrase": {
            "text": "SENTINEL_BEGIN Elastic Stack SENTINEL_END"
          }
        }
      ]
    }
  }
}
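To see why the longer document no longer matches, it can help to look at the tokens the field's analyzer produces. This is a small sketch using the _analyze API against the index above. With the default standard analyzer I would expect each sentinel to come out as a single lowercased token, but that is worth checking if a custom analyzer is in play.

```console
# Inspect the tokens and positions produced for the "text" field
GET ypid-exact-match-of-text-test/_analyze
{
  "field": "text",
  "text": "SENTINEL_BEGIN You Know, for Search (Elastic Stack) SENTINEL_END"
}
```

Because match_phrase requires the query tokens at consecutive positions, the tokens for "(Elastic Stack)" sit between "Search" and the end sentinel, so neither of the two phrases in the query matches this document.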

ypid-geberit · June 29, 2020, 9:13am

Thanks for the clarification. So I am not missing anything. Some form of preparation at index time is needed so that this query can be answered. The trick with sentinel tokens is interesting, but it is probably not a good idea to apply by default for the typical logging use case (where Kibana is used).
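For reference, one way the index-time preparation could be done without touching the client is an ingest pipeline. This is only a sketch under assumptions: the pipeline name `add-sentinel-tokens` and the target field `text_anchored` are made up, and the wrapped value is written to a separate field so that the original `text` stays as it is.

```console
# Hypothetical pipeline: copy "text" into "text_anchored" wrapped in sentinel tokens
PUT _ingest/pipeline/add-sentinel-tokens
{
  "description": "Wrap the text field in sentinel tokens for anchored phrase matching",
  "processors": [
    {
      "set": {
        "if": "ctx.text != null",
        "field": "text_anchored",
        "value": "SENTINEL_BEGIN {{{text}}} SENTINEL_END"
      }
    }
  ]
}
# Index through the pipeline; the document itself needs no sentinels
POST ypid-exact-match-of-text-test/_doc?pipeline=add-sentinel-tokens
{
  "text": "Elastic Stack"
}
```

The match_phrase queries would then target `text_anchored` instead of `text`, still with the sentinels included in the query string.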

system · July 27, 2020, 9:14am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
