ElasticSearch - Searching partial text in String

Sachin_Sharma · October 14, 2022, 2:42pm

What is the best way to use Elasticsearch to search for an exact partial match within a string?

In SQL, this would be a LIKE pattern such as %PARTIAL TEXT% or %ARTIAL TEX%.

In Elasticsearch, the method I am currently using is:

{
    "query": {
        "match_phrase_prefix": {
             "name": "PARTIAL TEXT"
        }
    }
}

However, it breaks whenever you remove the first and last characters of the string, as shown below (no results found):

{
    "query": {
        "match_phrase_prefix": {
             "name": "ARTIAL TEX"
        }
    }
}

allan-silva · October 16, 2022, 3:33am

You are probably looking for a wildcard query.

Suppose an index like this, with one document:

PUT netflix_movie_title
{
    "mappings": {
        "properties": {
            "title": {
                "type": "keyword"
            }
        }
    }
}

POST netflix_movie_title/_doc
{
  "title": "The Hitchhiker's Guide to the Galaxy"
}

A wildcard query lets you run queries similar to SQL LIKE.

Match start:

POST netflix_movie_title/_search
{
  "query": {
    "wildcard": {
      "title": {
        "value": "The Hitchhiker's Guide to th*"
      }
    }
  }
}

Match end:

POST netflix_movie_title/_search
{
  "query": {
    "wildcard": {
      "title": {
        "value": "*er's Guide to the Galaxy"
      }
    }
  }
}

Match middle:

POST netflix_movie_title/_search
{
  "query": {
    "wildcard": {
      "title": {
        "value": "*er's Guide to the Gal*"
      }
    }
  }
}

No match:

POST netflix_movie_title/_search
{
  "query": {
    "wildcard": {
      "title": {
        "value": "*er's Gui o the Gal*"
      }
    }
  }
}

However, the keyword field is designed for exact-match queries. If you also need full-text search capabilities, consider using a multi-field mapping (Field data types | Elasticsearch Guide [8.4] | Elastic).
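
As a minimal sketch of such a multi-field mapping: the index name netflix_movie_title_multi and the sub-field name keyword are illustrative assumptions, not part of the original setup. The title field is analyzed as text for full-text search, while title.keyword keeps the whole value as a single term for wildcard or exact matching.

```console
# Hypothetical index name; "keyword" is just a conventional sub-field name
PUT netflix_movie_title_multi
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

# Full-text search runs against the analyzed text field
POST netflix_movie_title_multi/_search
{
  "query": {
    "match": {
      "title": "galaxy guide"
    }
  }
}

# Wildcard search runs against the keyword sub-field
POST netflix_movie_title_multi/_search
{
  "query": {
    "wildcard": {
      "title.keyword": {
        "value": "*er's Guide to the Gal*"
      }
    }
  }
}
```

With this layout the same document can be found both by analyzed words and by a substring of the original value, at the cost of indexing the title twice.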

Mark_Harwood1 · October 16, 2022, 8:54am

Running wildcard queries on keyword fields has two problems:

It won't work on large values
The search cost is linear in the number of unique values

That’s why the wildcard field was created, and this blog gives the background. It has shortcomings too, because the search cost is linear in the number of docs that hold a value roughly matching the search.
There’s always some kind of performance trade-off.
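
For illustration, a mapping that uses the wildcard field type could look roughly like the sketch below. The index name netflix_movie_title_wildcard is a hypothetical choice, and whether this type performs better than keyword depends on your data, so treat it as a starting point for experiments rather than a recommendation.

```console
# Hypothetical index name; the field uses the dedicated wildcard type
PUT netflix_movie_title_wildcard
{
  "mappings": {
    "properties": {
      "title": {
        "type": "wildcard"
      }
    }
  }
}

# The wildcard query itself has the same shape as on a keyword field
POST netflix_movie_title_wildcard/_search
{
  "query": {
    "wildcard": {
      "title": {
        "value": "*er's Guide to the Gal*"
      }
    }
  }
}
```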

allan-silva · October 16, 2022, 11:59am

Totally agree, performance should be considered for this use case. The blog post above includes a guide to choosing a data type. If you are not sure which data type to use, a possible approach is to create a new sample index to explore your data with the desired type, and use the reindex API to copy part or all of the production index into it.
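
A rough sketch of that approach, assuming the production index is netflix_movie_title and the sample index (hypothetical name netflix_movie_title_wildcard) has already been created with the mapping you want to try. The max_docs value of 1000 is an arbitrary sample size for illustration.

```console
# Copy only a sample of documents into the experimental index.
# Index names and the sample size are assumptions for this example.
POST _reindex
{
  "max_docs": 1000,
  "source": {
    "index": "netflix_movie_title"
  },
  "dest": {
    "index": "netflix_movie_title_wildcard"
  }
}
```

Dropping max_docs reindexes the whole source index. A query can also be added under source to select a specific subset of documents.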

system · November 13, 2022, 12:00pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
