Elasticsearch keyword usage with a prefix search

I have a requirement to be able to search a sentence as complete or with prefix. The UI library I am using (reactive search) generates the query like this:

"simple_query_string": {
  "query": "\"Louis George Maurice Adolphe\"",
  "fields": [
    "field1",
    "field2",    
    "field3"
  ],
  "default_operator": "or"
}

I expect it to return results such as
Louis George Maurice Adolphe (Roche)
but NOT records that merely contain individual terms like Louis or George.

Currently I have the index settings and mapping below, but they only return the record if I search for the complete value Louis George Maurice Adolphe (Roche), not for the prefix Louis George Maurice Adolphe.

{
  "settings": {
    "analysis": {
      "char_filter": {
        "space_remover": {
          "type": "mapping",
          "mappings": [
            "\\u0020=>"
          ]
        }
      },
      "normalizer": {
        "lower_case_normalizer": {
          "type": "custom",
          "char_filter": [
            "space_remover"
          ],
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "field3": {
          "type": "keyword",
          "normalizer": "lower_case_normalizer"
        }
      }
    }
  }
}

Any guidance on the above is appreciated. Thanks.

Note: The constraint here is that the user queries through a search box and the library generates the simple_query_string. The user's search term can match any of the fields field1, field2, field3 or field4. How can I change the query while still maintaining this functionality? Also, each of the fields may be linked to a different analyzer.

I'd have said use a Prefix query, but it sounds like you can't change the query type the UI library is producing. Maybe it has an option to issue a different query?
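For reference, here is a rough Python sketch of what a prefix query body could look like if the library did let you swap the query type. The helper names are made up for illustration. It normalizes the input client-side (remove spaces, lowercase) to mirror lower_case_normalizer, on the assumption that the prefix may not be normalized for you on every version; doing it twice is harmless since the transformation is idempotent. Note it targets a single field, so it does not cover the multi-field case on its own.

```python
import json


def normalize(value: str) -> str:
    """Mirror lower_case_normalizer client-side: drop U+0020 spaces, then lowercase.

    Assumption: plain str.lower() is close enough to the lowercase filter
    for the data in question (it is for ASCII text).
    """
    return value.replace(" ", "").lower()


def build_prefix_query(field: str, user_input: str) -> dict:
    """Build a search body that uses the prefix query on one keyword field."""
    return {"query": {"prefix": {field: {"value": normalize(user_input)}}}}


if __name__ == "__main__":
    body = build_prefix_query("field3", "Louis George Maurice Adolphe")
    # Print the JSON you would send to POST <index>/_search
    print(json.dumps(body, indent=2))
```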

Otherwise you might want to look at the wildcard field type. Its full-length matches (keyword-style) are a bit slower than if you used a keyword field, but they work - plus it enables the efficient use of wildcard queries, including wildcards in the simple query string query. Only thing is, it just got introduced in 7.9 so you might need to upgrade.

This is what I ran:

PUT index1
{
  "settings": {
    "analysis": {
      "char_filter": {
        "space_remover": {
          "type": "mapping",
          "mappings": [
            "\\u0020=>"
          ]
        }
      },
      "normalizer": {
        "lower_case_normalizer": {
          "type": "custom",
          "char_filter": [
            "space_remover"
          ],
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
      "properties": {
        "field3": {
          "type": "keyword",
          "normalizer": "lower_case_normalizer"
        },
        "field3_wildcard": {
          "type": "wildcard"
        }
      }
  }
}

So far, so good. I've just added a wildcard field alongside your keyword field so we can experiment more easily.

POST index1/_doc
{
  "field3": "Louis George Maurice Adolphe (Roche)",
  "field3_wildcard": "Louis George Maurice Adolphe (Roche)"
}

GET index1/_search if you want to check it's been indexed.

POST index1/_search
{
  "query": {
    "simple_query_string": {
      "query": "\"Louis George Maurice Adolphe*\"",
      "fields": [  
        "field3_wildcard"
      ],
      "default_operator": "or"
    }
  }
}

That finds the document. The only thing I've changed is adding the * wildcard at the end, to indicate that Louis George Maurice Adolphe is a prefix. This doesn't work with a keyword field because that needs an exact full match. The exact-match form, "query": "\"Louis George Maurice Adolphe (Roche)\"", also works with the wildcard field.

I think this fulfils

I have a requirement to be able to search a sentence as complete or with prefix

Though it requires *s added to your queries.
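If you have a hook where the search text can be rewritten before it is sent (an assumption, I don't know whether your UI library offers one), adding the * could look roughly like this Python sketch. The function names are hypothetical, and it does not try to escape quotes or other operators inside the user's text.

```python
import json


def as_prefix_phrase(user_input: str) -> str:
    """Wrap the text in double quotes with a trailing * inside the quotes.

    Surrounding quotes and an existing trailing * are stripped first so the
    function can be applied to already-quoted input without doubling them.
    """
    text = user_input.strip().strip('"').rstrip("*").strip()
    return f'"{text}*"'


def build_search_body(user_input: str, fields: list) -> dict:
    """Build the same simple_query_string body as above, with the * added."""
    return {
        "query": {
            "simple_query_string": {
                "query": as_prefix_phrase(user_input),
                "fields": list(fields),
                "default_operator": "or",
            }
        }
    }


if __name__ == "__main__":
    body = build_search_body('"Louis George Maurice Adolphe"', ["field3_wildcard"])
    print(json.dumps(body, indent=2))
```

The resulting body is the same shape as the request I ran against field3_wildcard, so the rest of the query stays as the library generates it.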

Hi Emanuil,

This sounds like a reasonable solution. I will try it with my data.

Thank you very much
