Issues trying to search with ngram tokenizer

nealdan0 · April 12, 2021, 7:55pm

Hi, we recently upgraded Elasticsearch to 7.10. Since then I've had issues querying with an ngram tokenizer from the Java API. Specifically, searching for a text value only returns results when I use the phrase_prefix query type. For example, a query for 'abcd' returns results for 'abcdefg', as expected, but a search for 'bcdefg' returns no results.

I can work around it, reluctantly, with wildcard queries. However, I also need to search for non-alphanumeric characters such as -, %, and /, and I have not been able to make wildcard queries work with those characters.
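A note on the wildcard workaround, offered as a sketch rather than a diagnosis: in a wildcard query only `*` and `?` act as operators, with `\` as the escape character, so `-`, `%`, and `/` are matched literally against whatever terms the field actually indexed. A literal `*` or `?` in the search text does need escaping. The helper below is hypothetical (the class and method names are not from the original post) and only builds the pattern string; it does not call Elasticsearch.

```java
final class WildcardPatterns {

    // Escapes the wildcard operators (* and ?) and the escape character (\)
    // so they match literally, then wraps the text in * for a "contains" match.
    static String containsPattern(String userText) {
        StringBuilder sb = new StringBuilder("*");
        for (char c : userText.toCharArray()) {
            if (c == '*' || c == '?' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('*').toString();
    }

    public static void main(String[] args) {
        // Characters such as %, / and - pass through unchanged.
        System.out.println(containsPattern("50%/a-b"));
        // Literal * and ? are escaped with a backslash.
        System.out.println(containsPattern("a*b?"));
    }
}
```

The resulting string could then be passed as the pattern of a wildcard query. Whether it matches still depends on how the field was indexed, which is an assumption this sketch cannot check.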

Analyzer

{
  "max_ngram_diff": "7",
  "analysis": {
    "analyzer": {
      "rp_analyzer": {
        "type": "custom",
        "tokenizer": "rp_tokenizer",
        "filter": [
          "lowercase"
        ]
      }
    },
    "tokenizer": {
      "rp_tokenizer": {
        "type": "ngram",
        "min_gram": 3,
        "max_gram": 10,
        "token_chars": [
          "letter",
          "digit",
          "punctuation",
          "symbol",
          "custom"
        ],
        "custom_token_chars": ":+-_/%*?"
      }
    }
  }
}
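To make the settings above concrete, here is a minimal sketch of which terms an ngram tokenizer with min_gram 3 and max_gram 10, followed by a lowercase filter, would be expected to emit for a single value. It is a plain-Java approximation, not Elasticsearch code: the class and method names are hypothetical, and it ignores token_chars splitting (every character in the sample value is a letter). The gap between the two sizes, 10 - 3 = 7, is what max_ngram_diff has to allow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class NgramSketch {

    // Emits every substring whose length is between minGram and maxGram,
    // lowercased first, approximating an ngram tokenizer plus lowercase filter.
    static List<String> ngrams(String text, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        String lower = text.toLowerCase(Locale.ROOT);
        for (int start = 0; start < lower.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= lower.length(); len++) {
                grams.add(lower.substring(start, start + len));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        List<String> grams = ngrams("abcdefg", 3, 10);
        System.out.println(grams);
        // If the ngrams are indexed, "bcdefg" is one of the stored terms.
        System.out.println("contains bcdefg: " + grams.contains("bcdefg"));
    }
}
```

Under these assumptions 'bcdefg' is among the generated grams, so an infix search for it should be able to match once the analyzer is actually applied at index time.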

Query

MultiMatchQueryBuilder multiMatchQueryBuilder = new MultiMatchQueryBuilder(searchText.trim());
for (String fieldName : searchableFieldNames) {
    multiMatchQueryBuilder.field(fieldName);
}
multiMatchQueryBuilder.operator(Operator.OR);
multiMatchQueryBuilder.slop(5);
multiMatchQueryBuilder.type(MultiMatchQueryBuilder.Type.PHRASE_PREFIX);
return multiMatchQueryBuilder;

Document setup (also tried FieldType.Text)

@Document(indexName = "reviewable_product")
@Setting(settingPath = "/elasticsearch/settings/ReviewableProductAnalyzer.json")
public class ReviewableProduct {
    @Field(type = FieldType.Keyword, analyzer = "rp_analyzer", searchAnalyzer = "rp_analyzer")
    private String reviewableProductId;

    @Field(type = FieldType.Keyword, analyzer = "rp_analyzer", searchAnalyzer = "rp_analyzer", name = "productName")
    private String productName;
}

I appreciate any help. Thanks all.

nealdan0 · April 14, 2021, 12:00pm

Figured it out. I was not creating the index properly, so the ngrams were never built.
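For anyone hitting the same symptom, the following is a simplified model of why it looks the way it does. It is plain Java with hypothetical names, not Elasticsearch code, and it assumes that an index created without the custom analyzer stores each value as one whole term. In that case a prefix such as 'abcd' can still match, but 'bcdefg' cannot; once the ngrams are built, 'bcdefg' is itself an indexed term.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class IndexedTermsSketch {

    // Assumed behavior when the ngram analyzer was never applied:
    // the whole value is stored as a single term.
    static Set<String> wholeValueTerms(String value) {
        return Collections.singleton(value.toLowerCase(Locale.ROOT));
    }

    // Assumed behavior with the ngram analyzer applied: all substrings
    // of length minGram..maxGram are stored as terms.
    static Set<String> ngramTerms(String value, int minGram, int maxGram) {
        Set<String> terms = new HashSet<>();
        String lower = value.toLowerCase(Locale.ROOT);
        for (int start = 0; start < lower.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= lower.length(); len++) {
                terms.add(lower.substring(start, start + len));
            }
        }
        return terms;
    }

    // A prefix-style lookup: true if any stored term starts with the text.
    static boolean anyStartsWith(Set<String> terms, String text) {
        for (String term : terms) {
            if (term.startsWith(text)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> withoutNgrams = wholeValueTerms("abcdefg");
        Set<String> withNgrams = ngramTerms("abcdefg", 3, 10);
        System.out.println("no ngrams, prefix abcd:   " + anyStartsWith(withoutNgrams, "abcd"));
        System.out.println("no ngrams, prefix bcdefg: " + anyStartsWith(withoutNgrams, "bcdefg"));
        System.out.println("ngrams, term bcdefg:      " + withNgrams.contains("bcdefg"));
    }
}
```

A quick way to confirm which case applies is to inspect the index settings and mapping that were actually created and compare them with the analyzer JSON.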

system · May 12, 2021, 12:00pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
