Thanks for providing some more details. I'm afraid I can't think of any better solution off the top of my head than reevaluating the requirements and fine-tuning your analyzer and index mapping accordingly.
For example, could you loosen the requirement of matching 1-2 characters, or of matching more than 10? If the field is full text (i.e. words separated by spaces and punctuation), it seems unlikely that someone would type 10+ characters and still keep typing to narrow down the results.
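To get a feel for the cost of very short grams, you can run the _analyze API with an inline ngram tokenizer (no index needed; the sample text is arbitrary). Even at sizes 1-2, a 9-character input produces 17 grams:
POST _analyze
{
  "tokenizer": {
    "type": "ngram",
    "min_gram": 1,
    "max_gram": 2
  },
  "text": "quick fox"
}
This returns 9 single-character grams plus 8 pairs (spaces included), and the token count grows with every additional gram size you allow.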
Alternatively, you can define a mapping where description is analyzed into grams of 3-10 characters and the other fields into grams of 1-10 characters, then combine the search with a bool query, something like below. This supports hybrid matching while limiting memory usage. Note that the "token_chars" setting makes the tokenizers strip spaces and punctuation, so grams are generated one word at a time.
PUT ngram_test
{
  "settings": {
    "index.max_ngram_diff": 10, # Allow a large difference between min_gram and max_gram (the default is 1)
    "analysis": {
      "tokenizer": {
        "title_ngram_tokenizer": {
          "type": "ngram",
          "min_gram": 1,
          "max_gram": 10,
          "token_chars": ["letter", "digit"] # Drop spaces/punctuation so grams never span words
        },
        "description_ngram_tokenizer": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 10,
          "token_chars": ["letter", "digit"]
        }
      },
      "analyzer": {
        "title_analyzer": {
          "filter": ["lowercase", "asciifolding"],
          "tokenizer": "title_ngram_tokenizer"
        },
        "description_analyzer": {
          "filter": ["lowercase", "asciifolding"],
          "tokenizer": "description_ngram_tokenizer"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "ngram": {
            "type": "text",
            "analyzer": "title_analyzer"
          }
        }
      },
      "description": {
        "type": "text",
        "fields": {
          "ngram": {
            "type": "text",
            "analyzer": "description_analyzer"
          }
        }
      },
      "status": {
        "type": "text",
        "fields": {
          "ngram": {
            "type": "text",
            "analyzer": "title_analyzer"
          }
        }
      }
    }
  }
}
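If you go this route, it's worth sanity-checking the analyzers with the _analyze API before indexing. For example, against the index above, the description analyzer should emit no grams shorter than 3 characters and none spanning the space:
POST ngram_test/_analyze
{
  "analyzer": "description_analyzer",
  "text": "Abc def"
}
# Expected tokens: "abc" and "def" only (lowercased, 3+ chars, one word at a time)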
Query:
POST ngram_test/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "title.ngram": {
              "query": "abc",
              "operator": "AND" # Every ngram of the query must match an ngram of the field
            }
          }
        },
        {
          "match": {
            "description.ngram": {
              "query": "abc",
              "operator": "AND"
            }
          }
        },
        {
          "match": {
            "status.ngram": {
              "query": "abc",
              "operator": "AND"
            }
          }
        }
      ], # Finds 3-10 char fragments in all fields, and 1-2 char fragments in fields other than description
      "minimum_should_match": 1
    }
  }
}
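To see the hybrid behaviour end to end, you could index a document with made-up values and search for a 2-character input:
PUT ngram_test/_doc/1
{
  "title": "Release notes",
  "description": "Fixed the pagination bug",
  "status": "open"
}

POST ngram_test/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "title.ngram": { "query": "op", "operator": "AND" } } },
        { "match": { "description.ngram": { "query": "op", "operator": "AND" } } },
        { "match": { "status.ngram": { "query": "op", "operator": "AND" } } }
      ],
      "minimum_should_match": 1
    }
  }
}
Here "op" matches via status.ngram (which indexes 1-2 char grams) but not via description.ngram, whose analyzer produces no tokens for inputs shorter than 3 characters.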
In any case, I think the key to improving performance is keeping either the number of ngrams generated per field or the length of the input fed to the ngram tokenizers within bounds.
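As a sketch of the first option (the index name and the cap are made up; tune max_token_count to your data), you could append the built-in limit token filter so each field contributes at most a fixed number of grams per document:
PUT ngram_test_capped
{
  "settings": {
    "index.max_ngram_diff": 10,
    "analysis": {
      "filter": {
        "gram_cap": {
          "type": "limit", # built-in filter that stops emitting tokens after the cap
          "max_token_count": 5000 # hypothetical cap
        }
      },
      "tokenizer": {
        "title_ngram_tokenizer": {
          "type": "ngram",
          "min_gram": 1,
          "max_gram": 10,
          "token_chars": ["letter", "digit"]
        }
      },
      "analyzer": {
        "title_analyzer": {
          "filter": ["lowercase", "asciifolding", "gram_cap"],
          "tokenizer": "title_ngram_tokenizer"
        }
      }
    }
  }
}
The limit filter simply truncates the token stream, so very long values stop inflating the index rather than failing to index.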
I hope this helps!