Elasticsearch heap usage and GC count increase when searching with thousands of characters

Hi Everyone,
We are having an issue when a user searches with a query containing thousands of characters.

Steps to reproduce:

  1. Enter about 5K junk characters such as '***************' or '........................' and run multiple searches. After that, Elasticsearch starts failing and we can see a spike in heap usage and GC count.
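For reference, a minimal sketch of the kind of request involved is below; the index name my_index, the field name, and the use of a match query are assumptions for illustration, not our actual mapping or query:

# Reproduction sketch only: 'my_index' and the 'name' field are placeholders,
# and the match query is an assumption about what our application sends.
# Build a ~5K-character junk string and send it as the query text.
JUNK=$(printf '.%.0s' {1..5000})
curl -s -XGET 'http://localhost:9200/my_index/_search' -d "{
  \"query\": { \"match\": { \"name\": \"$JUNK\" } }
}"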

As a solution, we are going to limit the allowed query length, but before doing that we wanted to understand why this kind of search impacts Elasticsearch, what we can do to stop it, and what a reasonable character-length limit would be.
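One Elasticsearch-side option we are considering (just a sketch, not something we have tested) is a dedicated search-time analyzer that uses the built-in truncate and limit token filters to cap token length and token count. The index name, the filter/analyzer names, and the 64/50 limits below are placeholders we made up, not recommendations:

# Sketch only: 'test_limit_index', the filter/analyzer names, and the
# 64-character / 50-token limits are made-up placeholders.
curl -s -XPUT 'http://localhost:9200/test_limit_index' -d '{
  "settings": {
    "index.analysis.filter.trim_long_tokens.type": "truncate",
    "index.analysis.filter.trim_long_tokens.length": "64",
    "index.analysis.filter.cap_token_count.type": "limit",
    "index.analysis.filter.cap_token_count.max_token_count": "50",
    "index.analysis.analyzer.search_edge_analyzer.tokenizer": "standard",
    "index.analysis.analyzer.search_edge_analyzer.filter.0": "lowercase",
    "index.analysis.analyzer.search_edge_analyzer.filter.1": "asciifolding",
    "index.analysis.analyzer.search_edge_analyzer.filter.2": "trim_long_tokens",
    "index.analysis.analyzer.search_edge_analyzer.filter.3": "cap_token_count"
  }
}'

Such an analyzer would still have to be wired in as the search_analyzer on the relevant fields (or passed explicitly per query), and limiting the input length in the application may remain the simpler fix.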

Note: We are upgrading to ES 6.7 next month.

Elasticsearch version: 1.5

Plugins installed: [none]

JVM version: java version "1.8.0_71", Java(TM) SE Runtime Environment (build 1.8.0_71-b15), Java HotSpot(TM) 64-Bit Server VM (build 25.71-b15, mixed mode)

OS version: CentOS Linux 7 (Core)

Logs (if relevant): none

Template settings used:

"settings":{"index.analysis.analyzer.edge_analyzer.tokenizer":"edge_tokenizer","index.analysis.analyzer.folding_analyzer.tokenizer":"standard","index.analysis.tokenizer.edge_tokenizer.min_gram":"2","index.analysis.analyzer.folding_analyzer.filter.0":"lowercase","index.analysis.analyzer.folding_analyzer.filter.1":"asciifolding","index.number_of_replicas":"0","index.analysis.tokenizer.edge_tokenizer.max_gram":"32","index.analysis.analyzer.sort_analyzer.filter.0":"lowercase","index.analysis.tokenizer.edge_tokenizer.token_chars.1":"digit","index.analysis.analyzer.sort_analyzer.filter.1":"asciifolding","index.analysis.tokenizer.edge_tokenizer.token_chars.0":"letter","index.analysis.tokenizer.edge_tokenizer.type":"edgeNGram","index.analysis.analyzer.sort_analyzer.tokenizer":"keyword","index.analysis.analyzer.edge_analyzer.filter.0":"lowercase","index.number_of_shards":"1","index.analysis.analyzer.edge_analyzer.filter.1":"asciifolding"}

Thanks!!
