Limit on the number of fields in a grok pattern in logstash.conf

Is there a limit on the number of fields I can capture in a single grok expression in logstash.conf? My messages are around 438 characters long and contain 41 fields. When I add another field to the grok expression, Logstash seems unable to send the data to Elasticsearch. When I kill Logstash, the console shows the following. Any help is welcome.

The shutdown process appears to be stalled due to busy or blocked plugins. Check the logs for more information. {:level=>:error}
 {:level=>:warn, "INFLIGHT_EVENT_COUNT"=>{"input_to_filter"=>1, "total"=>1}, "STALLING_THREADS"=>{["LogStash::Filters::Grok", {"match"=>{"message"=>"%{NUMBER:siteid} %{IP:client} %{NUMBER:port} \\[%{HTTPDATE:timestamp}\\] %{DATA:txnid} %{WORD:txntype} %{WORD:resptopsp} %{WORD:bank} %{WORD:respcd} %{WORD:channel} %{NUMBER:imps} \\\"%{GREEDYDATA:stages}\\\" %{NUMBER:amount} \\\"%{GREEDYDATA:payerAddress}\\\" \\\"%{GREEDYDATA:payerBank}\\\" \\\"%{GREEDYDATA:payeeAddress}\\\" \\\"%{GREEDYDATA:payeeBank}\\\" \\\"%{GREEDYDATA:remitterRespcd}\\\" \\\"%{GREEDYDATA:beneficiaryRespcd}\\\" \\\"%{GREEDYDATA:remitterRevRespcd}\\\" \\\"%{GREEDYDATA:beneficiaryRevRespcd}\\\" \\\"%{GREEDYDATA:payerPsp}\\\" \\\"%{GREEDYDATA:payeePsp}\\\" \\\"%{GREEDYDATA:payerAuthResponseCode}\\\" \\\"%{GREEDYDATA:payeeAuthResponseCode}\\\" \\\"%{GREEDYDATA:payerSettlementAmount}\\\" \\\"%{GREEDYDATA:payeeSettlementAmount}\\\" \\\"%{GREEDYDATA:payerMobileNumber}\\\" \\\"%{GREEDYDATA:payeeMobileNumber}\\\" \\\"%{GREEDYDATA:debitRRN}\\\" \\\"%{GREEDYDATA:creditRRN}\\\" \\\"%{GREEDYDATA:custRef}\\\" \\\"%{GREEDYDATA:riskScore}\\\" \\\"%{GREEDYDATA:payerMcc}\\\" \\\"%{GREEDYDATA:payeeMcc}\\\" \\\"%{GREEDYDATA:payerCardIin}\\\" \\\"%{GREEDYDATA:payeeCardIin}\\\" \\\"%{GREEDYDATA:payerIfsc}\\\" \\\"%{GREEDYDATA:payeeIfsc}\\\" \\\"%{GREEDYDATA:payerMmid}\\\" \\\"%{GREEDYDATA:payeeMmid}\\\" \\\"%{GREEDYDATA:payerSeqNum}\\\" \\\"%{GREEDYDATA:payeeSeqNum}\\\""}}]=>[{"thread_id"=>32, "name"=>"|filterworker.0", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/jls-grok-0.11.2/lib/grok-pure.rb:177:in `match'"}]}}

It's the repeated GREEDYDATA patterns that are killing the filter, not the number of fields. You don't need GREEDYDATA here: each such string is delimited by double quotes, so "(?<stages>[^"]*)" can be used instead of "%{GREEDYDATA:stages}". Because [^"]* can never run past the closing quote, the regex engine has far less backtracking to do when its first attempt at a match fails. You should also consider the QUOTEDSTRING pattern, which handles escaped double quotes inside the strings.

It's rarely a good idea to use more than one DATA or GREEDYDATA pattern in a single expression.
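
As a rough sketch of the suggestion above, the filter could look something like the following. It is shortened to the fields up to payerBank to keep it readable; the remaining quoted fields would follow the same "(?<name>[^"]*)" form. Writing the pattern in a single-quoted string is my own choice here, so that the double quotes inside it need no escaping; the field names are taken from the pattern in the question.

```conf
filter {
  grok {
    match => {
      # Each quoted field uses a named capture with [^"]* in place of
      # %{GREEDYDATA:...}, so it stops at the next double quote.
      # Shortened sketch: only the fields up to payerBank are shown.
      "message" => '%{NUMBER:siteid} %{IP:client} %{NUMBER:port} \[%{HTTPDATE:timestamp}\] %{DATA:txnid} %{WORD:txntype} %{WORD:resptopsp} %{WORD:bank} %{WORD:respcd} %{WORD:channel} %{NUMBER:imps} "(?<stages>[^"]*)" %{NUMBER:amount} "(?<payerAddress>[^"]*)" "(?<payerBank>[^"]*)"'
    }
  }
}
```

If the quoted values can contain escaped double quotes, %{QUOTEDSTRING:stages} (without the surrounding literal quotes) may be the better fit, but as far as I know it captures the enclosing quotes as part of the field value, so they may need stripping afterwards.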
