We are running Filebeat, Logstash, Kibana and Elasticsearch on the master machine (1 machine). On the slave machines (6 machines) we are running only Filebeat, and all the data from those machines is shipped to the master machine.
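(For reference, the master side is just a standard Beats listener; this is only a rough sketch, the port is an example and not necessarily what we actually use:)

input {
  beats {
    # Filebeat on the 6 slave machines ships its events to this port on the master
    port => 5044
  }
}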
We are facing the two errors below in the Logstash console.
We think the errors occur because we are loading data from 7 servers (around 300 log files) and CPU utilization is reaching 100%.
After the timeout error Logstash stops working, and parsing and inserting data into Elasticsearch stops.
Errors:
[2017-06-28T03:04:53,176][ERROR][logstash.filters.grok ] Error while attempting to check/cancel excessively long grok patterns {:message=>"Mutex relocking by same thread", :class=>"ThreadError", :backtrace=>["org/jruby/ext/thread/Mutex.java:90:in `lock'", "org/jruby/ext/thread/Mutex.java:147:in `synchronize'", "G:/ELK_Softwares/exe/logstash-5.0.0/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.3/lib/logstash/filters/grok/timeout_enforcer.rb:38:in `stop_thread_groking'", "G:/ELK_Softwares/exe/logstash-5.0.0/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.3/lib/logstash/filters/grok/timeout_enforcer.rb:53:in `cancel_timed_out!'", "org/jruby/RubyHash.java:1342:in `each'", "G:/ELK_Softwares/exe/logstash-5.0.0/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.3/lib/logstash/filters/grok/timeout_enforcer.rb:45:in `cancel_timed_out!'", "org/jruby/ext/thread/Mutex.java:149:in `synchronize'", "G:/ELK_Softwares/exe/logstash-5.0.0/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.3/lib/logstash/filters/grok/timeout_enforcer.rb:44:in `cancel_timed_out!'", "G:/ELK_Softwares/exe/logstash-5.0.0/vendor/bundle/jruby/1.9/gems/logstash-filter-grok-3.2.3/lib/logstash/filters/grok/timeout_enforcer.rb:63:in `start!'"]}
Due to the size constraint we are posting the output configuration separately:
output {
if "trace" in [type]{
elasticsearch {
hosts => ["143.22.209.122"]
index => "sap-trace-app-logs-nprd-qa"
}
}
if "trc-prd" in [type]{
elasticsearch {
hosts => ["143.22.209.122"]
index => "sap-trace-app-logs-prd"
}
}
if "application" in [type]{
elasticsearch {
hosts => ["143.22.209.122"]
index => "sap-trace-app-logs-nprd-qa"
}
}
if "appl-prd" in [type]{
elasticsearch {
hosts => ["143.22.209.122"]
index => "sap-trace-app-logs-prd"
}
}
if "security" in [type]{
elasticsearch {
hosts => ["143.22.209.122"]
index => "sap-security-logs-nprd-qa"
}
}
if "sec-prd" in [type]{
elasticsearch {
hosts => ["143.22.209.122"]
index => "sap-security-logs-prd"
}
}
}
match => {"message" => "(?m)#%{DATA:m1}#%{DATA:DateTime}#%{DATA:Timezone}#%{DATA:Severity}#%{DATA:Category}#%{DATA:m6}#%{DATA:CustomerMessageComponent}#%{DATA:RuntimeComponent}#%{DATA:LogID}#%{DATA:CorrelationID}#%{DATA:Application}#%{DATA:Location}#%{DATA:User}#%{DATA:Session}#%{DATA:m2}#%{DATA:PassportSession}#%{DATA:PassportUserActivityID}#%{DATA:PassportConnection}#%{DATA:PassportConnectionCounter}#%{DATA:Thread}#%{DATA:m4}#%{DATA:m5}#%{GREEDYDATA:ErrorMessage}#"}
This grok expression is insanely inefficient. You really need to reduce the number of DATA and GREEDYDATA patterns. I'm positive that it's possible to construct more exact expressions that aren't so expensive to process.
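For example, an untested sketch that swaps DATA for a custom [^#]* pattern (here called NOTHASH) while keeping your field names; if your grok version lacks pattern_definitions, inline (?<m1>[^#]*) captures work the same way:

grok {
  # [^#]* can never run past the next separator, so there is no backtracking across fields
  pattern_definitions => { "NOTHASH" => "[^#]*" }
  match => { "message" => "(?m)#%{NOTHASH:m1}#%{NOTHASH:DateTime}#%{NOTHASH:Timezone}#%{NOTHASH:Severity}#%{NOTHASH:Category}#%{NOTHASH:m6}#%{NOTHASH:CustomerMessageComponent}#%{NOTHASH:RuntimeComponent}#%{NOTHASH:LogID}#%{NOTHASH:CorrelationID}#%{NOTHASH:Application}#%{NOTHASH:Location}#%{NOTHASH:User}#%{NOTHASH:Session}#%{NOTHASH:m2}#%{NOTHASH:PassportSession}#%{NOTHASH:PassportUserActivityID}#%{NOTHASH:PassportConnection}#%{NOTHASH:PassportConnectionCounter}#%{NOTHASH:Thread}#%{NOTHASH:m4}#%{NOTHASH:m5}#%{GREEDYDATA:ErrorMessage}#" }
}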
Or, since this appears to be a log with #-separated records, just use a csv filter.
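A rough sketch of that (assuming the ErrorMessage body never contains a # of its own; since each record starts with #, the first column is empty, hence the throwaway name):

csv {
  source    => "message"
  separator => "#"
  # first column is the empty string before the leading '#'
  columns   => ["discard","m1","DateTime","Timezone","Severity","Category","m6","CustomerMessageComponent","RuntimeComponent","LogID","CorrelationID","Application","Location","User","Session","m2","PassportSession","PassportUserActivityID","PassportConnection","PassportConnectionCounter","Thread","m4","m5","ErrorMessage"]
}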
1. Only the %{GREEDYDATA:ErrorMessage} field is loading too much data. It gives me this error:
create"=>{"_index"=>"sap-trace-app-logs-k8p", "_type"=>"trc-prd", "
_id"=>"AV02Zi5YHAmgJsrjBUjd", "status"=>400, "error"=>{"type"=>"illegal_argument exception", "reason"=>"Document contains at least one immense term in field="p
rd_ErrorMessage" (whose UTF8 encoding is longer than the max length 32766), all
of which were skipped. Please correct the analyzer to not produce such terms.
The prefix of the first immense term is: '[10, 83, 105, 110, 103, 108, 101, 32,
67, 108, 105, 99, 107, 32, 65, 112, 112, 114, 111, 118, 97, 108, 32, 58, 32, 10
9, 80, 111, 115, 116]...', original message: bytes can be at most 32766 in lengt
h; got 106157", "caused_by"=>{"type"=>"max_bytes_length_exceeded_exception", "re
ason"=>"bytes can be at most 32766 in length; got 106157"}}}}}
2. I applied "ignore_above": 256 to this field in the index, which resolves my timeout error, but I don't want to lose data. "ignore_above": 256 will ignore the whole record.
3. I can also see CPU utilization reaching 100%.
4. While loading the dashboard in Kibana I am facing the error below:
RemoteTransportException[[Corona][143.22.209.122:9300][indices:data/read/search[phase/query]]]; nested: CircuitBreakingException[[request] Data too large, data for [] would be larger than limit of [381891379/364.1mb]];
Caused by: CircuitBreakingException[[request] Data too large, data for [<reused_arrays>] would be larger than limit of [381891379/364.1mb]]
5. Getting a "Visualize: Request Timeout after 30000ms" error while loading the dashboard.
I think all of these are related to that field. Is there a way I can resolve this without losing data?
filter
{
if "trace" in [type]{
grok {
match => {"message" => "(?m)#%{DATA:m1}#%{DATA:DateTime}#%{DATA:Timezone}#%{DATA:Severity}#%{DATA:Category}#%{DATA:m6}#%{DATA:CustomerMessageComponent}#%{DATA:RuntimeComponent}#%{DATA:LogID}#%{DATA:CorrelationID}#%{DATA:Application}#%{DATA:Location}#%{DATA:User}#%{DATA:Session}#%{DATA:m2}#%{DATA:PassportSession}#%{DATA:PassportUserActivityID}#%{DATA:PassportConnection}#%{DATA:PassportConnectionCounter}#%{DATA:Thread}#%{DATA:m4}#%{DATA:m5}#%{GREEDYDATA:ErrorMessage}#"}
}
mutate
{ #remove_tag => ["multiline"],
remove_field => ["m1","m4","m6"]
strip => ["DateTime"]
}
date {
match => [ "DateTime", "YYYY MM dd HH:mm:ss:SSS" ]
timezone => "EST"
target => "DateTime"
}
}
if "application" in [type]{
grok {
match => {"message" => "(?m)#%{DATA:m1}#%{DATA:DateTime}#%{DATA:Timezone}#%{DATA:Severity}#%{DATA:Category}#%{DATA:m6}#%{DATA:CustomerMessageComponent}#%{DATA:RuntimeComponent}#%{DATA:LogID}#%{DATA:CorrelationID}#%{DATA:Application}#%{DATA:Location}#%{DATA:User}#%{DATA:Session}#%{DATA:m2}#%{DATA:PassportSession}#%{DATA:PassportUserActivityID}#%{DATA:PassportConnection}#%{DATA:PassportConnectionCounter}#%{DATA:Thread}#%{DATA:m4}#%{DATA:m5}#%{GREEDYDATA:ErrorMessage}#"}
}
mutate
{ #remove_tag => ["multiline"],
remove_field => ["m1","m4","m6"]
strip => ["DateTime"]
}
date {
match => [ "DateTime", "YYYY MM dd HH:mm:ss:SSS" ]
timezone => "EST"
target => "DateTime"
}
}
if "security" in [type]{
grok {
match => {"message" => "(?m)#%{DATA:security_m1}#%{DATA:security_DateTime}#%{DATA:security_Timezone}#%{DATA:security_Severity}#%{DATA:security_Category}#%{DATA:security_m6}#%{DATA:security_CustomerMessageComponent}#%{DATA:security_RuntimeComponent}#%{DATA:security_LogID}#%{DATA:security_CorrelationID}#%{DATA:security_Application}#%{DATA:security_Location}#%{DATA:security_User}#%{DATA:security_Session}#%{DATA:security_m2}#%{DATA:security_PassportSession}#%{DATA:security_PassportUserActivityID}#%{DATA:security_PassportConnection}#%{DATA:security_PassportConnectionCounter}#%{DATA:security_Thread}#%{DATA:security_m4}#%{DATA:security_m5}#%{GREEDYDATA:security_RemainingMessage}"}
}
if "LOGIN." in [security_RemainingMessage] {
grok {
match => {"security_RemainingMessage" => "%{DATA:security_ErrorMessage}\nUser: %{DATA:security_LoginUser}\nIP Address: %{DATA:security_IPAddress}\n%{GREEDYDATA:security_FinalRemainingMessage}"}
}
}
else {
grok {
match => {"security_RemainingMessage" => "%{DATA:security_ErrorMessage}#%{GREEDYDATA:security_FinalRemainingMessage}"}
}
}
mutate
{
#remove_tag => ["multiline"],
remove_field => ["security_m1","security_m4","security_m6","security_FinalRemainingMessage","security_RemainingMessage"]
rename => {"host" => "hostname"}
strip => ["security_DateTime"]
}
date {
match => [ "security_DateTime", "YYYY MM dd HH:mm:ss:SSS" ]
timezone => "EST"
target => "security_DateTime"
}
}
}
output {
if "trace" in [type]{
elasticsearch {
hosts => ["143.22.209.122"]
index => "test"
}
}
if "application" in [type]{
elasticsearch {
hosts => ["143.22.209.122"]
index => "test"
}
}
if "security" in [type]{
elasticsearch {
hosts => ["143.22.209.122"]
index => "security-test"
}
}
}
Did you understand anything of what I wrote last time? You still have too many DATA and GREEDYDATA patterns in your grok filters. You need to fix that. Over and out.
output {
if "trace" in [type]{
elasticsearch {
hosts => ["10.103.20.64"]
index => "csvconfigfour"
} } }
but still I am not getting data in the correct fields.