Hello!
I have a CentOS server running Logstash with the JDBC input plugin and Elasticsearch, to sync data between a MySQL database and Elasticsearch. The sync itself works, but I have noticed that Logstash puts a very high load on the CPU. I read through the related topics here and found out that Logstash keeps restarting, but I have not found an answer on how to fix it.
This is what I have in /var/log/logstash/logstash-plain.log:
[2018-04-24T10:16:32,085][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"netflow", :directory=>"/usr/share/logstash/modules/netflow/configuration"}
[2018-04-24T10:16:32,087][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"/usr/share/logstash/modules/fb_apache/configuration"}
[2018-04-24T10:16:32,675][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2018-04-24T10:16:32,678][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2018-04-24T10:16:32,749][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2018-04-24T10:16:32,796][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2018-04-24T10:16:32,799][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>50001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2018-04-24T10:16:32,806][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost"]}
[2018-04-24T10:16:32,810][INFO ][logstash.pipeline ] Starting pipeline {"id"=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>1000}
[2018-04-24T10:16:33,066][INFO ][logstash.pipeline ] Pipeline main started
[2018-04-24T10:16:33,304][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2018-04-24T10:16:34,402][INFO ][logstash.inputs.jdbc ] (0.516000s) SELECT * FROM products
[2018-04-24T10:16:39,699][WARN ][logstash.agent ] stopping pipeline {:id=>"main"}
This same sequence repeats throughout the whole log.
If I run top in the shell, I get this:
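To put a number on how often the restart cycle happens, something like the following could be used. It is only a rough sketch: the function name count_pipeline_events is made up for illustration, and the regular expression is an assumption based on the line format in the excerpt above (timestamp, level, logger, message), so it may need adjusting for other Logstash versions.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt above:
# [timestamp][LEVEL ][logger.name ] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<logger>[^\]]+?)\s*\]\s*"
    r"(?P<msg>.*)$"
)

def count_pipeline_events(lines):
    """Count 'Pipeline main started' and 'stopping pipeline' messages."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not follow the assumed layout
        msg = match.group("msg")
        if msg.startswith("Pipeline main started"):
            counts["started"] += 1
        elif msg.startswith("stopping pipeline"):
            counts["stopped"] += 1
    return counts
```

Passing it the open log file, for example count_pipeline_events(open("/var/log/logstash/logstash-plain.log")), should give roughly equal "started" and "stopped" counts if the pipeline really stops after every run.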
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13390 logstash 39 19 4205680 246084 13892 S 363.1 0.8 0:13.88 java
The PID changes every time, because the process keeps restarting.
The 363.1 is the %CPU value, and this java process is always at the top of the top output.
Does anyone know what I can do about this?