Logstash keeps restarting and loads the CPU

Hello!

I have a CentOS server running Logstash + JDBC + Elasticsearch to sync data between a MySQL database and Elasticsearch. It works great, but I have noticed that Logstash puts a heavy load on the CPU. I read the related topics here and found that Logstash keeps restarting, but I have not found an answer on how to fix the problem.

This is what I have in my /var/log/logstash/logstash-plain.log:

[2018-04-24T10:16:32,085][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"netflow", :directory=>"/usr/share/logstash/modules/netflow/configuration"}
[2018-04-24T10:16:32,087][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"/usr/share/logstash/modules/fb_apache/configuration"}
[2018-04-24T10:16:32,675][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2018-04-24T10:16:32,678][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2018-04-24T10:16:32,749][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2018-04-24T10:16:32,796][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2018-04-24T10:16:32,799][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>50001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true, "norms"=>false}, "dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date", "include_in_all"=>false}, "@version"=>{"type"=>"keyword", "include_in_all"=>false}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2018-04-24T10:16:32,806][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost"]}
[2018-04-24T10:16:32,810][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>1000}
[2018-04-24T10:16:33,066][INFO ][logstash.pipeline        ] Pipeline main started
[2018-04-24T10:16:33,304][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2018-04-24T10:16:34,402][INFO ][logstash.inputs.jdbc     ] (0.516000s) SELECT * FROM products
[2018-04-24T10:16:39,699][WARN ][logstash.agent           ] stopping pipeline {:id=>"main"}

This sequence repeats throughout the whole log.

If I run top in the shell, I get this:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
13390 logstash  39  19 4205680 246084  13892 S 363.1  0.8   0:13.88 java

The PID changes every time because the process keeps restarting.

The %CPU value is 363.1, and this process sits at the top of the top output.

Do you know what I can do about this?

What does your jdbc input plugin configuration look like?

input {
    jdbc {
        jdbc_connection_string => "jdbc:mysql://localhost:3306/store"
        jdbc_user => "user"
        jdbc_password => "password"
        jdbc_validate_connection => true
        jdbc_driver_library => "/usr/share/java/mysql-connector-java.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        statement => "SELECT * FROM products"
    }
}
output {
    elasticsearch {
        #protocol => http
        index => "products"
        document_type => "product"
        document_id => "%{id}"
        hosts => "localhost"
    }
}

Unless you use the jdbc input's schedule option, Logstash will shut down after the query has been processed once. Then systemd (or whatever you're running) will fire it up again.

Oh, thank you!

If anybody needs it later, here is an example of how to use the schedule option (from Jdbc input plugin | Logstash Reference [8.11] | Elastic):

input {
    jdbc {
        jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
        jdbc_user => "mysql"
        parameters => { "favorite_artist" => "Beethoven" }
        schedule => "* * * * *"
        statement => "SELECT * from songs where artist = :favorite_artist"
    }
}
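
For the setup described in this thread, the same option could be added to the original jdbc input. This is only a sketch: the once-a-minute schedule is an assumed value, not something from the original configuration, so adjust the cron expression to how often the products table needs to be synced.

```
input {
    jdbc {
        jdbc_connection_string => "jdbc:mysql://localhost:3306/store"
        jdbc_user => "user"
        jdbc_password => "password"
        jdbc_validate_connection => true
        jdbc_driver_library => "/usr/share/java/mysql-connector-java.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        # Assumed interval: run the query once a minute (cron-like syntax).
        # With a schedule set, Logstash keeps running between queries
        # instead of exiting after the first one and being restarted.
        schedule => "* * * * *"
        statement => "SELECT * FROM products"
    }
}
```

With SELECT * every scheduled run reads the whole table again. Because the output sets document_id => "%{id}", the rows should overwrite the existing documents in the products index rather than create duplicates.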
