Hi,
I need some guidance on speeding up Logstash startup - or at least I would like to know why it takes so long and whether that is expected.
[2018-03-27T07:18:48,795][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://139.1.117.41:19200/"}
[2018-03-27T07:18:48,804][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2018-03-27T07:18:48,804][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6}
[2018-03-27T07:18:48,810][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2018-03-27T07:18:48,811][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2018-03-27T07:18:48,815][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//139.1.117.41:19200"]}
[2018-03-27T07:18:48,939][INFO ][logstash.inputs.redis ] Registering Redis {:identity=>"redis://@139.1.117.41:6379/0 list:tapdispatcher"}
[2018-03-27T07:18:49,574][INFO ][logstash.pipeline ] Pipeline started succesfully {:pipeline_id=>"tapdispatcher", :thread=>"#<Thread:0x7f00ea0e run>"}
[2018-03-27T07:19:18,429][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"tmadmin_usercount", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2018-03-27T07:19:18,466][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://139.1.117.41:19200/]}}
[2018-03-27T07:19:18,467][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://139.1.117.41:19200/, :path=>"/"}
[2018-03-27T07:19:18,472][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://139.1.117.41:19200/"}
[2018-03-27T07:19:18,755][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2018-03-27T07:19:18,756][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>6}
[2018-03-27T07:19:19,015][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2018-03-27T07:19:19,017][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
I have a Redis instance as a message broker. Each logfile type is mapped to a Redis key (list).
Logstash has multiple pipelines (one per key).
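For reference, this is roughly how the setup looks. The file names and paths below are simplified placeholders, not my exact configuration:

```
# pipelines.yml — one pipeline per Redis key (names are examples)
- pipeline.id: tapdispatcher
  path.config: "/etc/logstash/conf.d/tapdispatcher.conf"
- pipeline.id: tmadmin_usercount
  path.config: "/etc/logstash/conf.d/tmadmin_usercount.conf"
```

```
# tapdispatcher.conf — reads one Redis list, writes to Elasticsearch
input {
  redis {
    host      => "139.1.117.41"
    port      => 6379
    data_type => "list"
    key       => "tapdispatcher"
  }
}
output {
  elasticsearch {
    hosts => ["http://139.1.117.41:19200"]
  }
}
```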
It takes about 30 seconds to start each pipeline. The interesting thing is that the time is lost between the log lines "Pipeline started" and "Starting pipeline", i.e. there is a delay after pipeline n has finished starting and before pipeline n+1 begins to start.
This is running on Red Hat 7 in Docker, but I see the same issue when running Logstash directly, outside of Docker.