Logstash not ingesting fast-moving log

Hi All,

We have a very fast-moving, rolling log file on our Linux server; a job gzips and rotates it every hour.

I enabled filebeat output to console using the following in filebeat.yml:

output.console:
   pretty: true

On the console I could see the relevant messages flowing out of Filebeat very quickly and on to Logstash.

I then enabled debug on the Logstash side and checked whether any messages for that particular feed (proxy_statslog) were being processed. I see repeated occurrences of the following message, but no index is getting created in ES:

[2024-06-20T16:59:08,129][DEBUG][org.logstash.config.ir.CompiledPipeline][main] Compiled output
 P[output-elasticsearch{"hosts"=>["https://xxx.xx.xx.x:43044", "https://xxx.xx.xx.x:43044", "https://xxx.xx.xx.xx:43044", "https://xxx.xx.xx.xx:43044"], "ssl"=>"true", "cacert"=>"/opt/total/elasticsearch/config/certs/http_ca.crt", "user"=>"elastic", "password"=>"*****************", "index"=>"proxy_statslog-%{+YYYY.MM.dd}", "document_type"=>"%{[@metadata][type]}"}|[file]/opt/total/logstash/config/mergedlogstash_uat.conf:2530:5:```
elasticsearch {
     hosts => [ "https://xxx.xx.xx.xx:43044" ]
     hosts => [ "https://xxx.xx.xx.xx:43044" ]
     hosts => [ "https://xxx.xx.xx.xx:43044" ]
     hosts => [ "https://xxx.xx.xx.xx:43044" ]
     ssl => true
     cacert => "/opt/total/elasticsearch/config/certs/http_ca.crt"
     user => "elastic"
     password => "*****************"
     index => "proxy_statslog-%{+YYYY.MM.dd}"
      document_type => "%{[@metadata][type]}"
   }
```]

Please help in finding the cause of this.

Thanks

That message is a DEBUG message indicating that an elasticsearch output was configured. If it is repeatedly being logged, it might suggest that Logstash is restarting.

Can you show us your logstash configuration and more of the logstash log?

Thanks. Shall I keep DEBUG enabled? It creates a large log file.

I would start without debug, but it may be needed later. If the log file is large you can use a file sharing site instead of posting it in this thread.

The log is reproduced below. Apologies if it's too big:

Using bundled JDK: /opt/total/logstash/jdk
Sending Logstash logs to /opt/total/logstash/logs which is now configured via log4j2.properties
[2024-06-20T19:00:56,814][INFO ][logstash.runner          ] Log4j configuration path used is: /opt/total/logstash/config/log4j2.properties
[2024-06-20T19:00:56,821][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"8.6.2", "jruby.version"=>"jruby 9.3.10.0 (2.6.8) 2023-02-01 107b2e6697 OpenJDK 64-Bit Server VM 17.0.6+10 on 17.0.6+10 +indy +jit [x86_64-linux]"}
[2024-06-20T19:00:56,824][INFO ][logstash.runner          ] JVM bootstrap flags: [-Xms6g, -Xmx6g, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djruby.compile.invokedynamic=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true, -Djruby.regexp.interruptible=true, -Djdk.io.File.enableADS=true, --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED, --add-opens=java.base/java.security=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.management/sun.management=ALL-UNNAMED]
[2024-06-20T19:00:57,017][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2024-06-20T19:00:57,630][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[2024-06-20T19:01:05,829][INFO ][org.reflections.Reflections] Reflections took 100 ms to scan 1 urls, producing 127 keys and 444 values
[2024-06-20T19:01:08,847][WARN ][logstash.outputs.elasticsearch] You are using a deprecated config setting "document_type" set in elasticsearch. Deprecated settings will continue to work, but are scheduled for removal from logstash in the future. Document types are being deprecated in Elasticsearch 6.0, and removed entirely in 7.0. You should avoid this feature If you have any questions about this, please visit the #logstash channel on freenode irc. {:name=>"document_type", :plugin=><LogStash::Outputs::ElasticSearch password=><password>, hosts=>[https://xxx.xx.xx.xx:43044, https://xxx.xx.xx.xx:43044, https://xxx.xx.xx.xx:43044, https://xxx.xx.xx.xx:43044], cacert=>"/opt/total/elasticsearch/config/certs/http_ca.crt", index=>"proxy_statslog-%{+YYYY.MM.dd}", id=>"e9dff330011ca662aba25c47bc4235c701b0f294d909e8539e5cc0843f82e666", ssl=>true, user=>"elastic", document_type=>"%{[@metadata][type]}", enable_metric=>true, codec=><LogStash::Codecs::Plain id=>"plain_ab56725a-efe9-41a4-b32b-1be433b7a2a6", enable_metric=>true, charset=>"UTF-8">, workers=>1, ssl_certificate_verification=>true, sniffing=>false, sniffing_delay=>5, timeout=>60, pool_max=>1000, pool_max_per_route=>100, resurrect_delay=>5, validate_after_inactivity=>10000, http_compression=>false, retry_initial_interval=>2, retry_max_interval=>64, data_stream_type=>"logs", data_stream_dataset=>"generic", data_stream_namespace=>"default", data_stream_sync_fields=>true, data_stream_auto_routing=>true, manage_template=>true, template_overwrite=>false, template_api=>"auto", doc_as_upsert=>false, script_type=>"inline", script_lang=>"painless", script_var_name=>"event", scripted_upsert=>false, retry_on_conflict=>1, ilm_enabled=>"auto", ilm_pattern=>"{now/d}-000001", ilm_policy=>"logstash-policy", dlq_on_failed_indexname_interpolation=>true>}
[2024-06-20T19:01:08,917][INFO ][logstash.javapipeline    ] Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[2024-06-20T19:01:08,928][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["https://xxx.xx.xx.xx:43044", "https://xxx.xx.xx.xx:43044", "https://xxx.xx.xx.xx:43044", "https://xxx.xx.xx.xx:43044"]}
[2024-06-20T19:01:09,026][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[https://elastic:xxxxxx@xxx.xx.xx.xx:43044/, https://elastic:xxxxxx@xxx.xx.xx.xx:43044/, https://elastic:xxxxxx@xxx.xx.xx.xx:43044/, https://elastic:xxxxxx@xxx.xx.xx.xx:43044/]}}
[2024-06-20T19:01:09,308][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@xxx.xx.xx.xx:43044/"}
[2024-06-20T19:01:09,314][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch version determined (8.7.0) {:es_version=>8}
[2024-06-20T19:01:09,315][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>8}
[2024-06-20T19:01:09,371][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@xxx.xx.xx.xx:43044/"}
[2024-06-20T19:01:09,484][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@xxx.xx.xx.xx:43044/"}
[2024-06-20T19:01:09,566][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@xxx.xx.xx.xx:43044/"}
[2024-06-20T19:01:16,615][INFO ][logstash.outputs.elasticsearch][main] Not eligible for data streams because config contains one or more settings that are not compatible with data streams: {"index"=>"uat_epf_msg_log"}
[2024-06-20T19:01:16,616][INFO ][logstash.outputs.elasticsearch][main] Data streams auto configuration (`data_stream => auto` or unset) resolved to `false`
[2024-06-20T19:01:16,617][WARN ][logstash.outputs.elasticsearch][main] Elasticsearch Output configured with `ecs_compatibility => v8`, which resolved to an UNRELEASED preview of version 8.0.0 of the Elastic Common Schema. Once ECS v8 and an updated release of this plugin are publicly available, you will need to update this plugin to resolve this warning.
[2024-06-20T19:01:16,618][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["https://xxx.xx.xx.xx:43044", "https://xxx.xx.xx.xx:43044", "https://xxx.xx.xx.xx:43044", "https://xxx.xx.xx.xx:43044"]}
[2024-06-20T19:01:16,625][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[https://elastic:xxxxxx@xxx.xx.xx.xx:43044/, https://elastic:xxxxxx@xxx.xx.xx.xx:43044/, https://elastic:xxxxxx@xxx.xx.xx.xx:43044/, https://elastic:xxxxxx@xxx.xx.xx.xx:43044/]}}
[2024-06-20T19:01:16,626][INFO ][logstash.outputs.elasticsearch][main] Using a default mapping template {:es_version=>8, :ecs_compatibility=>:v8}
[2024-06-20T19:01:16,656][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@xxx.xx.xx.xx:43044/"}
[2024-06-20T19:01:16,660][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch version determined (8.7.0) {:es_version=>8}
[2024-06-20T19:01:16,660][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>8}
[2024-06-20T19:01:16,708][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@xxx.xx.xx.xx:43044/"}
[2024-06-20T19:01:16,751][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@xxx.xx.xx.xx:43044/"}
[2024-06-20T19:01:16,797][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@xxx.xx.xx.xx:43044/"}
[2024-06-20T19:01:17,215][INFO ][logstash.outputs.elasticsearch][main] Not eligible for data streams because config contains one or more settings that are not compatible with data streams: {"index"=>"proxy_statslog-%{+YYYY.MM.dd}", "document_type"=>"%{[@metadata][type]}"}
[2024-06-20T19:01:22,115][INFO ][logstash.outputs.elasticsearch][main] Data streams auto configuration (`data_stream => auto` or unset) resolved to `false`
[2024-06-20T19:01:22,116][WARN ][logstash.outputs.elasticsearch][main] Elasticsearch Output configured with `ecs_compatibility => v8`, which resolved to an UNRELEASED preview of version 8.0.0 of the Elastic Common Schema. Once ECS v8 and an updated release of this plugin are publicly available, you will need to update this plugin to resolve this warning.
[2024-06-20T19:01:22,126][WARN ][logstash.filters.grok    ][main] ECS v8 support is a preview of the unreleased ECS v8, and uses the v1 patterns. When Version 8 of the Elastic Common Schema becomes available, this plugin will need to be updated
[2024-06-20T19:01:22,136][INFO ][logstash.outputs.elasticsearch][main] Using a default mapping template {:es_version=>8, :ecs_compatibility=>:v8}
[2024-06-20T19:01:22,216][WARN ][logstash.filters.grok    ][main] ECS v8 support is a preview of the unreleased ECS v8, and uses the v1 patterns. When Version 8 of the Elastic Common Schema becomes available, this plugin will need to be updated
[2024-06-20T19:01:22,246][WARN ][logstash.filters.grok    ][main] ECS v8 support is a preview of the unreleased ECS v8, and uses the v1 patterns. When Version 8 of the Elastic Common Schema becomes available, this plugin will need to be updated
[2024-06-20T19:01:22,291][WARN ][logstash.filters.grok    ][main] ECS v8 support is a preview of the unreleased ECS v8, and uses the v1 patterns. When Version 8 of the Elastic Common Schema becomes available, this plugin will need to be updated
[2024-06-20T19:01:22,317][WARN ][logstash.filters.grok    ][main] ECS v8 support is a preview of the unreleased ECS v8, and uses the v1 patterns. When Version 8 of the Elastic Common Schema becomes available, this plugin will need to be updated
[2024-06-20T19:01:22,387][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, "pipeline.sources"=>["/opt/total/logstash/config/mergedlogstash_uat.conf"], :thread=>"#<Thread:0x33c754c4@/opt/total/logstash-8.6.2/logstash-core/lib/logstash/java_pipeline.rb:131 run>"}
[2024-06-20T19:01:28,773][INFO ][logstash.javapipeline    ][main] Pipeline Java execution initialization time {"seconds"=>6.38}
[2024-06-20T19:01:28,829][INFO ][logstash.inputs.beats    ][main] Starting input listener {:address=>"0.0.0.0:5044"}
[2024-06-20T19:01:28,837][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2024-06-20T19:01:28,866][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2024-06-20T19:01:28,967][INFO ][org.logstash.beats.Server][main][d765812c289289352c8eb26386f720f01008f2fd08d2ee7fde1217b9bd7a5920] Starting server on port: 5044
/opt/total/logstash-8.6.2/vendor/bundle/jruby/2.6.0/gems/manticore-0.9.1-java/lib/manticore/client.rb:284: warning: already initialized constant Manticore::Client::HttpPost
/opt/total/logstash-8.6.2/vendor/bundle/jruby/2.6.0/gems/manticore-0.9.1-java/lib/manticore/client.rb:284: warning: already initialized constant Manticore::Client::HttpPost
/opt/total/logstash-8.6.2/vendor/bundle/jruby/2.6.0/gems/manticore-0.9.1-java/lib/manticore/client.rb:536: warning: already initialized constant Manticore::Client::StringEntity
/opt/total/logstash-8.6.2/vendor/bundle/jruby/2.6.0/gems/manticore-0.9.1-java/lib/manticore/client.rb:536: warning: already initialized constant Manticore::Client::StringEntity
/opt/total/logstash-8.6.2/vendor/bundle/jruby/2.6.0/gems/manticore-0.9.1-java/lib/manticore/client.rb:284: warning: already initialized constant Manticore::Client::HttpPost
[2024-06-20T19:01:43,926][INFO ][logstash.outputs.file    ][main][085e2f0de34ccc07275ef5b65528a683dac48e8f499fcb3f4de3b8375b51fb24] Opening file {:path=>"/tmp/tbstat.log"}



Redacted logstash.yml is as below:

input{
        beats {
             type => beats
             port => 5044
             client_inactivity_timeout => 0
         }
}
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

        if [type] == "proxy_statslog"  {
                mutate {
                        split => ["message", "~|~"]
                        add_field =>{
                           "createdTime" => "%{[message][0]}"
                           "type" => "%{[message][1]}"
                           "typeValue" => "%{[message][2]}"
                           "message" => "%{[message][3]}"
                       }
                }

        }
        mutate{
              convert => {
                          "type" => "string"
                          "typeValue" => "string"
                   }
              }

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

    if [type] == "proxy_statslog"{

     file {
            path => "/tmp/zztop.log"
            codec => rubydebug
           }

    elasticsearch {
     hosts => [ "https://xxx.xx.xx.xx:43044" ]
     hosts => [ "https://xxx.xx.xx.xx:43044" ]
     hosts => [ "https://xxx.xx.xx.xx:43044" ]
     hosts => [ "https://xxx.xx.xx.xx:43044" ]
     ssl => true
     cacert => "/opt/total/elasticsearch/config/certs/http_ca.crt"
     user => "elastic"
     password => "xxxxxxxxxxxxxxxxxxxxxx"
     index => "proxy_statslog-%{+YYYY.MM.dd}"
      document_type => "%{[@metadata][type]}"
   }
 }
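For reference, the split/add_field combination above behaves roughly like the following Python sketch on one representative proxy_statslog line (illustration only, not Logstash code; note that the separator in the log has spaces around it, which this filter does not remove):

```python
# Rough illustration (Python, not Logstash) of what the mutate split +
# add_field combination does to one representative proxy_statslog line.
line = "2024-06-21 09:00:00,062 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 15"

# mutate split turns the message into an array on the literal "~|~"
parts = line.split("~|~")

# add_field then copies the array elements into named fields; since the
# split token has no surrounding spaces, the values keep leading/trailing
# whitespace from the original line
event = {
    "createdTime": parts[0],   # "2024-06-21 09:00:00,062 "
    "type": parts[1],          # " Total-Inflight-Request "
    "typeValue": parts[2],     # " Inflight "
    "message": parts[3],       # " 15"
}
```

A mutate strip on those fields would remove the stray spaces.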

Thanks

What are your Elasticsearch specs, like CPU, Memory, Heap and disk type?

These can also impact Logstash performance; if Elasticsearch cannot process the events fast enough, it will tell Logstash to back off.

Also, where is the rest of your pipeline? In the log you shared there are at least 2 different elasticsearch outputs:

[2024-06-20T19:01:16,615][INFO ][logstash.outputs.elasticsearch][main] Not eligible for data streams because config contains one or more settings that are not compatible with data streams: {"index"=>"uat_epf_msg_log"}
[2024-06-20T19:01:17,215][INFO ][logstash.outputs.elasticsearch][main] Not eligible for data streams because config contains one or more settings that are not compatible with data streams: {"index"=>"proxy_statslog-%{+YYYY.MM.dd}", "document_type"=>"%{[@metadata][type]}"}

Without seeing the entire pipeline it is not clear if you have any filter that may be slowing down the processing.

Another thing: can you share a sample message for the type proxy_statslog? I don't think this will help much, but your mutate filter could probably be replaced by a dissect filter.

Have you already tried increasing pipeline.batch.size? The default of 125 can be pretty low in some cases; for pipelines with a fast event rate it can generate a lot of requests, which will impact performance.

ES and Logstash reside on the same hosts.

Red Hat Enterprise Linux release 8.9
CPU --> 8 core
Memory --> 93GB
ES Heap --> Default --> -Xms31744m -Xmx31744m
Disk is a local volume mounted on a cloud VM

Identical output is shown for all the rest of the configured app types, e.g.:

[2024-06-20T19:01:11,450][INFO ][logstash.outputs.elasticsearch][main] Not eligible for data streams because config contains one or more settings that are not compatible with data streams: {"index"=>"uat_fhl0by_webdmz_access"}
[2024-06-20T19:01:11,450][INFO ][logstash.outputs.elasticsearch][main] Data streams auto configuration (`data_stream => auto` or unset) resolved to `false`
[2024-06-20T19:01:11,451][WARN ][logstash.outputs.elasticsearch][main] Elasticsearch Output configured with `ecs_compatibility => v8`, which resolved to an UNRELEASED preview of version 8.0.0 of the Elastic Common Schema. Once ECS v8 and an updated release of this plugin are publicly available, you will need to update this plugin to resolve this warning.
[2024-06-20T19:01:11,451][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["https://xxx.xx.xx.xx:43044", "https://xxx.xx.xx.xx:43044", "https://xxx.xx.xx.xx:43044", "https://xxx.xx.xx.xx:43044"]}
[2024-06-20T19:01:11,457][INFO ][logstash.outputs.elasticsearch][main] Using a default mapping template {:es_version=>8, :ecs_compatibility=>:v8}
[2024-06-20T19:01:11,458][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[https://elastic:xxxxxx@xxx.xx.xx.xx:43044/, https://elastic:xxxxxx@xxx.xx.xx.xx:43044/, https://elastic:xxxxxx@xxx.xx.xx.xx:43044/, https://elastic:xxxxxx@xxx.xx.xx.xx:43044/]}}
[2024-06-20T19:01:11,485][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@xxx.xx.xx.xx:43044/"}
[2024-06-20T19:01:11,488][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch version determined (8.7.0) {:es_version=>8}
[2024-06-20T19:01:11,489][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>8}
[2024-06-20T19:01:11,535][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@xxx.xx.xx.xx:43044/"}
[2024-06-20T19:01:11,609][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@xxx.xx.xx.xx:43044/"}
[2024-06-20T19:01:11,660][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@xxx.xx.xx.xx:43044/"}

Please note there are a very large number of different index types (e.g. "index"=>"uat_fhl0by_webdmz_access") configured in the logstash config file, which I guess calls for tuning Logstash.

Here is sample output from the very fast-moving log file:

2024-06-21 09:00:00,062 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 15
2024-06-21 09:00:00,072 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 16
2024-06-21 09:00:00,072 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 17
2024-06-21 09:00:00,084 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 18
2024-06-21 09:00:00,130 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 19
2024-06-21 09:00:00,134 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 20
2024-06-21 09:00:00,160 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 20
2024-06-21 09:00:00,191 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 20
2024-06-21 09:00:00,192 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 21
2024-06-21 09:00:00,206 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 22
2024-06-21 09:00:00,207 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 23
2024-06-21 09:00:00,209 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 24
2024-06-21 09:00:00,212 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 25
2024-06-21 09:00:00,266 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 19
2024-06-21 09:00:00,295 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 20
2024-06-21 09:00:00,315 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 21
2024-06-21 09:00:00,317 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 22
2024-06-21 09:00:00,317 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 23
2024-06-21 09:00:00,317 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 24
2024-06-21 09:00:00,341 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 23
2024-06-21 09:00:00,351 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 24
2024-06-21 09:00:00,352 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 25
2024-06-21 09:00:00,353 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 26
2024-06-21 09:00:00,365 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 26
2024-06-21 09:00:00,371 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 27
2024-06-21 09:00:00,385 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 26
2024-06-21 09:00:00,395 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 27
2024-06-21 09:00:00,412 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 27
2024-06-21 09:00:00,452 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 27
2024-06-21 09:00:00,468 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 25
2024-06-21 09:00:00,541 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 23
2024-06-21 09:00:00,544 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 24
2024-06-21 09:00:00,547 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 25
2024-06-21 09:00:00,549 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 26
2024-06-21 09:00:00,550 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 27
2024-06-21 09:00:00,585 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 24
2024-06-21 09:00:00,599 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 24
2024-06-21 09:00:00,606 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 25
2024-06-21 09:00:00,607 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 26
2024-06-21 09:00:00,610 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 27
2024-06-21 09:00:00,621 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 26
2024-06-21 09:00:00,631 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 27

Can you please guide me on how I could do that? Shall I just update the pipelines.yml file, and with what values? My Logstash startup script refers to logstash.yml. How can I ensure that changes made in pipelines.yml are read by Logstash? (Pardon my limited knowledge.)

Thanks

Yeah, but what is the type? Is it backed by SSD or HDD? HDD can be bad for performance.

How are you starting Logstash? Normally you would run Logstash as a service on the server; this way it would use the pipelines.yml file to load the pipelines it needs to run.

You need to provide more context: you didn't share your entire pipeline configuration as asked. Also share your logstash.yml.

These are virtual disks on cloud VMs. This setup has been running for a long time now.

We are using a script that internally runs the following:

./logstash -r -f $LOGSTASH_CONFIG/mergedlogstash_uat.conf --path.data $LOGSTASH_PATH  > $LOGSTASH_HOME/logs/logstash-"$(date +"%Y_%m_%d_%I_%M_%p").log" 2>&1 &

The entire pipelines.yml is commented out and not being used.

Logstash.yml (mergedlogstash_uat.conf in our case) has become so long over time that it exceeds the allowable limit to post here completely. I pasted an edited version of mergedlogstash_uat.conf above. It's pretty much the same throughout, with the type changing along with the index name in the following stanza:

  if [type] == "proxy_statslog"{

Please guide me on how to tweak the pipeline.

Thanks

They will still be backed by HDD or SSD disks at the cloud provider; for example, in GCP, Standard persistent disks are backed by HDD drives and Balanced persistent disks are backed by SSD drives.

The disk type will affect the read and write speed, which will in turn affect how fast Logstash can read your files and how fast Elasticsearch can write them into indices, so you need to check what disk type is being used on your VM.

If this has been running for a long time, when did this problem start? What changed? Did the volume of the files increase?

To change the batch size when running from the CLI the way you are, you need to use the -b flag, as described in the documentation.

So you may try changing it by adding -b SIZE to your command; start with -b 250 and see if things improve.
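Applied to the startup command shared earlier in the thread, that would look something like this (250 is a starting point to experiment with, not a tuned value):

```shell
./logstash -r -b 250 -f $LOGSTASH_CONFIG/mergedlogstash_uat.conf --path.data $LOGSTASH_PATH \
  > $LOGSTASH_HOME/logs/logstash-"$(date +"%Y_%m_%d_%I_%M_%p").log" 2>&1 &
```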

Keep in mind that logstash.yml is not the same thing as the pipeline configuration. Logstash has basically 3 main types of files: logstash.yml, which has settings for the Logstash execution; pipelines.yml, which controls which pipelines are going to be executed; and the .conf files, which are the configuration files with inputs, filters and outputs.

Nothing you have shared so far provides any hint of a performance issue on the Logstash side. If you want, you can share your full configuration file using a gist on GitHub.

As mentioned, without looking at the full pipeline it is not possible to know exactly what Logstash is doing and what might be an issue.

I'm not sure if this will make any difference, but I think your mutate with split and add_field can be replaced by the following dissect filter:

dissect {
    mapping => {
        "message" => "%{createdTime} ~|~ %{type} ~|~ %{typeValue} ~|~ %{message}"
    }
}

But I don't think this will make much difference.
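To sanity-check that mapping against one of the sample lines, here is the rough equivalent in Python (illustration only, not Logstash code): dissect splits on the literal " ~|~ " delimiter, spaces included, so unlike the split-based mutate the fields come out without stray surrounding whitespace.

```python
# Emulate the dissect mapping
#   "%{createdTime} ~|~ %{type} ~|~ %{typeValue} ~|~ %{message}"
# on a sample proxy_statslog line. The delimiter includes the spaces,
# so they are consumed and do not end up in the extracted fields.
sample = "2024-06-21 09:00:00,062 ~|~ Total-Inflight-Request ~|~ Inflight ~|~ 15"

fields = ["createdTime", "type", "typeValue", "message"]
event = dict(zip(fields, sample.split(" ~|~ ")))
print(event["createdTime"])  # 2024-06-21 09:00:00,062
```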

Thanks a lot. I will get an answer for you.

This problem started when I attempted to process a very fast-moving log, statistics.log, from Filebeat to Logstash to Elasticsearch.

Please note that after that I configured a slow-moving log through the same route and the index was created successfully, which suggests the issue is mainly with the fast-moving statistics.log.

Please note that this fast-moving log is zipped and rotated every hour. See below:

-rw-r--r--. 1 total total  771413 Jun 23 07:00 statistics.2024-06-23_06.log.gz
-rw-r--r--. 1 total total  874770 Jun 23 08:00 statistics.2024-06-23_07.log.gz
-rw-r--r--. 1 total total  665094 Jun 23 09:00 statistics.2024-06-23_08.log.gz
-rw-r--r--. 1 total total  632922 Jun 23 10:00 statistics.2024-06-23_09.log.gz
-rw-r--r--. 1 total total  711005 Jun 23 11:00 statistics.2024-06-23_10.log.gz
-rw-r--r--. 1 total total  669895 Jun 23 12:00 statistics.2024-06-23_11.log.gz
-rw-r--r--. 1 total total  731907 Jun 23 13:00 statistics.2024-06-23_12.log.gz
-rw-r--r--. 1 total total  686603 Jun 23 14:00 statistics.2024-06-23_13.log.gz
-rw-r--r--. 1 total total  695047 Jun 23 15:00 statistics.2024-06-23_14.log.gz
-rw-r--r--. 1 total total  655612 Jun 23 16:00 statistics.2024-06-23_15.log.gz
-rw-r--r--. 1 total total  708257 Jun 23 17:00 statistics.2024-06-23_16.log.gz
-rw-r--r--. 1 total total  731080 Jun 23 18:00 statistics.2024-06-23_17.log.gz
-rw-r--r--. 1 total total  790868 Jun 23 19:00 statistics.2024-06-23_18.log.gz
-rw-r--r--. 1 total total  736724 Jun 23 20:00 statistics.2024-06-23_19.log.gz
-rw-r--r--. 1 total total  779824 Jun 23 21:00 statistics.2024-06-23_20.log.gz
-rw-r--r--. 1 total total 1120785 Jun 23 22:00 statistics.2024-06-23_21.log.gz
-rw-r--r--. 1 total total 4073915 Jun 23 22:23 statistics.log

Additionally, when I run Filebeat in debug mode and grep for statistics, I see the following output:

-bash-4.2$ tail -f filebeat_uat_vc.log|grep statistics
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
2024-06-23T22:16:58.015-0400    DEBUG   [harvester]     log/log.go:107  End of file reached: /opt/total/logs/probe/vc/statistics.log; Backoff now.
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
      "path": "/opt/total/logs/probe/vc/statistics.log"
2024-06-23T22:16:59.920-0400    DEBUG   [harvester]     log/log.go:107  End of file reached: /opt/total/logs/probe/vc/statistics.log; Backoff now.

Also, the output stops at the above and does not continue, which it should, since the log is rolling very fast.

I tried with this option and Logstash shows the same behavior.

Please note that logstash.yml and pipelines.yml are completely commented out in my case. Only mergedlogstash_uat.conf is being referenced, and it only has input, filter and output sections, nothing more.

Thanks; I look forward to some help.