Logstash SWAP OOM

Over the past months we have installed the ELK stack, following the Elastic documentation. We are currently using Logstash to push approximately 15 logs into our Elasticsearch indexes, and we hope to eventually push all of our approximately 100 logs into different indexes.

However, we are running into problems: swap usage reaches around 1 GB on most, if not all, of our Logstash instances once an instance has been running for more than about 24 hours, resulting in crashes.
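For reference, per-process swap usage can be confirmed on Linux by reading the VmSwap field from /proc; a minimal sketch, assuming the instances match pgrep -f logstash:

# Print the VmSwap line from /proc/<pid>/status for each Logstash JVM
for pid in $(pgrep -f logstash); do
    printf '%s: ' "$pid"
    grep VmSwap "/proc/$pid/status"
done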

Except for one config that has 4 file inputs, each of our Logstash .conf files has a single file input.

We have one Logstash "installation" running ~10 Logstash instances, each with a different "--path.data" and its own config path passed with -f:

$>./logstash -f /mypath/myconfig.conf --path.data=/mypath/logstash-8.6.2/data-appname
Example Config
input {
    file {
        path => "/our/path/server.log"
        start_position => "end"
        stat_interval => "180 second"

        codec => multiline {
              pattern => "%{TIMESTAMP_ISO8601}\s+"
              negate => "true"
              what => "previous"
        }
    }
}
filter {
    grok {
        match => {"message" => "%{TIMESTAMP_ISO8601:logTime}\s+%{LOGLEVEL:logLevel}\s+\[%{DATA:category}\]\s+(%{DATA:thread})\s+\[%{DATA:class}\]\[%{DATA:id}\]:\s+%{DATA:message}"}
        add_tag => ["log"]
        overwrite => ["message"]
    }
    if "LOGSTASH" in [message] {
            grok {
                match => {"message" => "LOGSTASH\s+\[ProcessorMonitor{process=%{NUMBER:process:int},\s+outputProcess=%{NUMBER:outputProcess:int},\s+storeToAtmos=%{NUMBER:storeToAtmos:int},\s+exchangeLookup=%{NUMBER:exchangeLookup:int},\s+write=%{NUMBER:write:int},\s+parse=%{NUMBER:parse:int},\s+inputFileExtractor=%{NUMBER:inputFileExtractor:int},\s+documentTypeDetection=%{NUMBER:documentTypeDetection:int},\s+fileSplitting=%{NUMBER:fileSplitting:int},\s+duplicateControl=%{NUMBER:duplicateControl:int},\s+generateAndStoreAttachment=%{NUMBER:generateAndStoreAttachment:int},\s+mergeAttachment=%{NUMBER:mergeAttachment:int},\s+convertToTiff=%{NUMBER:convertToTiff:int},\s+distributionSender=%{NUMBER:distributionSender:int}}\]\s+\[TransactionStatus{status=%{DATA:transactionStatus},\s+subStatus=%{DATA:transactionSubStatus}}\]\s+\[%{DATA:channel}\]\s+\[%{DATA:senderId}\]\s+\[%{DATA:recipientId}\]"}
                add_tag => ["stats"]
                remove_tag => ["log"]
            }
    }
    date {
        match => ["logTime", "YYYY-MM-dd;HH:mm:ss.SSS", "ISO8601"]
        timezone => "Europe/Oslo"
        target => ["logTime"]
    }
}
output {
    if "log" in [tags] {
        elasticsearch {
            hosts => ["https://*****.no:9200"]
            user => "logstash_internal"
            password => "*****"
            cacert => "<CERT LOCATION>"
        }
    }
    if "stats" in [tags] {
        elasticsearch {
            hosts => ["https://*****.no:9200"]
            user => "logstash_internal"
            password => "*****"
            cacert => "<CERT LOCATION>"
        }
    }
}

Questions

  • What SWAP usage should we expect?
  • Is it normal to limit SWAP usage via launch options?
  • (If yes) How do we adjust the SWAP size?
  • Are we doing something very wrong with our setup?
  • If we have 100 logs, is it normal to have approx. 100 logstash instances running with different "--path.data"?

Can you provide more context about how you are running Logstash? What do you mean by installation and instances?

Are you running 10 Logstash processes on the same server? What are the server's resources (RAM and CPU)? Is there any particular reason to run Logstash like that?

I would say that running multiple instances of Logstash on the same server is a bad approach. Each instance is a separate JVM, and by default Logstash's jvm.options allocates 1 GB of heap, so 10 instances running at the same time need 10 GB of memory for the JVM heaps alone. On top of that, your server would need roughly the same amount of free memory again, so you would need a server with at least 20 GB of memory.
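For reference, the heap defaults in Logstash's config/jvm.options look like this (a sketch; exact contents vary by version):

# Default JVM heap in config/jvm.options: 1 GB minimum and maximum
-Xms1g
-Xmx1g

Lowering these values shrinks each instance, but consolidating the instances is the better fix.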

The best approach would be to run one instance of Logstash as a service and configure it to run multiple pipelines.
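For example, a pipelines.yml along these lines would replace the separate processes (the pipeline IDs and the second config path below are placeholders, not your actual ones):

# config/pipelines.yml: one Logstash process, one JVM, many pipelines
- pipeline.id: appname
  path.config: "/mypath/myconfig.conf"
- pipeline.id: otherapp
  path.config: "/mypath/otherconfig.conf"

Each pipeline keeps its own config, but they all share a single JVM and a single data directory, so the separate "--path.data" workaround is no longer needed.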


Thank you for your answer.

You are correct; we are now working on using multiple pipelines instead of multiple Logstash instances. The reason we had multiple instances was simply that we didn't know any better.
