Recommended JVM settings?

These are the settings I gave Logstash:

-Xms2g
-Xmx4g

The machine has 8 cores and 24 GB of RAM.

My problem is that I've only added about 10-15 machines shipping Auditbeat and Filebeat data, and all the cores are already above 80% usage, with some spiking to 100% every few seconds.

Should I change the JVM settings to something else? I don't really mind if processing takes a while; I just want to go easy on the machine while not having Logstash crash.

Thanks ahead!

From the documentation:

The recommended heap size for typical ingestion scenarios should be no less than 4GB and no more than 8GB.

And

Set the minimum (Xms) and maximum (Xmx) heap allocation size to the same value to prevent the heap from resizing at runtime, which is a very costly process.

Since you are already using 4GB, try to set both Xms and Xmx to 6GB and see if you have any improvement.
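For example, in Logstash's config/jvm.options that would look like the snippet below (6 GB is just the suggestion from above; adjust it to what your machine can spare):

```
## jvm.options
## Set min and max heap to the same value so the heap never resizes at runtime
-Xms6g
-Xmx6g
```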

But your high CPU usage is more likely related to your pipelines than to the heap size.

What does your pipeline look like, and what filters are you using? Can you share your pipelines?

And is this machine dedicated to Logstash, or do you have anything else running on it?
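Also, since you said you don't mind slower processing, you can cap how much CPU Logstash uses by lowering the number of pipeline workers in logstash.yml; by default Logstash starts one worker per CPU core. A sketch, with values purely for illustration:

```yaml
# logstash.yml
# Fewer workers means fewer busy cores; throughput drops accordingly.
pipeline.workers: 4
# Batch size per worker; 125 is the default, smaller reduces memory pressure.
pipeline.batch.size: 125
```

Throughput will go down, but events will queue at the Beats input rather than crashing Logstash.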


Thank you so much for the detailed response!

This is what my Logstash config looks like; I'm not sure if it's the most resource-efficient. I've added comments for each section:

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate_authorities => ["/etc/elk/certs/ca/ca.crt"]
    ssl_certificate => "/etc/elk/instances/instances.crt"
    ssl_key => "/etc/elk/instances/logs.pkcs8.key"
    ssl_key_passphrase => "*******"
    ssl_verify_mode => "force_peer"
  }
}
filter {
  mutate {
    add_tag => [ "insidefilter" ]
  }

  # Do the following parsing for MongoDB logs
  if [log][file][path] =~ "mongo.*\.log$" {
    grok {
      match => { "message" => "\A%{TIMESTAMP_ISO8601} I ACCESS   %{NOTSPACE} SCRAM-SHA-1 authentication failed for %{USER:User} on %{USER:DB} from client %{SYSLOGHOST:From}:%{INT:Port} ; %{GREEDYDATA:Reason}" }
    }
    mutate { add_tag => [ "mongosIfStatement" ] }
  }

  # Do the following parsing for MySQL logs
  if [log][file][path] =~ "mysql.*\.log$" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:Date} %{INT:Number} \[Note] Access denied for user '%{USER:User}'@'%{SYSLOGHOST:IP}' \(using password: %{WORD:Password}%{GREEDYDATA}" }
    }
    mutate { add_tag => [ "mysqldIfStatement" ] }
  }

  # Do the following parsing for any .json documents
  if [log][file][path] =~ /\.json$/ {
    json {
      source => "message"
    }
  }

  # Do the following for a custom file I made called "commands.log".
  # This is the command history of all remote servers.
  if [log][file][path] =~ "commands.log" {
    grok {
      match => { "message" => "\A%{SYSLOGTIMESTAMP:sys_timestamp} %{NOTSPACE:Hostname} %{USER:Logged}: USER=%{USER:User} PWD=%{UNIXPATH:Directory} PID=\[%{INT:PID}] CMD=\"%{DATA:Command}\" Exit=\[%{INT:Exit}\] CONNECTION=%{GREEDYDATA:Connection}" }
      match => { "message" => ["\[(%{TIMESTAMP_ISO8601:sys_timestamp})\]\s(?<Hostname>[0-9a-zA-Z_-]+)\s(?<Logged>[0-9a-zA-Z_-]+)\:USER=(?<User>[0-9a-zA-Z_-]+)\sPWD=(?<Directory>[0-9a-zA-Z_/-]+)\sPID=\[(?<PID>[0-9]+)\]\sCMD=\"(?<Command>.*)\"\sExit=\[(?<Exit>[0-9]+)\]\sCONNECTION=(?<Connection>.*)", "\A%{SYSLOGTIMESTAMP:sys_timestamp} %{NOTSPACE:Hostname} %{USER:Logged}: USER=%{USER:User} PWD=%{UNIXPATH:Directory} PID=\[%{INT:PID}] CMD=%{QUOTEDSTRING:Command} Exit=\[%{INT:Exit}] CONNECTION=%{GREEDYDATA:Connection}"] }
      match => { "message" => "\A%{SYSLOGTIMESTAMP:sys_timestamp} %{HOSTNAME:Hostname} %{USER:Logged}: USER=%{USER:User} PWD=%{UNIXPATH:Directory} PID=\[%{INT:PID}] CMD=%{QUOTEDSTRING:Command} Exit=\[%{INT:Exit}] CONNECTION=%{GREEDYDATA:Connection}" }
    }
  }
}
output {

  # If our honeypot (called cowrie) gets someone, send me an email
  if [log][file][path] =~ "cowrie.json" {
    if [message] =~ "^New connection" {
      email {
        from => 'honeypot-entries@companyname-elk.net'
        to => 'john@companyname.net'
        subject => 'Honeypot Alert'
        body => "Someone interacted with the honeypot!\nDetails: %{message}\nClick here to view the dashboard."
        domain => 'mail.company.net'
        port => 25
      }
    }
  }

  if [@metadata][pipeline] {
    elasticsearch {
      hosts => "https://localhost:9200"
      manage_template => false
      cacert => "/etc/elasticsearch/estackcap12extract.crt"
      ssl => true
      ssl_certificate_verification => false
      index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
      pipeline => "%{[@metadata][pipeline]}"
      user => "elastic"
      password => "******"
    }
  } else {
    elasticsearch {
      # manage_template => false
      hosts => ["https://localhost:9200"]
      cacert => "/etc/elasticsearch/estackcap12extract.crt"
      ssl => true
      ssl_certificate_verification => false
      index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
      #index => "cleandata"
      user => "elastic"
      password => "********"
    }
  }
}

Unfortunately, I've deployed Logstash, Elasticsearch, and Kibana all on one machine, and now I'm starting to feel like that was a huge mistake.