Improve performance of Logstash data loading into ES

Hi,

I am trying to load data into Elasticsearch using Logstash, but it's loading very slowly: around 80,000/minute into Elasticsearch.
It's taking 10 minutes for 11 GB of data, and I have a requirement to reduce this to 10 seconds.
Please help me with what changes I need to make in ELK and in the system hardware.
The following is my Logstash config:

input {
  file {
    type => "engine"
    path => "Path/*"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    ignore_older => 0
    codec => multiline {
      pattern => "^qns|^lb0"
      negate => true
      what => "previous"
    }
  }
}

filter {
  if [type] == "engine" {
    mutate {
      gsub => ["message", "\n", ""]
    }
    mutate {
      gsub => ["message", "\r", ""]
    }
    mutate {
      gsub => ["message", "\t", ""]
    }
    mutate {
      gsub => ["message", "=", ""]
    }
    grok {
      break_on_match => false
      match => ["message", "%{WORD:VM} \[%{TIMESTAMP_ISO8601:Timestamp}\] %{GREEDYDATA:Record}",
                "Record", "POLICY RESULT ERROR: %{DATA:Policy_Result_Error}session action %{GREEDYDATA:Remaining}"]
    }
    mutate {
      add_field => { "Policy_Result_Error_Analyzed" => "%{Policy_Result_Error}" }
      remove_field => ["message", "Remaining"]
    }
    date {
      match => [ "Timestamp", "yyyy-MM-dd HH:mm:ss,SSS" ]
    }
  }
}

output {
  if [type] == "engine" {
    stdout { codec => rubydebug }
    elasticsearch {
      hosts => "IP:9200"
      index => "hathway_test_engine"
      workers => 40
      flush_size => 1000
    }
  }
}

I have a 3-node cluster in AWS.
The hardware config of each node is as follows:
224 GB RAM, 30 GB hard disk, 32 cores

I made the following changes in /etc/default/logstash:

# Arguments to pass to logstash agent
LS_OPTS="-b 1000 -w 40"

# Arguments to pass to java
LS_HEAP_SIZE="128g"

LS_JAVA_OPTS="-Djava.io.tmpdir=$HOME"
LS_JAVA_OPTS="-Xmx100g -Xms4G"

and also in bin/logstash:

# LS_HEAP_SIZE="xxx" size for the -Xmx${LS_HEAP_SIZE} maximum Java heap size option, default is "1g"
LS_HEAP_SIZE="128g"

# LS_JAVA_OPTS="xxx" to append extra options to the defaults JAVA_OPTS provided by logstash
LS_JAVA_OPTS="-Xmx100g -Xms4g"

But I still don't see any improvement in Logstash performance.

The following are screenshots of CPU/memory/IO while I was running Logstash:

[Screenshots of CPU, memory, and iostat usage for Instance-1, Instance-2, and Instance-3 while Logstash was running]

How many files is your data spread across? What does your data look like? Which version of Logstash and Elasticsearch are you using?

It also looks to me like you are deleting fields that are never created. Does this indicate that what you have posted is only a partial config?

Hi Christian,

Thanks for your quick reply.
I am using version 2.4 of both Logstash and ES.
I have 11 files in total, each 1 GB, and the remove_field was a mistake; I have since fixed it.

My log is a multiline log; it looks like the following:

qns03 [2017-03-17 16:25:25,315] ===============================================
POLICY RESULT ERROR: :Login-User_SUSPENDED
session action = Create
credential = sk.157
domainId = Default
subscriberId = 00235000e4b0dc4458bf6c6b
SERVICES: FUP-D5M-U5M-T-D1M-U1M
TRIGGER: com.broadhop.radius.messages.impl.RadiusAccessRequestMessage request:
NAS-PORT = 0
CISCO-AVPAIR = client-mac-address=e46f.13c1.f346
NAS-PORT-ID = 0/3/0/505
FRAMED-PROTOCOL = 1
NAS-PORT-TYPE = 5
CALLING-STATION-ID = e4-6f-13-c1-f3-46
NAS-IP-ADDRESS = 202.88.216.192
USER-NAME = sk.157
SERVICE-TYPE = 2
USER-PASSWORD = xxxxxxxxx
ACCT-SESSION-ID = 02413493
NAS-IDENTIFIER = Noida_ISG-2.hathway
DEBUG MSGS:
INFO : (core) Tagging message with ID: com.broadhop.radius.impl.devicemanager.IsgNetworkDeviceManager
INFO : (radius) RADIUS device group assigned to: Noida ISG-2
INFO : (radius) Checking for duplicates by MAC Address E46F.13C1.F346
INFO : (core) Start session triggered
INFO : (core) Stop session triggered
INFO : (FTTX Policy) FTTX NASPORTID 0/3/0/505
INFO : (auth) TAL not attempted no TAL credential found for user
INFO : (auth) Success USUM_AUTHORIZATION
INFO : (core) Switching credential id to sk.157 for session
INFO : (service) Balance processing enabled for subscriber
INFO : (action) Updated credential: E46F.13C1.F346, with last used date avp
ERROR : (balance) OneTimeUsageCharge has a source field of: RadiusAccountingTotalBytesRetriever which returned a non-numeric value: null
INFO : (balance) Error found, rolling back transaction
ERROR : (core) Error processing policy request: :Login-User_SUSPENDED
ERROR : (core) Error processing policy request: :Login-User_SUSPENDED
SENT MESSAGES (asynchronous):
Thanks,
Uday.K

Given that you have a few potentially CPU intensive gsub and grok filters in your config, I am surprised that Logstash is not using more CPU. Might it be that you are limited by the performance of the file input plugin? You might be able to check this by setting up a separate file input for each file and see if that makes any difference.
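For example, a sketch of such a test config, with one file input block per file (the file names below are placeholders; substitute your real ones):

input {
  file {
    type => "engine"
    path => "Path/file1.log"         # placeholder name
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^qns|^lb0"
      negate => true
      what => "previous"
    }
  }
  file {
    type => "engine"
    path => "Path/file2.log"         # placeholder name
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^qns|^lb0"
      negate => true
      what => "previous"
    }
  }
  # ...one more file block for each of the remaining files
}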

It's taking 10 minutes for 11 GB of data, and I have a requirement to reduce this to 10 seconds.

That's more than 1 GB per second. That's... a lot for a three-node ES cluster.


Hi Magnus,

Thanks for your quick response.

Can we reduce the loading time to less than 10 minutes?

If yes, what changes (either in hardware or config) do I need to make for that?

Hi Christian,

Thanks for your response,

You mean create 10 inputs for the 10 files in the Logstash configuration and test the CPU usage?

Can we reduce the loading time to less than 10 minutes?

Maybe. What's your current bottleneck? For example, if you have nothing but your file input in your Logstash configuration, what event rate are you getting?
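As a rough sketch of how you could measure that baseline: keep only the file input and multiline codec, drop the other filters and outputs, and let the metrics filter print the event rate (rate_1m is a per-second rate averaged over one minute):

input {
  file {
    type => "engine"
    path => "Path/*"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^qns|^lb0"
      negate => true
      what => "previous"
    }
  }
}
filter {
  # count events and periodically emit a tagged metric event
  metrics {
    meter => "events"
    add_tag => "metric"
  }
}
output {
  # print only the metric events, not the data itself
  if "metric" in [tags] {
    stdout {
      codec => line { format => "1m rate: %{[events][rate_1m]}" }
    }
  }
}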

How many input files do you have? If the number of files is too small it's hard to parallelize reading them, and then it'll be hard to improve performance. Ultimately you might be at the mercy of the multiline codec and its configuration.

Yes, I would try creating a separate file input block per file to see if that makes a difference.

Ah yes—you definitely want to have one file input per file so that Logstash at least parallelizes the reading of each file instead of reading from them serially.

Hi Christian/Magnus,

Thanks for your response.

If we give a separate input for every file on the same path, how is this going to work in real time, when the path keeps getting updated with new files?
How can we read them automatically without manual intervention?

If we give a separate input for every file on the same path, how is this going to work in real time, when the path keeps getting updated with new files?

Well, in that case you have to generate a new configuration file and reload Logstash.

If you want the fastest possible ingestion of files you shouldn't use wildcards in the file input's path option.

I did not necessarily suggest this as a final solution, but rather as a test to see if the level of parallelism for inputs is an issue.

A little more optimization, if you're looking for it...

  • let Filebeat handle the multiline grouping
  • let Logstash skip multiline parsing entirely and use more workers on LS (see the sketch below)
  • increase the batch size / workers on the ES output [ as long as ES can take it :o - dangerous ]
  • let Filebeat handle the filename regex [ like toload.*.log for toload.1.log etc.... ]

That will be a one-time setup, and you can keep adding new files and removing the older ones.
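A sketch of the Logstash side of that setup, assuming Filebeat does the multiline grouping and ships whole records (the port is a placeholder and has to match Filebeat's output config):

input {
  beats {
    port => 5044                     # placeholder; match Filebeat's logstash output
  }
}
filter {
  # keep the existing mutate/grok/date filters here; no multiline codec needed
}
output {
  elasticsearch {
    hosts => "IP:9200"
    index => "hathway_test_engine"
    workers => 40
    flush_size => 5000               # example of a larger batch; raise with care, as noted above
  }
}

With the multiline work moved to Filebeat, the extra pipeline workers (-w) can then be spent on the grok/mutate filters instead.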

Apart from what Jay said, one additional thing is to make sure the index's shards are distributed across the different nodes.
That way all 3 nodes are utilized and process in parallel, instead of having all the primary shards on the same node.

Hi Jay/Ranjith,

Thanks for your reply.

I will try that.

Thanks,
Uday.K
