filebeat config:
filebeat:
  prospectors:
    -
      paths:
        - /apps/logs/All.log
      input_type: log
      document_type: prod
      scan_frequency: 5s
      harvester_buffer_size: 32768
      force_close_files: true
  publish_async: true
output:
  logstash:
    hosts: ["logstash01.corp.pvt:5044", "logstash01.corp.pvt:5052"]
    worker: 16
    loadbalance: true
shipper:
logging:
  to_files: false
  files:
    path: /apps/logs/tibco/logstash
    name: mybeat
    keepfiles: 7
logstash config:
input {
  beats {
    port => 5050
  }
}
filter {
  metrics {
    meter => "events"
    add_tag => "metric"
    flush_interval => 60
  }
}
output {
  if "metric" in [tags] {
    stdout {
      codec => line {
        format => "rate: %{[events][rate_1m]}"
      }
    }
  }
  null {}
}
sample of results:
rate: 4729.901062624514
rate: 4864.084502204622
rate: 4800.204394300626
I feel this is very slow, and I have tried many modifications to improve performance, to no avail. I have also run this against localhost to take the network out of the mix, and sent the output to /dev/null to take disk I/O out of the mix. Nothing seems to help. The following are some of the filebeat changes I tried:
compression: 0, 1, 3, 5, 9
harvester: default, 262144, 512288
async: off, on
max_bulk_size: default, 3000, 4096, 12288
scan_frequency: 0, 1, 5, 15
workers: 8, 16 (the logstash side runs on a 16-core physical machine)
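As a rough sketch, one combination from the list above would look something like this in the filebeat config. The key names `compression_level` and `bulk_max_size` are assumptions about which Filebeat options "compression" and "max_bulk_size" refer to; they are not copied from the config shown earlier, and the values are just one pick from each list.

```yaml
# Sketch of one tested combination. Key names marked "assumed" are not
# taken from the original config and may differ between Filebeat versions.
filebeat:
  prospectors:
    -
      paths:
        - /apps/logs/All.log
      input_type: log
      scan_frequency: 1s             # tried 0, 1, 5, 15
      harvester_buffer_size: 262144  # tried default, 262144, 512288
  publish_async: true                # tried off and on
output:
  logstash:
    hosts: ["logstash01.corp.pvt:5044"]
    worker: 16                       # tried 8 and 16
    loadbalance: true
    compression_level: 3             # assumed key for "compression"; tried 0, 1, 3, 5, 9
    bulk_max_size: 4096              # assumed key for "max_bulk_size"; tried default, 3000, 4096, 12288
```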
Am I missing something major? Do those results mean ~4500 lines per minute, or is that a per-second rate averaged over a minute? I have read about people achieving 18k lines per second or more and cannot get anywhere near that. Here is an example of a piece of the log:
<ns0:Description><?xml version="1.0" encoding="UTF-8"?>
<tns:validateAddAndGeo xmlns:tns="http://session.com/">
<validateAddRequest>
<userId>burns</userId>
<crisId>12345</crisId>
<applicationId>Aggregator</applicationId>
<trackingId/>
<streetAddressLine1>742 Evergreen Terrace</streetAddressLine1>
<streetAddressLine2/>
<city>Springfield</city>
<state>IDK</state>
<zipCode>
**********
</zipCode>
</validateAddRequest>
</tns:validateAddAndGeo></ns0:Description>
</ns0:DebugLogRecord>
Any help is appreciated, as I'm kind of stuck.