Hello,
for the past few days I have been trying to solve a problem with a failing Logstash pipeline.
I am downloading an IP blocklist using http_poller. It contains approximately 30k IP addresses.
I want to set the _id field using the fingerprint filter plugin (that part works well). However, Logstash crashes for an unknown reason. In normal logging mode I am able to load only about 2,000 messages, but in full debug mode I get around 28k. Please see the configuration below.
Config file:
input {
  http_poller {
    urls => {
      blocklist_de_all => "http://lists.blocklist.de/lists/all.txt"
    }
    request_timeout => 30
    tags => ["blocklist"]
    codec => "line"
    validate_after_inactivity => 200
    schedule => { cron => "*/30 * * * *" }
    metadata_target => "feed_metadata"
  }
}

filter {
  split {
    field => "[message]"
  }
  if [message] =~ /^#/ {
    drop {}
  } else {
    grok {
      match => { "message" => "^%{GREEDYDATA:ipv4address}" }
    }
  }
  geoip {
    source => "ipv4address"
    add_tag => [ "ipv4enriched" ]
    add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
    add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
  }
  mutate {
    convert => [ "[geoip][coordinates]", "float" ]
  }
  fingerprint {
    id => "blocklist1"
    source => [ "ipv4address" ]
    method => "SHA512"
    add_tag => [ "fingerprinted" ]
  }
}

output {
  elasticsearch {
    hosts => ["10.0.50.51:9200"]
    index => "ipv4_to_block"
    document_id => "%{fingerprint}"
    document_type => "default"
  }
}
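As a side note: the `^%{GREEDYDATA:ipv4address}` pattern copies the whole line into `ipv4address` without any validation. If stricter matching is wanted, grok ships a standard `IP` pattern (matching both IPv4 and IPv6); a minimal sketch of the same filter with only the pattern swapped:

```
grok {
  # %{IP} only matches well-formed addresses; malformed lines get _grokparsefailure
  match => { "message" => "^%{IP:ipv4address}$" }
}
```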
Pipeline config:
- pipeline.id: blocklist_ips
path.config: "/etc/logstash/conf.d/blocklist_de_all_low_confidence.conf"
pipeline.workers: 16
The resulting documents in Elasticsearch look correct (output from Kibana):
{
  "_index": "ipv4_to_block",
  "_type": "default",
  "_id": "1eda4277c8b054652a08b0f56f26656babbe8328",
  "_version": 1,
  "_score": 1,
  "_source": {
    "fingerprint": "1eda4277c8b054652a08b0f56f26656babbe8328",
    "@version": "1",
    "metadata": {
      "host": "elk2",
      "name": "blocklist_de_all",
      "request": {
        "method": "get",
        "url": "http://lists.blocklist.de/lists/all.txt"
      },
      "code": 200,
      "response_message": "OK",
      "runtime_seconds": 0.115168,
      "times_retried": 0,
      "response_headers": {
        "connection": "keep-alive",
        "content-type": "text/plain; charset=UTF-8",
        "transfer-encoding": "chunked",
        "date": "Tue, 28 Aug 2018 14:16:55 GMT",
        "last-modified": "Tue, 28 Aug 2018 14:14:10 GMT",
        "cache-control": "public",
        "x-frame-options": "sameorigin",
        "keep-alive": "timeout=20",
        "server": "nginx/1.12.2",
        "etag": "W/\"6550b-5747f74a5da2f\""
      }
    },
    "tags": [
      "blocklist",
      "_geoip_lookup_failure",
      "fingerprinted"
    ],
    "ipv4address": "103.115.180.188",
    "@timestamp": "2018-08-28T14:21:00.411Z",
    "message": "103.115.180.188",
    "geoip": {}
  },
  "fields": {
    "@timestamp": [
      "2018-08-28T14:21:00.411Z"
    ]
  }
}
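One detail worth noting in this output: the stored _id/fingerprint is 40 hex characters long, which is the length of a SHA-1 digest; a SHA-512 hex digest is 128 characters, so the `method => "SHA512"` setting may not be taking effect as expected. A quick shell check of the digest lengths, using the sample IP from the document above:

```shell
# SHA-512 hex digests are 128 characters long,
# while the _id stored above is only 40 (the length of SHA-1).
printf '%s' "103.115.180.188" | sha512sum | awk '{print length($1)}'   # prints 128
printf '%s' "103.115.180.188" | sha1sum   | awk '{print length($1)}'   # prints 40
```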
I enabled debug logging at runtime through the logging API:
curl -XPUT 'localhost:9600/_node/logging?pretty' -H 'Content-Type: application/json' -d'
{
  "logger.logstash.agent" : "DEBUG",
  "logger.logstash.api.service" : "DEBUG",
  "logger.logstash.codecs.json" : "DEBUG",
  "logger.logstash.codecs.line" : "DEBUG",
  "logger.logstash.codecs.plain" : "DEBUG",
  "logger.logstash.config.source.local.configpathloader" : "DEBUG",
  "logger.logstash.config.source.multilocal" : "DEBUG",
  "logger.logstash.config.sourceloader" : "DEBUG",
  "logger.logstash.configmanagement.extension" : "DEBUG",
  "logger.logstash.filters.drop" : "DEBUG",
  "logger.logstash.filters.grok" : "DEBUG",
  "logger.logstash.filters.split" : "DEBUG",
  "logger.logstash.inputs.http_poller" : "DEBUG",
  "logger.logstash.instrument.periodicpoller.deadletterqueue" : "DEBUG",
  "logger.logstash.instrument.periodicpoller.jvm" : "INFO",
  "logger.logstash.instrument.periodicpoller.os" : "DEBUG",
  "logger.logstash.instrument.periodicpoller.persistentqueue" : "DEBUG",
  "logger.logstash.modules.scaffold" : "DEBUG",
  "logger.logstash.modules.xpackscaffold" : "DEBUG",
  "logger.logstash.monitoringextension" : "DEBUG",
  "logger.logstash.monitoringextension.pipelineregisterhook" : "DEBUG",
  "logger.logstash.outputs.elasticsearch" : "DEBUG",
  "logger.logstash.outputs.file" : "DEBUG",
  "logger.logstash.pipeline" : "INFO",
  "logger.logstash.plugins.registry" : "DEBUG",
  "logger.logstash.runner" : "DEBUG",
  "logger.org.logstash.Logstash" : "DEBUG",
  "logger.org.logstash.common.DeadLetterQueueFactory" : "DEBUG",
  "logger.org.logstash.common.io.DeadLetterQueueWriter" : "DEBUG",
  "logger.org.logstash.config.ir.CompiledPipeline" : "DEBUG",
  "logger.org.logstash.instrument.metrics.gauge.LazyDelegatingGauge" : "DEBUG",
  "logger.org.logstash.plugins.pipeline.PipelineBus" : "DEBUG",
  "logger.org.logstash.secret.store.SecretStoreFactory" : "DEBUG",
  "logger.slowlog.logstash.codecs.json" : "DEBUG",
  "logger.slowlog.logstash.codecs.line" : "DEBUG",
  "logger.slowlog.logstash.codecs.plain" : "DEBUG",
  "logger.slowlog.logstash.filters.drop" : "DEBUG",
  "logger.slowlog.logstash.filters.grok" : "DEBUG",
  "logger.slowlog.logstash.filters.split" : "DEBUG",
  "logger.slowlog.logstash.inputs.http_poller" : "DEBUG",
  "logger.slowlog.logstash.outputs.elasticsearch" : "DEBUG",
  "logger.slowlog.logstash.outputs.file" : "DEBUG"
}
'
Why does Logstash load less data when it is not in full debug mode?
Why does this configuration fail at all?
Is it because it cannot handle the load of hashing ~30k entries, or is it failing on the local GeoIP database lookup?
Thank you in advance for your help.