Logstash-forwarder setup... where is my data?


(Alejandro Olivan) #1

Hi again, forum...

I have set up a server to send log data to a central Logstash server... after some pain with TLS IP SAN certificates it seems to connect and do something, according to the forwarder logs:

root@nodo01:/etc/logstash-forwarder# tail /var/log/syslog
May 13 17:58:28 nodo01 logstash-forwarder[28176]: 2015/05/13 17:58:28.913715 Registrar received 3 events
May 13 17:58:41 nodo01 logstash-forwarder[28176]: 2015/05/13 17:58:41.401484 Registrar received 4 events
May 13 17:58:46 nodo01 logstash-forwarder[28176]: 2015/05/13 17:58:46.402178 Registrar received 1 events
May 13 17:58:53 nodo01 logstash-forwarder[28176]: 2015/05/13 17:58:53.901988 Registrar received 1 events
May 13 17:58:58 nodo01 logstash-forwarder[28176]: 2015/05/13 17:58:58.986889 Registrar received 2 events

The problem is that, on the receiving side, I either don't have the data, it is being dropped, or it is simply lost somewhere...

To be exact, I will paste the setups.
Here is the sending side:

{
  "network": {
    "servers": [ "172.16.0.12:5000" ],
    "ssl certificate": "/etc/logstash-forwarder/certs/notsecure.crt",
    "ssl key": "/etc/logstash-forwarder/certs/notsecure.key",
    "ssl ca": "/etc/logstash-forwarder/certs/notsecure.crt"
  },
  "files": [
    {
      "paths": [ "/var/log/shoutcast/stream1.w3c.log" ],
      "fields": {
        "type": "logs",
        "stream_type": "Shoutcast-1.9.8",
        "cluster_node": "nodo01.cloud01",
        "cluster_stream": "stream1"
      }
    },
    {
      "paths": [ "/var/log/shoutcast/stream2.w3c.log" ],
      "fields": {
        "type": "logs",
        "stream_type": "Shoutcast-1.9.8",
        "cluster_node": "nodo01.cloud01",
        "cluster_stream": "stream2"
      }
    }
  ]
}

On the receiving side, I have this in the input config file:
#logstash-forwarder
input {
  lumberjack {
    port => 5000
    type => "logs"
    ssl_certificate => "/etc/logstash/certs/notsecure.crt"
    ssl_key => "/etc/logstash/certs/notsecure.key"
  }
}

I have an early filter file that "tags" inputs before the actual filter/grok files.
I added a new filter in it:

filter {
  if [stream_type] == "Shoutcast-1.9.8" {
    mutate {
      add_tag => "Shoutcast-w3c"
    }
  }
}

And finally my filter file:

filter {
  if "Shoutcast-w3c" in [tags] {

    grok {
      match => [ "message", "%{IP:src_ip} %{IP:src_dns} (?<mytimestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) %{NOTSPACE:str$
    }

    mutate {
      gsub => [
                "stream", "\/stream\?title=", ""
              ]
    }

    urldecode {
      field => "stream"
    }
    urldecode {
      field => "user_agent"
    }

    date {
      match => [ "mytimestamp", "YYYY-MM-dd HH:mm:ss,SSS" ]
      locale => "en"
      add_tag => [ "tsmatch" ]
    }

    mutate {
      #remove_field => [ "message" ]
      remove_tag => [ "_grokparsefailure" ]
    }

    ......
}

The problem is, obviously, that I get no usable data...

Although querying Elasticsearch from the CLI does return some results, as if something partially landed in the database... like this:

curl -XGET 'http://localhost:9200/_all/_search?q=stream_type:Shoutcast-1.9.8&pretty=true'

{
"took" : 12,
"timed_out" : false,
"_shards" : {
"total" : 10,
"successful" : 10,
"failed" : 0
},
"hits" : {
"total" : 927,
"max_score" : 12.42277,
"hits" : [ {
"_index" : "logstash-2015.05.13",
"_type" : "logs",
"_id" : "cwKGoMe1RX6bpks0Wgkatg",
"_score" : 12.42277,
"_source":{"@version":"1","@timestamp":"2015-05-13T15:10:36.400Z","type":"logs","file":"/var/log/shoutcast/stram2.w3c.log","host":"nodo01","offset":"1892629","stream_type":"Shoutcast-1.9.8","cluster_node":"nodo01.cloud01","cluster_stream":"stream2","tags":["Shoutcast-w3c"],"stream":null,"user_agent":null}
}, {
..........
...... 5 more hits (do they belong to every "shard"??? )......
..........

The problem is that I have lost the fields I got when accessing the log file locally... the grok filter worked there, but now there is no usable data.
I notice I don't seem to have my actual "message" in the results... just a kind of header with no actual data.
In Kibana, new filter fields have appeared, such as stream_type and so on... the ones I defined on the logstash-forwarder side.

Does anyone have a clue?

Thank you for your patience!

Best regards!


(Magnus Bäck) #2

To avoid confusion and wild goose chases, disable the elasticsearch output for now and just use a stdout output. Then you'll see exactly when events arrive and exactly what they look like. You might want to disable your filters until you understand more of what's going on.
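A minimal sketch of what I mean (the rubydebug codec just pretty-prints each event as it arrives, so you can see all fields and tags):

```
output {
  stdout {
    codec => rubydebug
  }
}
```

Once the events look right on stdout, re-enable the elasticsearch output.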


(Alejandro Olivan) #3

Aha... clearly this would help.

Actually, part of the problem comes from the fact that tons of firewall / Suricata stuff arrive at the server and clutter everything...
So I would have to "disable" the other inputs while switching the output from Elasticsearch to stdout...
This will of course make things simpler and easier to debug...

But... wait... again, your comments make me think: would it be possible to selectively send something to stdout? I wonder if this is possible or has been tried... it would be great for testing.
Something using "if [mytestTag] then output to stdout"...
I think I have read about people using "if _grokparsefailure not in [tags] then output to elasticsearch"...
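Something like this is what I have in mind, untested, and assuming my "Shoutcast-w3c" tag is what I filter on:

```
output {
  if "Shoutcast-w3c" in [tags] {
    stdout { codec => rubydebug }
  } else {
    elasticsearch { host => "localhost" }
  }
}
```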

Tomorrow I have lots of experiments to do!!!

Thank you very much!!!


(Alejandro Olivan) #4

I found a mismatch in the grok filter!!!
I can't explain why, but the shipped logs differ in one field from the logs read locally from the file... as I sent everything coming from the forwarder to stdout, as suggested... voila! I could spot the entries flowing in and realize what the issue was!

Great!!! SOLVED!!!!

