Hi, I am new to Logstash and have been doing some reading and grok debugging. My goal is simple: I have a Linux box running Logstash, and I want it to receive syslog messages from all my systems and forward them to Google BigQuery.
Below is my configuration (some sensitive details left out or masked):
input {
  syslog {
    port => 10514
    grok_pattern => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program} %{GREEDYDATA:message}"
  }
}
output {
  google_bigquery {
    project_id => "*****"
    dataset => "logs"
    csv_schema => "syslog_timestamp:TIMESTAMP,syslog_hostname:STRING,syslog_program:STRING,message:STRING"
    error_directory => "/tmp/bigquery-errors"
    date_pattern => "%Y-%m-%dT%H:00"
    flush_interval_secs => 30
    skip_invalid_rows => true
  }
}
Here is the error I am trying to troubleshoot:
Sep 12 10:34:29 server1 logstash[41815]: [2023-09-12T05:34:29,448][INFO ][logstash.outputs.googlebigquery][main] Publishing 2 messages to logstash_2023_09_12T05_00
Sep 12 10:34:30 server1 logstash[41815]: [2023-09-12T05:34:30,024][WARN ][logstash.outputs.googlebigquery][main] Error while inserting {:key=>0, :location=>"log", :message=>"no such field: log.", :reason=>"invalid"}
Sep 12 10:34:30 server1 logstash[41815]: [2023-09-12T05:34:30,025][WARN ][logstash.outputs.googlebigquery][main] Error while inserting {:key=>1, :location=>"version", :message=>"no such field: version.", :reason=>"invalid"}
Sep 12 10:34:30 server1 logstash[41815]: [2023-09-12T05:34:30,026][INFO ][logstash.outputs.googlebigquery][main] Problem data is being stored in: /tmp/bigquery-errors/logstash_2023_09_12T05_00-1694514870.log
I believe these are the two log messages it errored on:
{"syslog_timestamp":"Sep 12 05:34:08","service":{"type":"system"},"timestamp":"2023-09-12T10:34:08.629841493Z","syslog_program":"rc_service:","log":{"syslog":{"facility":{"code":0,"name":"kernel"},"priority":0,"severity":{"code":0,"name":"Emergency"}}},"version":"1","message":"httpd 25066:notify_rc restart_logger\n","host":{"ip":"192.168.88.226"},"syslog_hostname":"ap1-D018EB1-C","event":{"original":"<8>Sep 12 05:34:08 ap1-D018EB1-C rc_service: httpd 25066:notify_rc restart_logger\n"}}
{"syslog_timestamp":"Sep 12 05:34:08","service":{"type":"system"},"timestamp":"2023-09-12T10:34:08.765107583Z","syslog_program":"kernel:","log":{"syslog":{"facility":{"code":0,"name":"kernel"},"priority":0,"severity":{"code":0,"name":"Emergency"}}},"version":"1","message":"klogd started: BusyBox v1.24.1 (2023-05-06 11:35:34 CST)\n","host":{"ip":"192.168.88.226"},"syslog_hostname":"ap1-D018EB1-C","event":{"original":"<13>Sep 12 05:34:08 ap1-D018EB1-C kernel: klogd started: BusyBox v1.24.1 (2023-05-06 11:35:34 CST)\n"}}
I appreciate any input. I am new to this, so any guidance would be appreciated. I suspect a filter would help, but I'm not comfortable with them yet.
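A filter of the kind mentioned above could look like the sketch below. It is untested and rests on a few assumptions: that the warnings come from top-level event fields (log, version, and likely service, host, event and timestamp) that are not listed in csv_schema, that the stock prune filter plugin is installed, and that dropping every field outside the schema is acceptable.

```
filter {
  # Keep only the top-level fields named in csv_schema.
  # Anything else (log, service, host, event, ...) is removed
  # before the event reaches the google_bigquery output.
  prune {
    whitelist_names => [
      "^syslog_timestamp$",
      "^syslog_hostname$",
      "^syslog_program$",
      "^message$"
    ]
  }
}
```

The entries in whitelist_names are regular expressions, so the anchors (^ and $) keep a pattern like "message" from matching other field names. If version or timestamp still appear in the errors afterwards, removing them explicitly with mutate's remove_field may be needed. The google_bigquery output also documents an ignore_unknown_values option, which might make the filter unnecessary; check it against the installed plugin version.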
Thank you