Process VMware ESXi syslog with multiline events

I'm trying to process syslog events that are sent by a VMware ESXi server with Logstash. This works fine, except for multiline events.

Below an example of the raw data that is sent via syslog:

<166>2018-05-16T08:57:41.409Z host.domain.somewhere Hostd: info hostd[B885B70] [Originator@6876 sub=Hbrsvc] Replicator: UnregisterListener triggered for config VM 749
<166>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: info hostd[B4C2B70] [Originator@6876 sub=Vcsvc.VMotionSrc (5610940323356380425)] CompleteOp: Vmotion task succeeded with result: ( {
<166>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: --> vmDowntime = 4072,
<166>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: --> vmPrecopyStunTime = 650021,
<166>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: --> vmPrecopyBandwidth = 726706228
<166>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: --> }
<163>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: error hostd[B4C2B70] [Originator@6876 sub=VigorStatsProvider(179323808).GuestStats(749)] VigorCallback received fault: Disconnected from virtual machine.
<163>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: --> Remote disconnected
<163>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: -->
<166>2018-05-16T08:57:41.411Z host.domain.somewhere Fdm: info fdm[3840B70] [Originator@6876 sub=Invt opID=SWI-5a4d466d] [InventoryManagerImpl::RemoveVmLocked] vm /vmfs/volumes/ab8d5fbc-81dbb0f9/XXXXXXXXX/XXXXXXXXX.vmx (not protected) removed from local host; on 0 hosts

Each multiline item contains "-->" and can be matched with this grok filter:

%{SYSLOG5424PRI}%{TIMESTAMP_ISO8601:SyslogTimestamp} %{IPORHOST:Hostname} %{PROG:AppName}: -->%{GREEDYDATA:Message}

How can I combine the multiline items so that they become one event?


This will combine them. I don't think you will like the result though :)

stdin { codec => multiline { pattern => "-->" negate => false what => "previous" auto_flush_interval => 3 } }

Is there no other solution?

Can I use Filebeat for this?
Do I need to parse the lines into structured data (JSON)?


Filebeat can combine multiple lines into a single event in the same way that Logstash can. I would stick with the Logstash codec if it works. I would then use mutate+gsub to remove all the "<166>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: -->" junk, since it is repeated information. Then parse out whatever fields are useful to you.

OK, when a multiline item occurs, a "multiline" tag is added, so I can check for that.

The multiline event is parsed into this:

"message" => "<166>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: info hostd[B4C2B70] [Originator@6876 sub=Vcsvc.VMotionSrc (5610940323356380425)] CompleteOp: Vmotion task succeeded with result: ( {\r\n<166>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: --> vmDowntime = 4072,\r\n<166>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: --> vmPrecopyStunTime = 650021,\r\n<166>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: --> vmPrecopyBandwidth = 726706228\r\n<166>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd: --> }\r"

How can I strip out everything except the first occurrence of "<166>2018-05-16T08:57:41.410Z host.domain.somewhere Hostd:"?
The pattern for this is "%{SYSLOG5424PRI}%{TIMESTAMP_ISO8601:SyslogTimestamp} %{IPORHOST:Hostname} %{PROG:AppName}:"

mutate { gsub => [ "message", "<[0-9]+>.*: -->", "" ] }

You might also want to strip the carriage returns:

  mutate { gsub => [ "message", "\r", "" ] }

I ended up with this filter:

  if "multiline" in [tags] {
    grok {
      break_on_match => true
      match => [ "message", "%{SYSLOG5424PRI}%{TIMESTAMP_ISO8601:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{SYSLOGPROG:syslog_program}: %{GREEDYDATA:syslog_message}" ]
    }
    mutate {
      # one gsub array: each triplet is field, regex, replacement
      gsub => [ "syslog_message", "<[0-9]+>.* -->", "", "syslog_message", "\r", "" ]
    }
  }
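For anyone landing here later, a minimal end-to-end sketch that puts the pieces of this thread together might look like the one below. The stdin input and the rubydebug output are only there for testing. The "(?m)" prefix on the grok pattern is my own assumption and was not suggested in the replies: as far as I know, GREEDYDATA stops at the first newline without it, so only the first line of the joined event would end up in syslog_message. In this variant the repeated prefix is stripped from "message" before grok runs.

```
input {
  stdin {
    codec => multiline {
      # lines containing "-->" are continuation lines and join the previous line
      pattern => "-->"
      negate => false
      what => "previous"
      auto_flush_interval => 3
    }
  }
}

filter {
  if "multiline" in [tags] {
    mutate {
      gsub => [
        # remove the repeated "<PRI>timestamp host program: -->" prefix
        "message", "<[0-9]+>.*: -->", "",
        # remove the trailing carriage returns
        "message", "\r", ""
      ]
    }
  }
  grok {
    # (?m) is an assumption: it lets GREEDYDATA span the remaining newlines
    match => { "message" => "(?m)%{SYSLOG5424PRI}%{TIMESTAMP_ISO8601:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{SYSLOGPROG:syslog_program}: %{GREEDYDATA:syslog_message}" }
  }
}

output {
  stdout { codec => rubydebug }
}
```

With the sample data above, the vMotion event should come out as a single event whose syslog_message holds the first line followed by the "vmDowntime", "vmPrecopyStunTime" and "vmPrecopyBandwidth" lines without the syslog prefix.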

Thanks for your help!

