Send Ubuntu Syslogs to Azure Event Hub with Logstash - JSON issue

Situation: Using Logstash to forward Ubuntu (Azure VM) syslogs to an Azure Event Hub.
Problem: Using the "json" or "json_batch" format results in the _jsonparsefailure tag being added to events, because syslog isn't JSON. Is there an input/filter/output plugin or other strategy that will help me send multiple Linux syslog events as json_batch in the following structure?

[
  {
    "records": [
      {
        r1
      },
      {
        r2
      },
      ...
      {
        rn
      }
    ]
  }
]
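
Concretely, with the CRON lines shown below, I'd expect each record to carry the parsed syslog fields, something like this (the field names here are purely illustrative, not fields Logstash produces by default):

[
  {
    "records": [
      {
        "timestamp": "Nov  1 00:05:01",
        "host": "ubuntu-ls-test",
        "program": "CRON",
        "pid": "3737",
        "message": "(sshadmin) CMD (/home/sshadmin/test.sh)"
      },
      {
        "timestamp": "Nov  1 00:06:01",
        "host": "ubuntu-ls-test",
        "program": "CRON",
        "pid": "3747",
        "message": "(sshadmin) CMD (/home/sshadmin/test.sh)"
      }
    ]
  }
]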
    

~$ cat /var/log/syslog shows what the logs look like. I have a cron job running a test script every minute to generate log data:

Nov  1 00:04:01 ubuntu-ls-test CRON[3727]: (CRON) info (No MTA installed, discarding output)
Nov  1 00:05:01 ubuntu-ls-test CRON[3737]: (sshadmin) CMD (/home/sshadmin/test.sh)
Nov  1 00:05:01 ubuntu-ls-test CRON[3736]: (CRON) info (No MTA installed, discarding output)
Nov  1 00:06:01 ubuntu-ls-test CRON[3747]: (sshadmin) CMD (/home/sshadmin/test.sh)

Adding stdout { codec => rubydebug } to the output section shows the events are read/parsed like this:

"event" => {
        "original" => "<78>Nov  1 17:07:01 ubuntu-ls-test CRON[7626]: (CRON) info (No MTA installed, discarding output)\n"
    }


"event" => {
        "original" => "<78>Nov  1 17:07:01 ubuntu-ls-test CRON[7628]: (sshadmin) CMD (/home/sshadmin/test.sh)\n"

My guess is that the raw syslog line starts with a priority prefix like '<78>', which isn't valid JSON (perhaps?). If that's the case, how would one go about the conversion? A grok pattern? I have tried a few and kept getting the _grokparsefailure tag (I'm currently not using any filter plugin).
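
For illustration, this is roughly the kind of filter I've been attempting (an untested sketch based on the common RFC3164 grok examples; the syslog_* field names are just placeholders I picked, and it assumes the full raw line, including the <78> prefix, lands in the message field):

filter {
  grok {
    # Strip the <78> priority prefix and split the RFC3164 line into fields
    match => {
      "message" => "<%{POSINT:syslog_pri}>%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}"
    }
  }
  # Decode the numeric priority into facility/severity fields
  syslog_pri { }
  # Use the syslog timestamp as the event @timestamp
  date {
    match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
  }
}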

My input plugin looks like this:

input {
  syslog {
    type => "syslog"
    port => "5514"
  }
}

I have two working output plugins that forward the data to an Azure Event Hub:

output {
  azure_event_hubs {
    service_namespace => "EventHubNamespace"   # Exclude .servicebus.windows.net
    event_hub => "myEventHub"
    sas_key_name => "RootManageSharedAccessKey"
    sas_key => '##########################='
    properties_bag => {
      "Format" => "json"
    }
  }
}

And this one:


output {
  http {
    url => "https://EventHubNamespace.servicebus.windows.net/MyEventHub/messages"
    content_type => "application/x-www-form-urlencoded"
    http_method => "post"
    format => "json_batch"
    message => '%{@timestamp} : %{message}'
    headers => {
      "Host" => "EventHubNamespace.servicebus.windows.net"
      "Authorization" => "#####(SAS Key)###############"
    }
  }
}

While both include a JSON format setting, my Event Hub can read the data as plain text, but not in the desired format.

Has anyone successfully done this?

Thanks in advance for any guidance.

Some fields are generated by the Event Hub itself, for example EventProcessedUtcTime, and they can't be eliminated unless the events are packaged in the "records" structure noted above.

Is there an easy way to do this from the Logstash side, in the config?
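
One idea I've been playing with (a rough sketch I haven't verified end to end): use a ruby filter to serialize each event into the "records" envelope and send that string as the request body. The [@metadata][payload] field name is just something I made up, and note this wraps one event per request rather than a true batch:

filter {
  ruby {
    code => '
      # Wrap the event hash in the {"records": [...]} envelope
      wrapped = { "records" => [ event.to_hash ] }
      event.set("[@metadata][payload]", LogStash::Json.dump(wrapped))
    '
  }
}

output {
  http {
    url => "https://EventHubNamespace.servicebus.windows.net/MyEventHub/messages"
    http_method => "post"
    content_type => "application/json"
    format => "message"   # send the rendered message string as the body
    message => "%{[@metadata][payload]}"
    headers => {
      "Authorization" => "#####(SAS Key)###############"
    }
  }
}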
