[2023-07-10T13:39:12,132][WARN ][logstash.filters.json ][main][a871e8d442cc100e09c7a231768cf15707d7055b0522c9fb6fd652b249dabdb3] Error parsing json {:source=>"[@metadata][msg]", :raw=>"":"This consumer is idempotent and the file has been consumed before matching idempotentKey: amc_kafbb_def_20230621.data. Will skip this file: RemoteFile[amc_kafbb_def_20230621.data]","logger_name":"org.apache.camel.component.file.remote.SftpConsumer","thread_name":"Camel (camel-1) thread #2 - sftp:/id@server/some/files/name/abc/100/daily/node","level":"TRACE","level_value":5000,"appName":"some-app-name","applicationCode":"Marvel"}", :exception=>#<LogStash::Json::ParserError: Unrecognized token 'This': was expecting ('true', 'false' or 'null')
at [Source: (byte)"":"This consumer is idempotent and the file has been consumed before matching idempotentKey: pda_kafbb_mtl_20230621.data. Will skip this file: RemoteFile[amc_kafbb_def_20230621.data]","logger_name":"org.apache.camel.component.file.remote.SftpConsumer"
I think some of the values that syslog is sending contain spaces and special characters (such as @, spaces, and the forward slashes in URLs), and I believe that is what causes the JSON parse failure.
[msg]", :raw=>"\":\"This consumer is idempotent and the file has been consumed before matching idempotentKey: abc_def_krl_20230523.data.
:exception=>#<LogStash::Json::ParserError: Unrecognized token 'This': was expecting ('true', 'false' or 'null')
[logstash.filters.json ][main][5nmbadada-sds4264cb43a66a1224b8e440ccf439abd5626a] Error parsing json {:source=>"[@metadata][msg]", :raw=>"\":\"Call KafkaService.doPoll\",\"logger_name\":\"com.def.events.service.kafka.KafkaService\",\"thread_name\":\"Camel-camel-1-somealerts C-0\",\"level\":\"DEBUG\",\"level_value\":10000,\"Name\":\"some-alert-for-team\",\"app":\"marvel\"}", :exception=>#<LogStash::Json::ParserError: Unrecognized token 'Call': was expecting ('true', 'false' or 'null')
The message that was grokked into [@metadata][msg]
This consumer is idempotent and the file has been consumed before matching idempotentKey: abc_def_krl_20230523.data.
is not JSON. A json filter is going to raise an exception if you ask it to parse something that is not JSON. If only a subset of your syslog messages contain JSON, then you can either try to parse them all and accept that you will get many exceptions, or write conditionals that determine whether or not [@metadata][msg] looks like JSON.
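The second option could look roughly like the sketch below. It is a heuristic, not a validator: it only checks that the field starts with an opening brace (after optional whitespace) before handing it to the json filter, and it assumes the JSON payloads are always objects. The tag name `not_json` is made up for illustration.

```
filter {
  # Only attempt to parse when the field looks like a JSON object.
  if [@metadata][msg] =~ /^\s*\{/ {
    json {
      source => "[@metadata][msg]"
    }
  } else {
    # Hypothetical tag so non-JSON events can be found later.
    mutate { add_tag => [ "not_json" ] }
  }
}
```

Events that start with a brace but are still malformed will keep failing in the json filter, so the warnings may not disappear entirely.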
Thanks, guys, for your valuable input! I think I am almost there. I just need to filter out this value: "\n \u0000"
[2023-07-11T15:46:45,917][WARN ][logstash.filters.json ][main][1543fdc4018abd383f8789d4050a7904179800c6a9bd317f664c8ebaa5416a91] Error parsing json {:source=>"message", :raw=>"\n {\"proxyname\":\"test-abc-def\",\"revision\":\"8\",\"latency\":31,\"startTimestamp\":1689104805142,\"endTimestamp\":1689104805142,\"verb\":\"GET\",\"cn\":\"\",\"ip\":\"10.11.12.13\",\"url\":\"https://abc.def.ghi/jkl-mno-pqr\",\"msgTimestamp\":1689104805475,\"organizationName\":\"someorg-def\",\"apigeeEnvironment\":\"dev\"}\n \u0000", :exception=>#<LogStash::Json::ParserError: Illegal character ((CTRL-CHAR, code 0)): only regular white space (\r, \n, \t) is allowed between tokens
at [Source: (byte[])"
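One way to drop the NUL character is a mutate gsub placed ahead of the json filter. This is an untested sketch: it assumes the field is `message`, as in the log line above, and that Logstash runs with its default settings so the regex escape `\x00` reaches the regex engine unchanged.

```
filter {
  mutate {
    # Remove every NUL (code 0) character from the field.
    gsub => [ "message", "\x00", "" ]
  }
  json {
    source => "message"
  }
}
```

The leading and trailing "\n " should not need special handling, since the parser error itself says regular white space is allowed between tokens.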