Hi everyone!
I have the following file I need to process in Logstash:
{
"DataChangeInfo" : "Archive Log Set archiveLogSet.25933761.25933688 Info:
Thread# Sequence# FirstScn LastScn",
"documentsList" : [
{
"commandScn": "25933758",
"commandCommitScn": "0",
"commandSequence": "3",
"commandType": "INSERT",
"commandTimestamp": "2017-12-07 05:09:54+03:000",
"objectDBName": "DB4",
"objectSchemaName": "CFTNAXDEV",
"objectId": "NEWJOURNAL",
"changedFieldsList": [
{
"fieldId": "PK_NEWJOURNAL",
"fieldType": "NUMBER",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "OFFICE",
"fieldType": "CHAR",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "USERNAME",
"fieldType": "VARCHAR2",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "TERMINALID",
"fieldType": "VARCHAR2",
"fieldValue": "dajdajljdaljda,",
"fieldChanged": "Y"
},
{
"fieldId": "MODULEID",
"fieldType": "NUMBER",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "VERSION",
"fieldType": "VARCHAR2",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "DESCRIPTION",
"fieldType": "VARCHAR2",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "MID",
"fieldType": "CHAR",
"fieldValue": "CITIUS33XXX ",
"fieldChanged": "Y"
},
{
"fieldId": "STATUS",
"fieldType": "VARCHAR2",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "REFERENCE",
"fieldType": "VARCHAR2",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "ACTIONID1",
"fieldType": "NUMBER",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "ACTIONID2",
"fieldType": "NUMBER",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "TIME_STAMP",
"fieldType": "CHAR",
"fieldValue": "07-12-2017 23:11:48 ",
"fieldChanged": "Y"
},
{
"fieldId": "CLASS",
"fieldType": "CHAR",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "PRICING_WEIGHT",
"fieldType": "NUMBER",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "FIELD_ID",
"fieldType": "NUMBER",
"fieldValue": "4094399774",
"fieldChanged": "Y"
},
{
"fieldId": "FAULT",
"fieldType": "CHAR",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "AUDIT_SUBTYPE",
"fieldType": "VARCHAR2",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "ERROR_SEVERITY",
"fieldType": "NUMBER",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "HIT_INFO",
"fieldType": "VARCHAR2",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "ZONE_CODE",
"fieldType": "VARCHAR2",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "IP_ADDRESS",
"fieldType": "VARCHAR2",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "ERROR_CODE",
"fieldType": "NUMBER",
"fieldValue": "NULL",
"fieldChanged": "Y"
},
{
"fieldId": "LOG_DATE",
"fieldType": "TIMESTAMP(6)",
"fieldValue": "2017-12-07 05:09:54.941+03:000",
"fieldChanged": "Y"
},
{
"fieldId": "LOG_DATE2",
"fieldType": "TIMESTAMP(6) WITH TIME ZONE",
"fieldValue": "2017-12-06 22:09:54.941-05:00",
"fieldChanged": "Y"
}
],
"conditionFieldsList": []
}
]
}
I tried a few configurations; let's take this one as an example:
input {
  file {
    path => "/path/to/file"
    start_position => "beginning"
  }
}
filter {
  json {
    source => "message"
    target => "json_msg"
  }
}
output {
  stdout { codec => rubydebug }
}
The first issue I ran into is that Logstash did not recognize the third line of the data (the one starting with "Thread") as a continuation of the line above it, even though together they form a valid JSON string. Instead it treats it as a separate event, and the json filter fails with a parse error. How can I fix this?
I tried using multiline with the pattern ^Thread and it didn't work.
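For reference, the multiline attempt looked roughly like this (a sketch from memory; the exact options I used may have differed):

```conf
input {
  file {
    path => "/path/to/file"
    start_position => "beginning"
    # join any line matching the pattern onto the previous line
    codec => multiline {
      pattern => "^Thread"
      what => "previous"
    }
  }
}
```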
I moved on by fixing it manually, joining the two lines into one.
When I run Logstash, I see that it processes the data one line at a time instead of as a single JSON document.
I tried using json_lines as the codec in the input, to no avail.
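Concretely, that attempt was along these lines (a sketch; as I understand it, json_lines expects one complete JSON object per line, which my file is not):

```conf
input {
  file {
    path => "/path/to/file"
    start_position => "beginning"
    # decodes newline-delimited JSON, one object per line
    codec => json_lines
  }
}
```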
What is the problem? How can I fix it?
I tried using multiline with the patterns \s, \s+, \t, \t+, [\s\t]... nothing helped.
I tried minifying the JSON online (with a JSON minifier site) and feeding it in via stdin (instead of the file input), and everything worked fine.
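That working stdin test looked roughly like this (the config file and JSON file names here are just placeholders):

```conf
input {
  stdin {
    # each incoming line is parsed as one whole JSON object
    codec => json
  }
}
output {
  stdout { codec => rubydebug }
}
```

run with something like `bin/logstash -f test.conf < minified.json`.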
With this in mind, my question is: is there a way for me to "minify" the JSON fed into Logstash via the file input plugin, in case the issues above can't be solved?
I should mention that the Logstash logs are fine; no errors are shown.
To sum up my questions:
- How can I make Logstash understand that lines 2 and 3 of the data (the line starting with the word "Thread") belong together, as they do in the JSON structure (there is no "," between them)?
- How can I make Logstash parse all the lines together rather than one at a time? How can I make it understand that this is a single JSON object? Using the json filter (in the filter) and the json codec (in the input) didn't help, and neither did the json_lines codec.
- If question 2 can't be solved easily, how can I minify the JSON data in a way similar to the JSON minifier site? That seems to be the only way to make it work.
Any help would be appreciated!