Hello,
(Apologies for any faux pas; this is my first time posting in any forum.)
I'm pulling an event log stored in a table from an Oracle DB via a JDBC connection. A simplified example of the data:
EVENTID  USER  TYPE     MESSAGE  DATE
1234     BOB   LOGIN    SUCCESS  4-OCT-2016 12:01:01
1234     BOB   BROWSER  IE8      4-OCT-2016 12:01:01
1235     JANE  LOGIN    SUCCESS  4-OCT-2016 12:01:01
1235     JANE  BROWSER  IE8      4-OCT-2016 12:01:01
User BOB logs in at 12:01 on the 4th and uses IE8 to do so.
User JANE also logs in at 12:01 on the 4th and also uses IE8.
The column datatype in the Oracle table is DATE, not TIMESTAMP, so I don't have fractional seconds to separate events. I would therefore like to group on the EVENTID, which is always unique to each event.
I'm aiming to create a single document in Elasticsearch for each event, like so:
eventid:1234
userid:bob
type: login
result: success
browser: IE8
@timestamp: event date
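To make the target concrete, here is a minimal Python sketch of the grouping I'm after. This is not Logstash code, only an illustration of the intended result; the function name `group_by_eventid` and the way I map TYPE/MESSAGE onto fields are my own assumptions based on the sample rows above.

```python
from collections import OrderedDict

# Sample rows as (EVENTID, USER, TYPE, MESSAGE, DATE), copied from the table above.
ROWS = [
    (1234, "BOB", "LOGIN", "SUCCESS", "4-OCT-2016 12:01:01"),
    (1234, "BOB", "BROWSER", "IE8", "4-OCT-2016 12:01:01"),
    (1235, "JANE", "LOGIN", "SUCCESS", "4-OCT-2016 12:01:01"),
    (1235, "JANE", "BROWSER", "IE8", "4-OCT-2016 12:01:01"),
]


def group_by_eventid(rows):
    """Merge all rows that share an EVENTID into one document (hypothetical helper)."""
    docs = OrderedDict()
    for eventid, user, row_type, message, date in rows:
        # Create the document the first time this EVENTID is seen.
        doc = docs.setdefault(
            eventid,
            {"eventid": eventid, "userid": user.lower(), "@timestamp": date},
        )
        if row_type == "LOGIN":
            # A LOGIN row supplies the event type and its result.
            doc["type"] = "login"
            doc["result"] = message.lower()
        elif row_type == "BROWSER":
            # A BROWSER row only adds the browser field.
            doc["browser"] = message
    return list(docs.values())


if __name__ == "__main__":
    for doc in group_by_eventid(ROWS):
        print(doc)
```

The key point is that rows are merged on the value of EVENTID, so the four sample rows should end up as two documents, one for 1234 and one for 1235.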
During testing, if the table contains a single event spread over multiple rows, my Logstash config works perfectly: it merges the multiline input from the JDBC input and creates a single document, as I expect. However, when the table contains more than one event, as in the data above, the multiline filter doesn't split the input by eventid but treats everything as a single event, like so:
"eventid" => [
[0] 1234,
[1] 1235
...
Here is my multiline filter:
multiline {
  pattern => "^%{NUMBER}"
  what => "next"
  negate => true
  allow_duplicates => false
}
All of the examples online use multiline with a timestamp, where the following lines are preceded by whitespace or something equally easy to handle.
So my question is:
How do I get the multiline filter or codec to group lines by the value of the eventid, not just by the pattern that value matches?
Many thanks for taking the time to read this. I hope you can help.
Regards,
James