Parsing xml log problem


(Daniele) #1

Hi,
i'm trying to parse an xml log (mysql audit log) but i'm having problem on how the otput is done.
My goal is to parse an xml and write it as is, but seems that logstash does not read/write input file in it's original sequence. I'm using logstash 6.4.0 on Windows.
My pipeline:

`input {
  file {
	  path => "C:/Users/lite/Desktop/work/input/audit_log.xml"
	  start_position => "beginning"
	  sincedb_path => "NUL"
	  type => "audit_file"
	}
}

output {
	stdout { codec => rubydebug }
	file { 
		path => "C:/Users/lite/Desktop/work/output/output.xml"
		codec => line {
			format => "%{[message]}"
		}
	}
}`

Input xml log file:

`<?xml version="1.0" encoding="UTF-8"?>
<AUDIT>
 <AUDIT_RECORD>
  <TIMESTAMP>2018-09-11T15:01:45 UTC</TIMESTAMP>
  <RECORD_ID>1_2018-09-11T15:01:45</RECORD_ID>
  <NAME>Audit</NAME>
  <SERVER_ID>1</SERVER_ID>
  <VERSION>1</VERSION>
  <STARTUP_OPTIONS>C:\Program Files\MySQL\MySQL Server 8.0\bin\mysqld.exe --defaults-file=C:\ProgramData\MySQL\MySQL Server 8.0\my.ini</STARTUP_OPTIONS>
  <OS_VERSION>x86_64-Win64</OS_VERSION>
  <MYSQL_VERSION>8.0.12-commercial</MYSQL_VERSION>
 </AUDIT_RECORD>
 <AUDIT_RECORD>
  <TIMESTAMP>2018-09-11T15:11:05 UTC</TIMESTAMP>
  <RECORD_ID>2_2018-09-11T15:01:45</RECORD_ID>
  <NAME>NoAudit</NAME>
  <SERVER_ID>1</SERVER_ID>
 </AUDIT_RECORD>
  <AUDIT_RECORD>
  <TIMESTAMP>2018-09-11T15:01:45 UTC</TIMESTAMP>
  <RECORD_ID>1_2018-09-11T15:01:45</RECORD_ID>
  <NAME>Audit</NAME>
  <SERVER_ID>1</SERVER_ID>
  <VERSION>1</VERSION>
  <STARTUP_OPTIONS>C:\Program Files\MySQL\MySQL Server 8.0\bin\mysqld.exe --defaults-file=C:\ProgramData\MySQL\MySQL Server 8.0\my.ini</STARTUP_OPTIONS>
  <OS_VERSION>x86_64-Win64</OS_VERSION>
  <MYSQL_VERSION>8.0.12-commercial</MYSQL_VERSION>
 </AUDIT_RECORD>
 <AUDIT_RECORD>
  <TIMESTAMP>2018-09-11T15:11:05 UTC</TIMESTAMP>
  <RECORD_ID>2_2018-09-11T15:01:45</RECORD_ID>
  <NAME>Audit</NAME>
  <SERVER_ID>2</SERVER_ID>
 </AUDIT_RECORD>
</AUDIT>`

output xml:

`<MYSQL_VERSION>8.0.12-commercial</MYSQL_VERSION>
 <AUDIT_RECORD>
  <RECORD_ID>2_2018-09-11T15:01:45</RECORD_ID>
  <SERVER_ID>2</SERVER_ID>
</AUDIT>
<AUDIT>
 <AUDIT_RECORD>
  <TIMESTAMP>2018-09-11T15:01:45 UTC</TIMESTAMP>
  <RECORD_ID>1_2018-09-11T15:01:45</RECORD_ID>
  <NAME>Audit</NAME>
  <SERVER_ID>1</SERVER_ID>
  <VERSION>1</VERSION>
  <STARTUP_OPTIONS>C:\Program Files\MySQL\MySQL Server 8.0\bin\mysqld.exe --defaults-file=C:\ProgramData\MySQL\MySQL Server 8.0\my.ini</STARTUP_OPTIONS>
  <OS_VERSION>x86_64-Win64</OS_VERSION>
  <MYSQL_VERSION>8.0.12-commercial</MYSQL_VERSION>
 </AUDIT_RECORD>
 <AUDIT_RECORD>
  <TIMESTAMP>2018-09-11T15:11:05 UTC</TIMESTAMP>
  <RECORD_ID>2_2018-09-11T15:01:45</RECORD_ID>
  <NAME>NoAudit</NAME>
  <SERVER_ID>1</SERVER_ID>
 </AUDIT_RECORD>
  <AUDIT_RECORD>
  <TIMESTAMP>2018-09-11T15:01:45 UTC</TIMESTAMP>
  <RECORD_ID>1_2018-09-11T15:01:45</RECORD_ID>
  <NAME>Audit</NAME>
  <SERVER_ID>1</SERVER_ID>
  <VERSION>1</VERSION>
  <STARTUP_OPTIONS>C:\Program Files\MySQL\MySQL Server 8.0\bin\mysqld.exe --defaults-file=C:\ProgramData\MySQL\MySQL Server 8.0\my.ini</STARTUP_OPTIONS>
  <OS_VERSION>x86_64-Win64</OS_VERSION>
 </AUDIT_RECORD>
  <TIMESTAMP>2018-09-11T15:11:05 UTC</TIMESTAMP>
  <NAME>Audit</NAME>
 </AUDIT_RECORD>
<AUDIT>
  <TIMESTAMP>2018-09-11T15:01:45 UTC</TIMESTAMP>
  <NAME>Audit</NAME>
  <VERSION>1</VERSION>
  <OS_VERSION>x86_64-Win64</OS_VERSION>
 </AUDIT_RECORD>
  <TIMESTAMP>2018-09-11T15:11:05 UTC</TIMESTAMP>
  <NAME>NoAudit</NAME>
 </AUDIT_RECORD>
  <TIMESTAMP>2018-09-11T15:01:45 UTC</TIMESTAMP>
  <NAME>Audit</NAME>
  <VERSION>1</VERSION>
  <OS_VERSION>x86_64-Win64</OS_VERSION>
 </AUDIT_RECORD>
  <TIMESTAMP>2018-09-11T15:11:05 UTC</TIMESTAMP>
  <NAME>Audit</NAME>
 </AUDIT_RECORD>
<?xml version="1.0" encoding="UTF-8"?>
 <AUDIT_RECORD>
  <RECORD_ID>1_2018-09-11T15:01:45</RECORD_ID>
  <SERVER_ID>1</SERVER_ID>
  <STARTUP_OPTIONS>C:\Program Files\MySQL\MySQL Server 8.0\bin\mysqld.exe --defaults-file=C:\ProgramData\MySQL\MySQL Server 8.0\my.ini</STARTUP_OPTIONS>
  <MYSQL_VERSION>8.0.12-commercial</MYSQL_VERSION>
 <AUDIT_RECORD>
  <RECORD_ID>2_2018-09-11T15:01:45</RECORD_ID>
  <SERVER_ID>1</SERVER_ID>
  <AUDIT_RECORD>
  <RECORD_ID>1_2018-09-11T15:01:45</RECORD_ID>
  <SERVER_ID>1</SERVER_ID>
  <STARTUP_OPTIONS>C:\Program Files\MySQL\MySQL Server 8.0\bin\mysqld.exe --defaults-file=C:\ProgramData\MySQL\MySQL Server 8.0\my.ini</STARTUP_OPTIONS>
  <MYSQL_VERSION>8.0.12-commercial</MYSQL_VERSION>
 <AUDIT_RECORD>
  <RECORD_ID>2_2018-09-11T15:01:45</RECORD_ID>
  <SERVER_ID>2</SERVER_ID>
</AUDIT>`

Any ideas on how to adjust output in order to be same as input?
Thank You in advance!


(Magnus Bäck) #2

You want to parse an XML file but then write the original file to a new location? If you set the number of pipeline workers to 1 (one) your configuration should work okay, but I suggest you tell us more about your use case. There might be a better way of solving the underlying problem.


(Daniele) #3

Thank you magnusbaeck, for example i need to parse the xml and remove this part: <?xml version="1.0" encoding="UTF-8"?>; but the output xml is not same as the input one, as described in my #1 post.


(Magnus Bäck) #4

Oh, I looked too quickly and thought the the problem only was that the order of the lines had been mixed up.

Given that the order is screwed up it's hard to figure out what kind of transform you want to apply, but it sounds like something that should be done with XSLT. I don't think Logstash is the right tool for what you're trying to do.

for example i need to parse the xml and remove this part: <?xml version="1.0" encoding="UTF-8"?>

Why on earth for?


(Daniele) #5

Thank you magnusbaeck, your advice resolved my problem.


(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.