Filebeat behavior

Jemli_Fathi · July 10, 2016, 11:10pm

I have an XML file that another program updates at arbitrary times by appending new documents to it.
The file is also reset every day by emptying it.

I configured Filebeat to capture every XML document in this file that matches the format <H_Ticket>...</H_Ticket>, using this configuration:

    filebeat:
      # List of prospectors to fetch data.
      prospectors:
        - paths:
            - C:\busesdata\*.xml
          input_type: log
          exclude_lines: ["^.*xml"]
          #ignore_older: 10s
          #close_older: 1h
          document_type: ticket
          scan_frequency: 15s
          multiline:
            pattern: '<H_Ticket'
            negate: true
            match: after
    output:
      ### Logstash as output
      logstash:
        hosts: ["localhost:5044"]
        index: filebeat

It works very well when adding many XML docs at the end of the file, but it sends an empty event when adding a single document, for example:

<H_Ticket>ticket1</H_Ticket> <H_Ticket>ticket2</H_Ticket>

=> it works properly

<H_Ticket>ticket</H_Ticket>

=> empty event

First, is this behavior due to a wrong multiline setting or some other misconfiguration?
Second, in my case, do I have to use the ignore_older and close_older params to guarantee a smooth pipeline? If so, how should they be set?

Thank you in advance

ruflin · July 12, 2016, 8:30am

Do you have a new line at the end of the single event? I'm somewhat surprised that an empty event is sent. Be aware that multiline.timeout: 5s applies to the last event in a file as long as no new event is added.

Are the events appended to the file identical for single or combined events?

What do you mean by "initialized"? Is the same file truncated, or is it deleted and a new one created with the same name?
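
For reference, a minimal sketch of where the timeout mentioned above would sit in the prospector from the original post. It assumes the Filebeat 1.x option layout, and the 5s value is simply the one quoted above, not a tuned recommendation:

```yaml
filebeat:
  prospectors:
    - paths:
        - C:\busesdata\*.xml
      input_type: log
      multiline:
        pattern: '<H_Ticket'
        negate: true
        match: after
        # Send the pending multiline event if no further line arrives
        # within this window (value taken from the reply above).
        timeout: 5s
```

With negate: true and match: after, every line that does not match the pattern is appended to the preceding matching line, so the last ticket in the file stays pending until either a new <H_Ticket line or the timeout closes it.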

Jemli_Fathi · July 12, 2016, 5:48pm

Thank you

By empty event I mean an event generated by Filebeat that looks like this: {}
I didn't know about the multiline.timeout option. Is it a new configuration option?

Yes, the events are identical, but the first appended event is always parsed as an empty event by Filebeat, whether I add one event or several to the file.

By initialized I mean that the same file will be completely emptied.

ruflin · July 13, 2016, 6:33pm

Is the full event just {}, or only the message? Some fields should always be sent, for example the timestamp and some basic beat info.

Here you can find all the docs for multiline, including timeout: https://www.elastic.co/guide/en/beats/filebeat/1.2/configuration-filebeat-options.html#multiline I thought timeout had existed since multiline was introduced.

Sorry to ask again, but depopulated = truncated the file = remove all content inside the file?

Can you share 2 full events? That will make it easier to see if there is perhaps something wrong with multiline or exclude_lines. Did you ever remove exclude_lines and check if everything works as expected?
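
A sketch of what such a test configuration might look like, based on the config from the original post with exclude_lines commented out. The console output section is an assumption added here so the full events can be read directly; it is not part of the poster's setup, and the Logstash output could be kept instead:

```yaml
filebeat:
  prospectors:
    - paths:
        - C:\busesdata\*.xml
      input_type: log
      # Temporarily disabled to check whether it causes the empty event.
      #exclude_lines: ["^.*xml"]
      document_type: ticket
      scan_frequency: 15s
      multiline:
        pattern: '<H_Ticket'
        negate: true
        match: after
output:
  # Assumption: print events to the console for inspection
  # instead of shipping them to Logstash.
  console:
    pretty: true
```

If the empty event disappears with this configuration, exclude_lines is the likely cause; if it remains, the multiline settings are the next thing to look at.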

Jemli_Fathi · July 15, 2016, 10:55pm

Sorry, I'm an ES newbie. By depopulated I mean that the program removes all content inside the XML file.

Following is a sample content from the XML file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml-stylesheet href="ticket.xsl" type="text/xsl"?>
<HF_DOCUMENT>
	<H_Ticket>
		<IDH_Ticket>31</IDH_Ticket>
		<CodeBus>186</CodeBus>
		<CodeCh>5531</CodeCh>
		<CodeConv>5531</CodeConv>
		<Codeligne>12</Codeligne>
		<Date>20151217</Date>
		<Heur>1214</Heur>
		<NomFR1>SOUK AHAD</NomFR1>
		<NomFR2>CHOTT MERIEM </NomFR2>
		<Prix>0.8</Prix>
		<IDTicket>31</IDTicket>
		<CodeRoute>107</CodeRoute>
		<origine>01</origine>
		<Distination>09</Distination>
		<Num>1</Num>
		<Ligne>107</Ligne>
		<requisition> </requisition>
		<voyage>0</voyage>
		<faveur> </faveur>
	</H_Ticket>
	<H_Ticket>
		<IDH_Ticket>32</IDH_Ticket>
		<CodeBus>186</CodeBus>
		<CodeCh>5531</CodeCh>
		<CodeConv>5531</CodeConv>
		<Codeligne>12</Codeligne>
		<Date>20151217</Date>
		<Heur>1214</Heur>
		<NomFR1>SOUK AHAD</NomFR1>
		<NomFR2>SOVIVA </NomFR2>
		<Prix>0.66</Prix>
		<IDTicket>32</IDTicket>
		<CodeRoute>107</CodeRoute>
		<origine>01</origine>
		<Distination>07</Distination>
		<Num>2</Num>
		<Ligne>107</Ligne>
		<requisition> </requisition>
		<voyage>0</voyage>
		<faveur> </faveur>
	</H_Ticket>
	<H_Ticket>
		<IDH_Ticket>33</IDH_Ticket>
		<CodeBus>186</CodeBus>
		<CodeCh>5531</CodeCh>
		<CodeConv>5531</CodeConv>
		<Codeligne>12</Codeligne>
		<Date>20151217</Date>
		<Heur>1215</Heur>
		<NomFR1>SOUK AHAD</NomFR1>
		<NomFR2>KANTAOUI </NomFR2>
		<Prix>0.66</Prix>
		<IDTicket>33</IDTicket>
		<CodeRoute>107</CodeRoute>
		<origine>01</origine>
		<Distination>06</Distination>
		<Num>1</Num>
		<Ligne>107</Ligne>
		<requisition> </requisition>
		<voyage>0</voyage>
		<faveur> </faveur>
	</H_Ticket>
</HF_DOCUMENT>

ruflin · July 20, 2016, 2:27pm

Sorry for the late reply. Just to be sure: you don't want everything between <HF_DOCUMENT> and </HF_DOCUMENT> in one event, but each <H_Ticket>...</H_Ticket> sub-entry as its own event? I assume every event starts like this, so the first three lines and the last line should never be sent?

DavidL · July 22, 2016, 4:36pm

I hope your problem is solved. Just a suggestion: could you use a more detailed subject line for question topics in the future? That way people have a better chance of finding what they need when searching, and they don't create duplicate topics.

