Hi,
I could use some help. I need to parse a specific log format. It looks like this:
{"properties":{"columns":["time","v0"],"db":"default","vs":"/Common/Test_10+/Common/BaDOS","metric":"base.conns","sample_rate":"1000"},"values":[
[1536828341000,0],
[1536828343000,0],
[1536828344000,0],
[1536828345000,0],
[1536828346000,0]
Every second a line is added. I need the fields columns, db, vs, metric, sample_rate, and the values. The first entry in each values pair is the Unix timestamp (in milliseconds). The server only appends new value lines to the log. So any help on how to achieve this would be great.
Are these files on disk? Are you using Beats or Logstash to send these files to Elasticsearch? Can you provide an example of the data (as is, or as you want it to be) sent to Elasticsearch?
Every second a line is added.
Are you trying to process the lines out of the JSON array? If so, it would be much easier to process actual JSON with a reasonably sized values array.
Yes, these files are on disk. I tried using Filebeat, but it's not indexing the fields. Yes, the JSON array is extended every second with a new timestamp and value. What I get at the moment is that the file is read line by line, so the field names are missing. What do you mean by a reasonably sized values array? How do I configure that?
Thanks
Once you can get fully formed JSON as the payload, you have much better options for processing it. For example, in Logstash you will want to look at the split filter (Split filter plugin | Logstash Reference [8.11] | Elastic) to create a new event for each entry in the values array (i.e. full JSON/Elasticsearch documents). You may also want to look at the ingest node's Foreach processor (Foreach processor | Elasticsearch Guide [8.11] | Elastic) to accomplish the same, though that would likely require some custom scripting. Either way will take a bit of data wrangling to shape the documents into something meaningful to index. (In other words, there is probably no easy button for this type of log file.)
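Here is a minimal Logstash sketch of the split filter approach, assuming the fully formed JSON arrives as a single event in the message field; the metric_time and metric_value field names are just placeholders I picked:

filter {
  # Parse the whole JSON payload into fields (properties, values, ...)
  json {
    source => "message"
  }
  # Clone the event once per [timestamp, value] pair in the values array
  split {
    field => "values"
  }
  # After the split, "values" holds a single two-element pair
  mutate {
    add_field => {
      "metric_time"  => "%{[values][0]}"
      "metric_value" => "%{[values][1]}"
    }
  }
  # add_field produces strings, so convert the value in a second mutate
  mutate {
    convert => { "metric_value" => "float" }
  }
  # The first entry is epoch milliseconds; use it as the event timestamp
  date {
    match => ["metric_time", "UNIX_MS"]
  }
}

Each resulting event then carries the properties fields (db, vs, metric, sample_rate) plus one timestamped value, which is a reasonable document shape to index.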
What do you mean by a reasonably sized values array? How do I configure that?
You said that the values array is added to every second. I assume that at some point the system logging the event will stop adding to the current JSON and start a new JSON structure. If that happens every minute, then you will have 60 entries in your values array (which is very reasonable), and the event can be sent every minute. However, if the server keeps adding new lines every second for a whole day, then the size of the values array becomes unreasonable, and the delay before you have fully formed JSON to process is also an unreasonable one day. How often a new JSON structure is started (and how many values end up in the array) is a property of the server generating the logs.
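For illustration, a fully formed version of your sample (shortened to three values) would look like this; the closing brackets are what make the payload parseable as a whole, assuming your server eventually writes them:

{"properties":{"columns":["time","v0"],"db":"default","vs":"/Common/Test_10+/Common/BaDOS","metric":"base.conns","sample_rate":"1000"},"values":[
[1536828341000,0],
[1536828343000,0],
[1536828344000,0]
]}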