How to collect a custom metric using Metricbeat


#1

I have to create sampled time-series data by running a command (from an application on RHEL 7 Linux) at an interval of 5 minutes that generates output. Now I want to tag this time (at the fixed 5-minute interval) onto the data. For example: 2019-03-14 09:00:00 cmd_output_data1, 2019-03-14 09:05:00 cmd_output_data2... and so on.
In Logstash, I have to calculate the difference in time between two successive records to know the number of samples. So my requirement is to insert a very accurate timestamp, i.e. with zero-second precision like 09:00:00, 09:05:00, 09:10:00..., so that when I calculate the samples it will be diff(09:10:00 - 09:00:00) = 10 minutes, i.e. 10/5 = 2 intervals = 3 samples.
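The sample-count arithmetic above can be sketched in Python (a sketch only; it assumes timestamps already land exactly on 5-minute boundaries, and counts samples inclusively):

```python
from datetime import datetime

INTERVAL_MIN = 5  # sampling interval in minutes, as described in the post

def sample_count(start: str, end: str) -> int:
    """Number of samples between two aligned timestamps, inclusive of both ends."""
    fmt = "%Y-%m-%d %H:%M:%S"
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    minutes = (t1 - t0).total_seconds() / 60
    return int(minutes // INTERVAL_MIN) + 1

print(sample_count("2019-03-14 09:00:00", "2019-03-14 09:10:00"))  # 3
```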

I understand that Metricbeat should be used for collecting time-sampled data, but it works only with the modules embedded in it. My requirement is custom application metric collection, as explained above. Am I missing something?


(ruflin) #2

Welcome @msk_76

Can you share a bit more about the command / script that is run to create the metrics output? Metricbeat has some more generic modules / inputs like http which you can point at any endpoint. Perhaps we can do something similar here.


#3

The output of the command will be six space-separated columns:
col1 col2 col3 col4 col5 col6, plus an additional col7 for the time the command ran, like 2019-03-14 09:00:00.

Col1 is a username (string), col2 is a feature_name (string), col5 is the numeric usage of the feature by that username, and col6 is also numeric, the total number of feature_name available.
During transformation I have to add another column, say col8, holding the sum of usage (by all users) of a given feature at a given timestamp, appended to each row of data.

Hope I am able to explain the context.
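The col8 enrichment described above can be sketched in Python (a sketch only; the row values and the col3/col4 placeholders are made up for illustration):

```python
from collections import defaultdict

# Sample rows: col1=user, col2=feature, col3/col4=placeholders,
# col5=usage, col6=total available, col7=timestamp (all values invented)
rows = [
    ("alice", "featA", "x", "y", 2, 10, "2019-03-14 09:00:00"),
    ("bob",   "featA", "x", "y", 3, 10, "2019-03-14 09:00:00"),
    ("carol", "featB", "x", "y", 1, 5,  "2019-03-14 09:00:00"),
]

# First pass: sum usage per (feature, timestamp)
totals = defaultdict(int)
for r in rows:
    totals[(r[1], r[6])] += r[4]

# Second pass: append the per-feature sum as col8 to each row
enriched = [r + (totals[(r[1], r[6])],) for r in rows]
for r in enriched:
    print(r)
```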


(ruflin) #4

An alternative idea to make this happen would be writing this data to a log file, then read this log file with Filebeat and use dissect to split up the values. If you have all the raw values, I assume you can do all the calculations in real time on the Elasticsearch side.
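If that route is taken, the Filebeat side might look roughly like this (a sketch, not a tested config; the log path and field names are assumptions, and `tokenizer`/`field`/`target_prefix` are the dissect processor's options):

```yaml
# filebeat.yml fragment (sketch; path and names are assumptions)
filebeat.inputs:
  - type: log
    paths:
      - /var/log/myapp/usage.log

processors:
  - dissect:
      tokenizer: "%{col1} %{col2} %{col3} %{col4} %{col5} %{col6} %{date} %{time}"
      field: "message"
      target_prefix: "usage"
```

The same splitting could instead be done with the dissect filter in Logstash or an Elasticsearch ingest pipeline, if that fits the existing setup better.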


#5

I tried writing to a log file, but the issue is that if I keep the timestamp of when the data-collection command ran, it doesn't maintain an equal interval. For example,
in one case it records 09:00:00, in the next 5-minute sample 09:05:02, and in the third 09:10:01. So a difference of a few seconds gets recorded in the log file and I don't get an equal difference in time.
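One way around that drift (a sketch, not something from the thread) is to round each recorded timestamp to the nearest 5-minute boundary before writing it to the log, so 09:05:02 and 09:10:01 both snap to clean interval marks:

```python
from datetime import datetime, timedelta

INTERVAL_SEC = 5 * 60  # 5-minute sampling interval from the post

def snap(ts: str, fmt: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Round a timestamp to the nearest 5-minute boundary."""
    t = datetime.strptime(ts, fmt)
    midnight = t.replace(hour=0, minute=0, second=0)
    seconds = (t - midnight).total_seconds()
    snapped = round(seconds / INTERVAL_SEC) * INTERVAL_SEC
    return (midnight + timedelta(seconds=snapped)).strftime(fmt)

print(snap("2019-03-14 09:05:02"))  # 2019-03-14 09:05:00
print(snap("2019-03-14 09:10:01"))  # 2019-03-14 09:10:00
```

With the timestamps snapped like this, the diff between successive records is always an exact multiple of 5 minutes, which keeps the sample-count calculation in Logstash clean.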