How to find out which Logstash filter is best for your data

Hi,

I am new to ELK. I started learning it and ran into some issues with my own data. That is when I realized I was using the wrong filter, which is why I am not getting the desired output.

This is a sample of my data. Can anyone tell me which filter I have to use?

35858786,5-17-18 19:54,RA05,2200110035,2.3.6,10,30
35858787,5-17-18 19:54,RL26,2200110037,2.3.6,ERR,78
35858788,5-17-18 19:54,RA05,2200110037,2.3.6,20,78
35858789,5-17-18 19:54,RA05,1100110274,2.3.5,20,15
35858790,5-17-18 19:54,RB03,1100110192,2.2.8,5B00504CE3A4,20.3
35858792,5-17-18 19:54,RV02,1100110085,2.3.6,02004931057F,OFF
35858794,5-17-18 19:54,RB03,4400110018,2.3.6,3F001AC5FF1F,3
35858796,5-17-18 19:54,RV02,4400110029,2.3.6,5A009954B92E,OFF
35858798,5-17-18 19:54,RA05,2200110035,2.3.6,10,30
35858799,5-17-18 19:54,RA05,1100110081,2.3.6,20,110
35858800,5-17-18 19:54,RA05,1100110251,2.3.a,2,50
35858801,5-17-18 19:54,RB03,4400110031,2.3.6,5A009966FB5E,10
35858803,5-17-18 19:54,RA05,2200110035,2.3.6,10,30
35858804,5-17-18 19:54,RV02,1100110192,2.2.8,1E002CE5E83F,OFF
35858806,5-17-18 19:54,RB03,4400110015,2.3.6,51004D96FB71,6.6
35858808,5-17-18 19:54,RB03,1100110107,2.3.1,CHINTPURNI07,0.2
35858810,5-17-18 19:54,RA05,1100110107,2.3.1,10,30
35858811,5-17-18 19:54,RA05,1100110189,2.3.5,20,81
35858812,5-17-18 19:54,RV02,1100110285,2.3.5,1E002D151E38,OFF

It depends on the expected outcome, but the easiest way to split this data into multiple fields is to use a csv filter.
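Something along these lines, with column names guessed from the sample (adjust them to whatever the fields actually mean):

filter {
    csv {
        # Column names here are guesses based on the sample rows; rename to match reality.
        columns => ["log_id", "log_time", "msg_code", "device_id", "version", "value", "extra"]
        separator => ","
    }
}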

@magnusbaeck I used the csv filter, but I get a _dateparsefailure error on some logs, maybe because some strings were not the same size or they contain nulls.

I would like to recognize the pattern, so what should I do now?

input {
    file {
        type => "logs"
        path => ["D:\Ambuj\type 0\118\May_2018\Client_4_May_2018_16_31.csv"]
        start_position => "beginning"
    }
}
filter {
    csv {
        columns => ["log_id","date","msg_string","ID","Version","data","List"]
        separator => ","
    }
    date {
        match => ["date", "MM-dd-yy hh:mm"]
        target => "date"
    }
    mutate {
        convert => {
            "log_id" => "integer"
            "ID" => "integer"
        }
    }
}
output {
    elasticsearch {
        hosts => "localhost"
        index => "error-5"
    }
    stdout { codec => rubydebug }
}

When the date filter fails it'll log a message that points to where the problem is.
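In this case, judging by the sample rows at the top, a likely culprit is the hour pattern: hh is the 12-hour clockhour (1-12), so a timestamp like 5-17-18 19:54 can't match it. Assuming those rows are representative, a pattern along these lines should parse:

date {
    # M and d accept one- or two-digit values; HH is the 24-hour clock (00-23).
    match => ["date", "M-d-yy HH:mm"]
    target => "date"
}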

For some logs the date filter works perfectly, and for some it doesn't. And for pattern recognition, which filter would work for me?

For some logs the date filter works perfectly, and for some it doesn't.

Sure, that's a possibility. Again, check the Logstash log for clues.

And for pattern recognition, which filter would work for me?

What pattern recognition do you need?

@magnusbaeck The Logstash log path:

~path\logstash\log

Pattern: I would like to find a time-to-fail pattern.

One more problem I am facing: once I parse the data with Logstash and create the index pattern, after some time the data size decreases, and after rebooting the system I have to parse the data again. What am I doing wrong? Should I wait until all the data is parsed?
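It's hard to say what's happening there without the logs, but one thing to be aware of when re-parsing the same file: the file input remembers how far it has read in a sincedb file, and start_position => "beginning" only applies to files Logstash hasn't seen before. To force a full re-read on every run you can point the sincedb at the null device (a sketch; NUL is the Windows null device, /dev/null elsewhere):

file {
    type => "logs"
    path => ["D:\Ambuj\type 0\118\May_2018\Client_4_May_2018_16_31.csv"]
    start_position => "beginning"
    # Discard the recorded read position so the file is re-read on every run.
    sincedb_path => "NUL"
}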

Pattern: I would like to find a time-to-fail pattern.

I have no idea what you mean by that.

These are logs from a machine; every log relates to some sensor, so I want to know when this machine will stop working and what kind of pattern it has.

So you want Logstash to analyze the values emitted by a sensor and correlate that with failures of some kind? Logstash doesn't have that kind of analysis capability built in, but it should be able to help you parse the files and extract the sensor values.
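For example, going by the sample at the top, the sixth column is a numeric reading for RA05 rows but a hex identifier for RB03/RV02 rows, so any type conversion should branch on the message code — a sketch reusing the field names from your configuration:

filter {
    # Only RA05 rows carry a numeric reading in "data"; RB03/RV02 rows
    # carry hex identifiers there, and some rows contain ERR.
    if [msg_string] == "RA05" and [data] != "ERR" {
        mutate {
            convert => { "data" => "integer" }
        }
    }
}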

No, I don't want Logstash to analyze the data. I want to know which filter is best for finding the pattern, whether csv or grok or something else, so I can do what I want (a time-to-fail pattern).

If I use the proper filter to parse the data into ES, that will give me the freedom to analyze the logs. Parsing the data into ES is the first step; I just want to do that right.

I want to know which filter is best for finding the pattern, whether csv or grok or something else, so I can do what I want (a time-to-fail pattern).

If you give an example of the input data you have and the desired result, it'll be much easier to help. I've looked at your CSV example at the top, but I have no idea how to extract any "time to fail" information from it.

@magnusbaeck Thanks, man, for helping me here. Really appreciated.

In my very first post I posted the input data. If you still require more, let me know and I'll share the data with you.

I'm losing my patience here. Yes, as I said, I know what the input data looks like, but what is Logstash supposed to do with it? If you can't describe with words and examples what you want to accomplish, it's impossible for us to help you translate that into a Logstash configuration.

@magnusbaeck

My device is supposed to send the RA05 string at a specific interval, but after some time, randomly (maybe 15 days or longer, maybe 2 hours), it will stop sending the RA05 string. So I want to analyze what kind of pattern there is; in the meanwhile, regular work is going on, and every log will be there.

It may be possible that there is some kind of pattern we can't see: some sensor is tripping or sending some data that causes the machine to stop working. I want to relate those instances.

It's even possible that sensor A sends some data, then sensor B sends some data, and then the machine stops working; I want to find out whether there is any relation.

A timeline may be there, or maybe only sensor data, or maybe something else.

Note: how do we find out whether the machine is working, or even idle? If RA05 shows up at its expected interval, the machine is working; otherwise we have to count that instance as a time-to-fail instance.

As I said, Logstash doesn't have that kind of analysis capability built in, but it should be able to help you parse the files and extract the sensor values. Over and out.
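At most, Logstash could mark the periodic RA05 messages at parse time so that whatever tool does the gap analysis can find them easily — a sketch, again reusing your field names:

filter {
    # Tag the expected heartbeat messages; downstream analysis can then
    # look for gaps between consecutive heartbeats per device.
    if [msg_string] == "RA05" {
        mutate { add_tag => ["heartbeat"] }
    }
}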
