"hash=event['field'].to_hash" instead "hash = event.to_hash"

Saket_Kumar · June 23, 2015, 9:42am

Is it possible...

I want to iterate over the contents of a specific field of the event, something like:
hash = event['field1'].to_hash
hash.each { |event['field1'], v| puts event['field1'] if v == hash.values.max }

Any help?

magnusbaeck · June 23, 2015, 10:01am

The named parameters passed to the iteration block must be identifiers and can't be expressions. What's wrong with

field1_hash = event['field1'].to_hash
field1_hash.each { |k, v| puts event['field1'] if v == field1_hash.values.max }

or even

field1_hash = event['field1'].to_hash
field1_hash.each_value { |v| puts event['field1'] if v == field1_hash.values.max }

since you don't appear to care about the hash key?
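
For reference, the block-parameter rule and the max lookup can be tried in plain Ruby outside Logstash. This is a minimal sketch, assuming a made-up hash in place of the real field value (which this thread does not show); `sample` and `keys_with_max` are hypothetical names.

```ruby
# Hypothetical data standing in for a field that really holds a hash.
sample = { 'a' => 3, 'b' => 7, 'c' => 7 }

# Block parameters must be plain identifiers (here _k and v), not expressions.
# The maximum is computed once instead of once per iteration.
def keys_with_max(hash)
  max = hash.values.max
  hash.select { |_k, v| v == max }.keys
end

puts keys_with_max(sample).inspect  # prints ["b", "c"]
```

Computing the maximum before the loop avoids rescanning all values on every iteration, which the one-liners above do.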

Saket_Kumar · June 23, 2015, 10:40am

I am getting the exception below when using it:
Exception in filterworker {"exception"=>#<NoMethodError: undefined method `to_hash' for 31:Fixnum>, "backtrace"=>["(ruby filter code):2:in `register'", "org/jruby/RubyProc.java:271:in `call'", "/opt/Log/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-filter-ruby-0.1.5/lib/logstash/filters/ruby.rb:37:in `filter'", "/opt/Log/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0-java/lib/logstash/filters/base.rb:162:in `multi_filter'", "org/jruby/RubyArray.java:1613:in `each'", "/opt/Log/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0-java/lib/logstash/filters/base.rb:159:in `multi_filter'", "(eval):302:in `filter_func'", "/opt/Log/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0-java/lib/logstash/pipeline.rb:219:in `filterworker'", "/opt/Log/logstash-1.5.0/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.0-java/lib/logstash/pipeline.rb:156:in `start_filters'"], :level=>:error}

magnusbaeck · June 23, 2015, 11:07am

The field1 field obviously isn't a hash or something that can be converted to a hash, it's a numerical value.

With more information about what your messages look like and what you want in the end it'll be easier to help.
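
The error above comes from calling `to_hash` on an integer (a Fixnum under JRuby 1.9). A hedged sketch of a guard, in plain Ruby with made-up values; `hash_or_nil` is a hypothetical helper and not part of Logstash:

```ruby
# Returns a Hash when the value can be converted to one, otherwise nil.
def hash_or_nil(value)
  value.respond_to?(:to_hash) ? value.to_hash : nil
end

p hash_or_nil({ 'x' => 1 })  # a Hash responds to to_hash
p hash_or_nil(31)            # an Integer does not, so this prints nil
```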

Saket_Kumar · June 23, 2015, 11:19am

Okay let me explain what I want step by step:

The XML file contains the following fields.
Array fields:
xpath => ["/response/data/run/firstView/videoFrames/frame[*]/time/text()","Time_FV"]
xpath => ["/response/data/run/firstView/videoFrames/frame[*]/image/text()","Image_FV"]
xpath => ["/response/data/run/firstView/videoFrames/frame[*]/VisuallyComplete/text()","Progress_FV"]
and non-array fields:
xpath => ["/response/data/median/firstView/visualComplete/text()","FV_visualComplete"]
xpath => ["/response/data/median/firstView/lastVisualChange/text()","FV_lastVisualChange"]
xpath => ["/response/data/median/firstView/loadTime/text()","FV_loadTime"]
xpath => ["/response/data/median/firstView/fullyLoaded/text()","FV_fullyLoaded"]
xpath => ["/response/data/median/firstView/SpeedIndex/text()","FV_SpeedIndex"]
I have multiple XML files with the above fields.
I parsed them and output them to ELK.
When I query, I find each file processed as a new message, and for the non-array fields that is fine.
If you remember, in my last post I asked about splitting the array field; it was for the same kind of processing. By the way, for a single file I was able to achieve what I wanted with these array fields.
But when you have multiple files the same config will not work. For your reference, here is the config for a single XML file:
input {
  file {
    path => "/opt/Log/WebPageTestfinal1/RUN*/*_XML_WebpageSummary.xml"
    start_position => "beginning"
  }
}

filter {
  # Drop the XML declaration line; the ? is escaped so that it matches literally
  if [message] =~ "^<\?xml .*" {
    drop {}
  }
  multiline {
    pattern => "^</response>"
    negate => true
    what => "next"
  }

  xml {
    source => "message"
    target => "videoFrames"
    store_xml => false
    xpath => [
      "/response/data/run/firstView/videoFrames/frame[*]/time/text()","Time_FV",
      "/response/data/run/firstView/videoFrames/frame[*]/image/text()","Image_FV",
      "/response/data/run/firstView/videoFrames/frame[*]/VisuallyComplete/text()","Progress_FV",

      "/response/data/run/repeatView/videoFrames/frame[*]/time/text()","Time_RV",
      "/response/data/run/repeatView/videoFrames/frame[*]/image/text()","Image_RV",
      "/response/data/run/repeatView/videoFrames/frame[*]/VisuallyComplete/text()","Progress_RV"
    ]
  }

  ruby {
    code => "
      ## Finding the longer array, so the event is split that many times
      x = [event['Time_FV'].length, event['Time_RV'].length]
      max = x.max
      if event['Time_FV'].length == max
        event['flag'] = 'FV'
      end
      if event['Time_RV'].length == max
        event['flag'] = 'RV'
      end
    "
  }
  if [flag] == "FV" {
    split { field => "Time_FV" }
  }
  if [flag] == "RV" {
    split { field => "Time_RV" }
  }


  ruby {
    code => "
      ## Running counter kept in the process environment (shared by all events)
      my_variable = ENV['mycount3']
      if my_variable.nil?
        ENV['mycount3'] = 0.to_s
        counter = 0
      else
        counter = ENV['mycount3'].to_i
        counter = counter + 1
        ENV['mycount3'] = counter.to_s
      end

      ## For the field that was not split, pick the array element at the counter position
      if event['flag'] != 'FV'
        tfv = event['Time_FV']
        event['TimeFV'] = tfv[counter]
      end
      if event['flag'] != 'RV'
        trv = event['Time_RV']
        event['TimeRV'] = trv[counter]
      end
      pfv = event['Progress_FV']
      prv = event['Progress_RV']

      event['ProgressFV'] = pfv[counter]
      event['ProgressRV'] = prv[counter]

      ## Extracting test id, run id and url from file name
      filename = File.basename(event['path'], '.*')
      value = filename.split('_')
      event['Test_Id'] = value[0]
      event['Run_Id'] = value[1] + '_' + value[2] + '_' + value[3]
      event['URL1'] = value[4]
    "
  }

  if [flag] == "FV" {
    mutate { rename => { "Time_FV" => "TimeFV" } }
  }
  if [flag] == "RV" {
    mutate { rename => { "Time_RV" => "TimeRV" } }
  }
  mutate { convert => ["ProgressFV", "integer"] }
  mutate { convert => ["TimeFV", "float"] }
  mutate { convert => ["ProgressRV", "integer"] }
  mutate { convert => ["TimeRV", "float"] }

  #mutate { remove_field => ["Time_FV"] }
  #mutate { remove_field => ["Time_RV"] }
  #mutate { remove_field => ["Progress_FV"] }
  #mutate { remove_field => ["Progress_RV"] }
}

output {
  elasticsearch {
    action => "index"
    host => "172.27.155.109"
    index => "logstash-xml1%{+YYYY.MM.dd}"
    workers => 1
  }
  stdout { codec => json }
}
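
The file-name parsing done in the second ruby filter can be tried on its own in plain Ruby. This is a sketch under assumptions: the real file names are not shown in this thread, so the path below is invented, and `parse_summary_name` is a hypothetical helper. It also returns nil when the name has fewer than five underscore-separated parts, a check the config above does not make.

```ruby
# Hypothetical path; only the directory layout and the suffix come from the thread.
path = '/opt/Log/WebPageTestfinal1/RUN1432621877157/T1_150526_AB_1_example.com_XML_WebpageSummary.xml'

# Splits the base name (without extension) on underscores and maps the
# parts to the same fields that the ruby filter sets on the event.
def parse_summary_name(path)
  filename = File.basename(path, '.*')
  value = filename.split('_')
  return nil if value.length < 5
  {
    'Test_Id' => value[0],
    'Run_Id'  => value[1..3].join('_'),
    'URL1'    => value[4]
  }
end

p parse_summary_name(path)
```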


magnusbaeck · June 23, 2015, 12:59pm

If you remember, in my last post I asked about splitting the array field; it was for the same kind of processing. By the way, for a single file I was able to achieve what I wanted with these array fields.
But when you have multiple files the same config will not work.

So this is what you're really asking about? If so, please explain what "will not work" means. If not, please explain what your question is.

Saket_Kumar · June 23, 2015, 1:19pm

Please excuse me for not being clear.

To explain, consider that I have two XML files with these fields:
File1:

<Median> <visualComplete....
<Average> <visualComplete....
<Run> <ID
<FV.. <time>0</time> <time>200</time>
<RV <time>300</time>

File2 is a replica:

<Median> <visualComplete....
<Average> <visualComplete....
<Run> <ID
<FV.. <time>0</time> <time>200</time>
<RV <time>300</time>

- When parsed, I get two new messages.
- For the "visualComplete" field it is fine, and I am able to store the values the way I want them presented on the Kibana graph.
- The "Time" field creates an array and stores all values from both parsed files in the single fields "Time_FV" and "Time_RV". On the graph I see no distinction between the values because the field is an array.
- To see the distinction, I need to split these array values into different messages (each split message containing the "Time_FV" and "Time_RV" fields with a single item from the array).

This is what I was trying to achieve: basically data modelling, storing the data in a way that lets me draw the desired graph. With the config above, using Ruby code, I succeeded, but the complexity increased when there were multiple files to process. I hope this is clear now.
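
The splitting described here, one message per array position, can be sketched in plain Ruby without a shared counter. This is only an illustration under assumptions: plain hashes stand in for Logstash events, the sample values are invented, and `expand_frames` is a hypothetical helper rather than anything from the config in this thread. Values are paired by index within a single event, so nothing carries over from one file to the next.

```ruby
# Turns one event holding parallel arrays into one row per array position.
def expand_frames(event)
  # Source array field => name of the scalar field in each output row.
  fields = {
    'Time_FV'     => 'TimeFV',
    'Time_RV'     => 'TimeRV',
    'Progress_FV' => 'ProgressFV',
    'Progress_RV' => 'ProgressRV'
  }
  # Array(nil) is [], so a missing field counts as an empty array.
  count = fields.keys.map { |name| Array(event[name]).length }.max
  (0...count).map do |i|
    row = { 'path' => event['path'] }
    fields.each { |source, target| row[target] = Array(event[source])[i] }
    row
  end
end

# Invented sample: two first-view frames, one repeat-view frame.
event = {
  'path'        => 'file1.xml',
  'Time_FV'     => ['0', '200'],
  'Time_RV'     => ['300'],
  'Progress_FV' => ['0', '100'],
  'Progress_RV' => ['100']
}
expand_frames(event).each { |row| p row }
```

Positions beyond the end of a shorter array come out as nil, so the rows stay aligned by frame index.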

Saket_Kumar · June 24, 2015, 5:14am

Can Logstash meet my need?

magnusbaeck · June 24, 2015, 5:32am

I don't see why the number of input files would in any way matter here. What isn't working? Why is it more complex to support more files?

Saket_Kumar · June 24, 2015, 6:09am

When the file path changes, it overwrites the values...

magnusbaeck · June 24, 2015, 6:39am

Sorry, I don't understand what you mean. What values are overwritten?

Saket_Kumar · June 24, 2015, 9:10am

Is it possible to process each file sequentially?

I mean: file1 as input -> parse -> filter processing -> output to Elasticsearch; then file2 as input -> parse -> filter processing -> output to Elasticsearch

rather than:
directory path as file input -> parse both files -> filter processing of both files -> output to Elasticsearch.

Can we control this? Currently my config parses all files present in the directory:
input {
  file {
    path => "/opt/Log/WebPageTestfinal7/RUN*/*_XML_WebpageSummary.xml"
    start_position => "beginning"
  }
}

It then runs the processing on all messages in one go and outputs to Elasticsearch.

Is it possible to parse -> process -> output files one by one? If yes, how? If I parse the files one by one, my existing config will work.

magnusbaeck · June 24, 2015, 2:50pm

You could run separate Logstash instances to completely separate the processing.

But again, what you're describing doesn't sound normal. With more information about what values are being overwritten we can help you.

Saket_Kumar · June 25, 2015, 6:01am

Am I making the problem so difficult to understand? My bad.

Use case:
I have WebPageTest Results for multiple URL Performance Tests and for Multiple Runs.

I get them in this folder structure:

WebPageTest/RUN1432621877157
WebPageTest/RUN1432621608713

XML Data:

[attached file not preserved]

Note: Please change file extension from png to xml.

I want to push them into Elasticsearch under the same index, in such a way that I get the charts below in Kibana.

1. Line chart: Visual Progress (FV & RV)
2. Bar chart: Timings (FV & RV) (Median)
   - Visually Complete
   - Last Visual Change
   - Load Time (onload)
   - Load Time (Fully Loaded)
   - Speed Index
   - Time to First Byte
   - Time to Title
3. Score board / bar chart (FV & RV) (Median)
   <score_cache>71</score_cache>
   <score_cdn>28</score_cdn>
   <score_gzip>100</score_gzip>
   <score_cookies>-1</score_cookies>
   <score_keep-alive>100</score_keep-alive>
   <score_minify>-1</score_minify>
   <score_combine>100</score_combine>
   <score_compress>-1</score_compress>
   <score_etags>-1</score_etags>
4. Bar chart on CPU consumption (FV & RV) (Median)
   <docCPUms>3057.62</docCPUms>
   <fullyLoadedCPUms>4976.432</fullyLoadedCPUms>
   <docCPUpct>85</docCPUpct>
   <fullyLoadedCPUpct>70</fullyLoade

For charts 2 to 4, Logstash worked perfectly to parse my data and push it to ELK, using the config below.

[attached file not preserved]

Note: Please change file extension from png to conf.

In the attached XML, the Time data for FirstView and RepeatView come from array elements, so when parsing, all values from the respective fields are stored as a list in ELK.
Stored as a list, they do not give the desired chart, which should show how Progress varies against time for both FirstView and RepeatView.

For data modelling, the workaround I can see is to parse the RUN files one by one, rather than parsing all run files, processing them, and pushing them to ELK at once.

Are there any suggestions or best practices for handling and processing multiple XML files?

Thanks Saket.

Saket_Kumar · June 25, 2015, 6:02am

Is there any way to share files with you for reference and a better understanding of the problem?

magnusbaeck · June 25, 2015, 6:06am

You're still not explaining what you mean when you say that values are being overwritten. Sorry, I'm out of patience here. Maybe someone else can help.
