I am facing issues with the filter section of my Logstash pipelines. I am using multiple filters in combination (grok, elasticsearch, translate, ruby, etc.). The issue is that data from some other record processed at around the same time overwrites the field values, and sometimes adds an exception to the tags; but when the records are processed individually, the filter section works fine, with no exceptions and no overwritten data.
Do multiple pipelines use the same JVM? Are ruby variables shared between two events?
Are there any tools recommended for debugging this issue?
Yes, pipelines in the same Logstash instance run in the same JVM. Whether variables in ruby filters are shared depends on the scope with which you declare them. Global variables ($name) are shared globally; you should never need that. Class variables (@@name) are shared across all ruby filters; it is not every year that I need a class variable, and I write a lot more ruby filters than most folks. Instance variables (@name) are shared by different events within a single ruby filter. Sometimes I use instance variables, but most of the time I just use regular local variables (name) that exist only for the duration of the execution of the code block for an event.
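As a rough sketch of those four scopes inside a ruby filter (none of these names come from the question; they are just for illustration):

ruby {
    code => "
        $g = 1      # global variable: shared by everything in the JVM -- avoid
        @@c = 1     # class variable: shared across all ruby filters
        @i = 1      # instance variable: shared by all events that pass
                    # through this particular ruby filter
        name = 1    # local variable: exists only while this event's code
                    # block runs, so it is safe under concurrency
        event.set('scope_demo', name)  # hypothetical field name
    "
}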
In my pipeline I have used simple names, for example:

ruby {
    code => "
        # the value of field2 comes from the grok filter
        field = event.get('field2')
        event.set('custom_name', field[0..10])
    "
}
But with multiple pipelines running, I can see that field2 has one value, yet the value assigned to the field 'custom_name' is different; it is getting overwritten with some other record's value.
I crawl the data from a CSV and copy one field into another using a ruby variable, but it looks like the data is getting overwritten from somewhere.
I followed these steps:

1. Created two pipelines with the same input and filter sections, changing only the index in the output section.
2. Ran both pipelines at the same time from different command prompts.
3. Checked the variable cpystat; in the code, cpystat is set to '1' if the copied field and the original field are the same, and '0' otherwise. Surprisingly, I found that around 170 records did not copy the field correctly.
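For reference, a sketch of the kind of check described (the field names ipcpy and column6 are taken from later in this thread; the exact filter was not posted):

ruby {
    code => "
        # '1' when the copy matches the original field, '0' otherwise
        event.set('cpystat', event.get('ipcpy') == event.get('column6') ? '1' : '0')
    "
}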
It looks like the issue here is the variable's case: if I change the variable from TestVar to testVar, then it causes no issues, even with multiple pipelines.
Still, I would like to understand what the issue is with a capitalized variable name in logstash ruby code.
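For concreteness, the two variants look roughly like this (a sketch; the field names are borrowed from later in the thread, since the full filter was not posted):

# the version that misbehaves when both pipelines run at once
ruby {
    code => "
        TestVar = event.get('column6')
        event.set('ipcpy', TestVar)
    "
}

# the version that works
ruby {
    code => "
        testVar = event.get('column6')
        event.set('ipcpy', testVar)
    "
}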
The pipeline configurations you show supply column names for six columns, so if you were running them then event.get('column6') would return nil. So it appears you are not doing what you say you are doing.
Ummm... I also tried that, but for some reason it is treating that column list as one field (maybe some syntax issue); anyway, that is okay for now.
And I am getting data in event.get('column6').
Please see the screenshot attached in the earlier post; you can see column6 has the value (an IP address) and the ipcpy field is a copy of that field.
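If the column list is being read as a single field, the usual cause is how the columns option is quoted; it takes an array of names, as in this sketch (placeholder names, not your actual config):

csv {
    separator => ","
    # one quoted name per column; a single quoted string containing commas
    # would be treated as one column name
    columns => ["column1", "column2", "column3", "column4", "column5", "column6"]
}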
Thank you, Badger.
Yeah, my focus was on copying the field with a ruby variable and multiple pipelines, so I ignored that. Sorry.
But I am still struggling to understand what was wrong with the capital letter. Can you suggest any documentation, please?
In Ruby, a "variable" that starts with an uppercase letter is a constant, not a variable. That said, Ruby constants are mutable, so I would not expect that by itself to matter.
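To see why a name shared between workers can misbehave under concurrency even though reassignment itself works, here is a standalone sketch in plain Ruby (not the poster's code, and not a logstash config). The global $shared stands in for any name visible to more than one worker thread, much as a constant assigned in a ruby filter's code block ends up visible beyond a single event, which matches the behaviour observed here; a local variable is created fresh for each execution, so it cannot race:

# two threads play the role of two pipeline workers processing events
counts = 2.times.map do |worker|
  Thread.new do
    overwritten = 0
    10_000.times do |i|
      value = "worker#{worker}-#{i}"       # this event's own field value
      $shared = value                      # both threads write the same name
      Thread.pass                          # yield, as a busy pipeline would
      overwritten += 1 if $shared != value # the other worker may have won
      local = value                        # per-execution local: never races
      overwritten += 1 if local != value   # this branch never fires
    end
    overwritten
  end
end.map(&:value)
puts "overwritten reads: #{counts.sum}"    # typically non-zero

With a local variable each event reads back exactly what it wrote, which is consistent with testVar working correctly across both pipelines while TestVar did not.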