Hi I'm trying to import csv and parse array data in a specific column.
My csv data is like this.
TXN_TIME|TENANT_ID|TAG_ID|TAG_VALUE
2019-06-14 10:00:00.693|T00|GTW000000007-001-1|[[1.9744,3.0427,1.9170],[1.9743,3.0061,1.9171],[1.9109,3.0061,1.8107],[1.9109,2.9267,1.9280],[1.9109,2.9267,1.8376],[1.9243,2.9353,1.8376],[1.9243,2.9670,1.8376],[1.9243,2.9670,1.8486],[1.9316,2.9670,1.8486],[1.9402,2.9670,1.8486],[1.9524,2.9670,1.8486],[1.9597,2.9133,1.8486],[1.9597,2.9853,1.8486],
....(so many arrays here!)......
[1.9756,2.9560,1.7949],[1.9719,2.9560,1.7949],[1.9719,2.9585,1.9731],[1.9609,2.9585,1.9731],[1.9243,2.9560,1.9621],[1.9365,2.9585,1.8864],[1.9365,2.9573,1.8584],[1.9670,2.9573,1.8584],[1.9670,2.9536,1.8767]]
Each array element means [x_axis, y_axis, z_axis], so I want to split those arrays by each axis.
This is my logstash conf file.
input {
file {
path => ["C:/Users/Desktop/3axis_MultiTagDataList-test.csv"]
start_position => "beginning"
sincedb_path => "nul"
}
}
filter {
csv {
separator => "|"
skip_header => true
columns => ["TXN_TIME","TENANT_ID","TAG_ID","TAG_VALUE"]
}
mutate{convert=>["TENANT_ID","string"]}
mutate{convert=>["TAG_ID","string"]}
mutate{strip => ["TAG_VALUE"]}
grok{
break_on_match => false
match=>{"TAG_VALUE"=>"\[(?<x_axis>\d*.\d*),(?<y_axis>\d*.\d*),(?<z_axis>\d*.\d*)\]"}
}
mutate {
remove_field => ["message"]
}
}
output {
elasticsearch {
hosts => "http://localhost:9200"
index => "my_multiaxis_test_06"
}
stdout {
codec => rubydebug
}
}
What I expect first was that, I could use grok match regex to directly extract x/y/z axis data from each array element, and I thought regex pattern continues to check every element array that matches [x_axis, y_axis, z_axis] in "TAG_VALUE" field by using 'break_on_match' set to false.
But, I saw only the first element ([1.9744,3.0427,1.9170]) is parsed into x_axis, y_axis, z_axis fields.
What's wrong with my script and how can I make grok match regex check all the arrays in array and finally collect the values in each axis field?
I want the result like this,
"@timestamp" => 2019-10-23T05:52:43.495Z
"x_axis" =>
[0] => 1.9744
[1] => 1.9109
[2] => 1.9109
...
[255] => 1.9670
"y_axis" =>
[0] => 3.0061
[1] => 3.0061
[2] => 2.9267
...
[255] =>2.9536
"z_axis" =>
[0] => 1.9171
[1] => 1.8107
[2] => 2.9267
...
[255] =>1.8767
x_axis, y_axis, z_axis fields should be double array type.
Help me, I kinda desperate to figure out.