Problem with parsing the CSV file correctly

Hello,

I have a problem with parsing the CSV file in Logstash.

My index template looks like this:

{
	"mappings": {
		"doc": {
			"properties": {

				"ip_src": { "type": "ip" },
				"ip_dst": { "type": "ip" },
				"hops_no": { "type": "byte" },

				"v_route": {
					"type": "nested",
					"properties": {
						"vip":  { "type": "ip" },
						"vdelay": { "type": "float"  }
					}
				},

				"m_time":   { "type": "date", "format": "yyyy-MM-dd HH:mm:ss||epoch_second"  }
			}
		}
	},

	"version": 1
}

An example of one line from an input file is:

89.73.142.84;89.73.142.84;"2002-12-12 12:12:12";3;89.73.142.84@0.1#89.73.142.85@0.3#89.73.142.86@0.5

The Logstash configuration file is:

input {
	file {
		path => "/home/user/data/data_pre/*_0"
		start_position => "beginning"
		sincedb_path => "/dev/null"
		file_completed_action => "delete"
		file_completed_log_path => "/home/user/data/data_post/archive"
	}
}

filter {
	csv {
		separator => ";"
		columns => ["ip_src", "ip_dst", "m_time", "hops_no", "vhops"]
	}

	ruby {
		code => '
			pb = event.get("vhops");
			b = pb.split("#");
			ary = Array.new;
			for c in b;
				keyvar = c.split("@")[0];
				valuevar = c.split("@")[1];
				d = "{vip : " << keyvar << ", vdelay : " << valuevar << "}";
				ary.push(d);
			end;

		event.set("v_route", ary);
		'
	}

	mutate { remove_field => [ "message", "vhops" ] }
}


output {
	elasticsearch {
		hosts => ["http://192.168.X.Y:9200"]
		index => "index00"
	}

	stdout {}
}

There is a problem with the v_route field.

The Logstash log shows:

...
[2019-10-12T14:22:10,380][INFO ][logstash.pipeline        ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x1f45a99e run>"}
[2019-10-12T14:22:10,429][INFO ][filewatch.observingtail  ] START, creating Discoverer, Watch with file and sincedb collections
[2019-10-12T14:22:10,429][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2019-10-12T14:22:10,699][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2019-10-12T14:22:11,319][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"index00", :_type=>"doc", :routing=>nil}, #<LogStash::Event:0x606b0f93>], :response=>{"index"=>{"_index"=>"routes_ipv4", "_type"=>"doc", "_id"=>"i3pZwG0BJFV1L_wCJ0wp", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"object mapping for [v_route] tried to parse field [null] as object, but found a concrete value"}}}}

Am I building the v_route field incorrectly?
Where is the problem?

My Logstash and Elasticsearch are both version 6.8.

v_route is an array of strings, which looks wrong to me.

   "v_route" => [
    [0] "{vip : 89.73.142.84, vdelay : 0.1}",
    [1] "{vip : 89.73.142.85, vdelay : 0.3}",
    [2] "{vip : 89.73.142.86, vdelay : 0.5}"
],
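
To illustrate why that matters, here is a minimal sketch in plain Ruby (run outside Logstash; the variable names are mine). It compares one element built by string concatenation, as the filter does now, with the same data as a Hash. The string serializes to a quoted JSON string, which is a concrete value rather than an object, and that is what the nested mapping rejects.

```ruby
require "json"

# One element built the way the current filter does it: plain string concatenation.
as_string = "{vip : " + "89.73.142.84" + ", vdelay : " + "0.1" + "}"

# The same data as a Hash, which serializes to a JSON object.
as_hash = { "vip" => "89.73.142.84", "vdelay" => "0.1" }

puts as_string.class    # String
puts as_string.to_json  # a quoted JSON string, not an object
puts as_hash.to_json    # a JSON object with the keys vip and vdelay
```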

What did you intend it to be?

Oh you're right.
I wanted to store this field as a nested datatype: an array of objects, each with a vip value (ip type) and a vdelay value (float type).
Could you tell me how to modify the code?

My field in the template:

"v_route": {
	"type": "nested",
	"properties": {
		"vip":  { "type": "ip" },
		"vdelay": { "type": "float"  }
	}
},

If you change the loop to be

        for c in b;
            keyvar = c.split("@")[0];
            valuevar = c.split("@")[1];
            h = Hash.new
            h["vip"] = keyvar
            h["vdelay"]= valuevar
            ary << h
        end;

you will get an array of hashes:

   "v_route" => [
    [0] {
           "vip" => "89.73.142.84",
        "vdelay" => "0.1"
    },
    [1] {
           "vip" => "89.73.142.85",
        "vdelay" => "0.3"
    },
    [2] {
           "vip" => "89.73.142.86",
        "vdelay" => "0.5"
    }
],
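
For reference, the same idea as a standalone sketch in plain Ruby, so it can be tried outside Logstash. The method name parse_vhops is hypothetical, and converting vdelay with to_f is my own assumption rather than something the loop above does; it just makes the value numeric before it reaches Elasticsearch. The empty-input guard is also an addition.

```ruby
# Hypothetical helper (not part of Logstash): turns "ip@delay#ip@delay"
# into an array of hashes shaped like the nested v_route mapping.
def parse_vhops(vhops)
  # Assumption: a missing or empty field should yield an empty route.
  return [] if vhops.nil? || vhops.empty?

  vhops.split("#").map do |hop|
    vip, vdelay = hop.split("@", 2)
    # to_f is an assumption; the loop above keeps vdelay as a string.
    { "vip" => vip, "vdelay" => vdelay.to_f }
  end
end

route = parse_vhops("89.73.142.84@0.1#89.73.142.85@0.3#89.73.142.86@0.5")
route.each { |hop| puts hop.inspect }
```

Inside the ruby filter, the body of the map block does the same job as the for loop, and the result would be passed to event.set("v_route", ...) as before.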
