Changing columns based on first character


I'm trying to parse a CSV file which doesn't have column names. Each row in the file is also different (different variables, as well as varying row lengths), so each needs its own set of column names.
What I want to do is alternate between column-name sets based on the first character of the row (it'll be a one-digit number which indicates which column set is needed). My method so far is to create an initial columns field to retrieve the first character, check its value, and then finalise the columns based on that value.

Right now my filter looks like this:

filter {
    csv {
        separator => ","
        columns => ["c1", "c2", "c3", "c4", "c5"]
        source => "message"
        convert => { "c1" => "integer" }
    }
    if [c1] == 1 {
        csv { columns => ["d1", "d2", "d3", "d4"] }
    } else if [c1] == 2 {
        csv { columns => ["e1", "e2", "e3"] }
    } else if [c1] == 3 {
        csv { columns => ["f1", "f2", "f3", "f4"] }
    }
}

This method isn't really working and seems to merge the columns in places. For instance, when [c1] == 2 the column names come out as something like e1, e2, e3, c3 (essentially, names from the initial column set are carried over into the new one).

Any ideas on how to fix this? Thanks.

(Nachiket) #2


You can use grok to parse the line, and depending on a conditional you can then use the csv filter as you had planned.



Hi, could you give me an example of how I would do this? I'm not sure how I would be able to set up a conditional when I don't even have any fields defined yet.


(Nachiket) #4

Here is an example from an old config; I can elaborate more later once I log in using a PC.

Here is an excerpt of a conf I had written once:

filter {
    if "getproxy" in [tags] {
        grok {
                patterns_dir => ["/etc/logstash/patterns"]
                match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{IP:srcip} %{PROXYBASE:base} %{OUBASE:oubase} %{GREEDYDATA:msg}" }
                add_tag => [ "User_Present" ]
                remove_tag => [ "getproxy" ]
                tag_on_failure => [ "unparsed" ]
        }
        grok {
                patterns_dir => ["/etc/logstash/patterns"]
                match => { "oubase" => "%{OU:ou}com\/%{GREEDYDATA:suser}" }
                tag_on_failure => [ "unparsed" ]
                add_field => { "user.source" => "%{suser}" }
                remove_field => [ "suser", "oubase" ]
        }
        if "unparsed" in [tags] {
                grok {
                        patterns_dir => ["/etc/logstash/patterns"]
                        match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{IP:srcip} %{PROXYBASE:base} %{GREEDYDATA:msg}" }
                        add_tag => [ "User_Absent" ]
                        remove_tag => [ "unparsed", "getproxy" ]
                        tag_on_failure => [ "unparsed" ]
                }
        }
        if "unparsed" not in [tags] {
                kv {
                        source => "msg"
                        value_split => "="
                        target => "msg"
                }
                kv {
                        source => "base"
                        value_split => "="
                        target => "base"
                }
                mutate {
                        add_field => { "ip.device" => "%{srcip}" }
                        remove_field => [ "srcip", "product", "product_version", "user" ]
                }
        }
        date {
                match => [ "timestamp", "MMM dd HH:mm:ss" ]
        }
    }
}

(Nachiket) #5

In the above example, instead of using tags as the conditional, the fields captured by grok can be used.
So basically, your grok will capture the line in two parts: the first part captures the first digit of your line, and the second part can be a GREEDYDATA pattern for the rest.
Using those captures, you can write a better conditional.
Paste a sample input in case you still can't get it working, and I'll help you out once I get some time.
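Something along these lines (the `rowtype` and `rest` field names are just placeholders, and the column sets are taken from your original post):

```
filter {
    # Capture the leading one-digit indicator and the rest of the line.
    # The ":int" suffix converts the capture to an integer.
    grok {
        match => { "message" => "^%{INT:rowtype:int},%{GREEDYDATA:rest}" }
    }
    # Parse only the remainder, with the column set chosen by the indicator.
    if [rowtype] == 1 {
        csv { source => "rest" columns => ["d1", "d2", "d3", "d4"] }
    } else if [rowtype] == 2 {
        csv { source => "rest" columns => ["e1", "e2", "e3"] }
    } else if [rowtype] == 3 {
        csv { source => "rest" columns => ["f1", "f2", "f3", "f4"] }
    }
}
```

Because each event only ever passes through one csv filter, you won't get leftover column names from a first pass mixed into the result.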

(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.