Stripping text from a JSON file

Hi all!

I have a valid JSON structure that is prefixed with a fixed string and then a date and time. Something like:

RRR This is the leading string created at 01-01-2017 at 12:34:56:
{ 
     valid JSON 
}

So there is a fixed string "RRR This is the ...." and a variable part, the date and time. Ah, and BTW, the string is always the same length.
How do I use Filebeat or Logstash to strip off that leading string so I only get valid JSON in ES?

This is what I tried so far:

input {
    beats {
        port => "5043"
    }
}

filter {
    mutate {
        gsub => ["message", "^(.*){", ""]
    }

    json {
        source => "data"
    }

    date {
        match => [ "receivedTime", "UNIX" ]
        target => "@timestamp"
    }
}

output {
    elasticsearch {
        hosts => [ "localhost:9200" ]
        index => "filebeat"
    }
}

And I keep getting the following messages:

2017/03/21 18:26:01.644365 json.go:34: ERR Error decoding JSON: invalid character 'R' looking for beginning of value
2017/03/21 18:26:01.644428 json.go:34: ERR Error decoding JSON: json: cannot unmarshal number into Go value of type map[string]interface {}

Thanks,

Ton.

Do not configure Filebeat to treat the input as JSON, because it isn't JSON. Do make sure to configure the multiline feature so the lines are joined into a single logical message.

I don't see how the source option for your json filter could make sense. Where does the data field come from?
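For reference, a minimal sketch of a json filter that reads the raw log line. It assumes the event arrives from Beats with its text in the message field (the Beats default); the leading string would still have to be stripped before this filter runs.

```
filter {
    json {
        # "source" names the event field that holds the raw JSON text.
        # A key inside the JSON document (such as "data") only becomes
        # an event field after this filter has parsed the text.
        source => "message"
    }
}
```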

OK, I removed the JSON settings from Filebeat. "data" is the top-level field in my JSON document. If all my log entries are on the same line, do I need the multiline?

If all my log entries are on the same line, do I need the multiline?

No.

Remember, I'm still a newbie. After fighting a lot with regex, I just solved it like this:

filter {
    grok {
        patterns_dir => ["./patterns"]
        match => { "message" => "%{DATA:todelete}: %{GREEDYDATA:message}" }
        remove_field => "todelete"
        overwrite => ["message"]
    }
    json {
        source => "message"
    }
    date {
        match => [ "receivedTime", "UNIX_MS" ]
    }
}

This works the way I want it to. Now I face the "Objects in arrays are not well supported" issue, but I will open another topic for that.

Replace %{DATA:todelete} with %{DATA} and you won't need remove_field. You can also use a mutate filter's gsub option to trim the message field in-place.
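Both suggestions could look roughly like the sketch below. It assumes, as in the sample line, that the prefix ends at the first ": " in the event and that each event arrives on a single line; neither is confirmed beyond the example shown in this thread.

```
filter {
    grok {
        # An unnamed %{DATA} matches the prefix without storing it,
        # so there is no temporary field to remove afterwards.
        match => { "message" => "%{DATA}: %{GREEDYDATA:message}" }
        overwrite => ["message"]
    }

    # Alternative to the grok block above: delete everything that
    # precedes the first "{" and keep the brace itself.
    # mutate {
    #     gsub => ["message", "^[^{]*", ""]
    # }

    json {
        source => "message"
    }
}
```

Note that the pattern from the first attempt, "^(.*){", is greedy and also consumes the brace it matches: it deletes everything up to and including the last "{" on the line, so what remains is no longer valid JSON.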
