Inputting JSON object to logstash - unable to remove certain fields or parse as JSON

For some context, I am completely new to logstash and am interested in filtering this data in four ways:

  1. remove certain fields such as "anonymousId" or "library" (which is nested within "context")
  2. extract fields from a nest (e.g. move "plan" outside of its nest of "properties")
  3. rename fields
  4. attempt to make the object as 'flat' (un-nested) as possible.

Below is the raw JSON that is being forwarded to logstash via an HTTP POST, followed by what I have tried so far.

{
    "anonymousId":null,
    "channel":"server",
    "context":{
        "ip":"208.54.83.183",
        "library":{
            "name":"analytics-ruby",
            "version":"2.0.12"},
        "userAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/43.0.2357.130 Chrome/43.0.2357.130 Safari/537.36"},
    "event":"Logged In",
    "integrations":{},
    "messageId":"a4c76305-1b02-46c9-8399-76a58fc8edb9",
    "originalTimestamp":"2015-07-27T11:19:27.797+02:00",
    "projectId":"Y0xNBc7l2I",
    "properties":{"plan":"Admin"},
    "receivedAt":"2015-07-27T09:19:29.374Z",
    "sentAt":"2015-07-27T09:19:27.809Z",
    "timestamp":"2015-07-27T09:19:29.362Z",
    "type":"track",
    "userId":"1",
    "version":2,
    "writeKey":"keqIuqD3O8iL1M5"
} 
  • Using grok's remove_field was partially successful, but I was unable to remove "anonymousId" (my instinct tells me this is because it has a null value).
  • I was able to remove "channel" and "integrations", but not "context" or "library" (I attempted to use object dot notation).
  • As odd as it may sound, all of the above was done without the "json" codec.
  • Now I am trying to achieve my goals with the "json" codec and the json filter plugin, but to no avail: I am currently unable to remove, add, or alter any fields. Below is my config file:
input {
  http {
    port => 8090
    codec => "json"
  }
}
filter {
  grok {
    match => [ "message", "%{GREEDYDATA}" ]
  }
  json {
    source => "message"
    remove_field => "channel"
  }
}
output {
  stdout { codec => rubydebug }
}

If anyone could point out my oversight or redirect my efforts, it would be greatly appreciated.
Edit: I apologize for the raw JSON not being very readable; I'm working on getting the post to display the JSON with proper formatting.

  • Use the json codec or the json filter, not both.
  • Your current grok filter serves no purpose.
  • To reference nested fields, use the [field][subfield] notation. See the documentation.

Thank you for the quick response. Unfortunately, I am still getting a parse error with this config:

input {
  http {
    port => 8090
    codec => "json"
  }
}
filter {
  grok {
    remove_field => ["channel", "userId", "anonymousId"]
  }
}
output {
  stdout { codec => rubydebug }
}

Perhaps I am not understanding logstash's default behaviors. Here is the output:

{
          "anonymousId" => nil,
              "channel" => "server",
              "context" => {
               "ip" => "208.54.83.183",
          "library" => {
               "name" => "analytics-ruby",
            "version" => "2.0.12"
        },
        "userAgent" => "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/43.0.2357.130 Chrome/43.0.2357.130 Safari/537.36"
    },
                "event" => "Logged In",
         "integrations" => {},
            "messageId" => "a4c76305-1b02-46c9-8399-76a58fc8edb9",
    "originalTimestamp" => "2015-07-27T11:19:27.797+02:00",
            "projectId" => "Y0xNBc7l2I",
           "properties" => {
        "plan" => "Admin"
    },
           "receivedAt" => "2015-07-27T09:19:29.374Z",
               "sentAt" => "2015-07-27T09:19:27.809Z",
            "timestamp" => "2015-07-27T09:19:29.362Z",
                 "type" => "track",
               "userId" => "1",
              "version" => 2,
             "writeKey" => "keqI2D3O8iLGdbt",
             "@version" => "1",
           "@timestamp" => "2015-07-28T09:27:24.909Z",
              "headers" => {
                "content_type" => "application/json",
              "request_method" => "POST",
                "request_path" => "/hooks/created_callback",
                 "request_uri" => "/hooks/created_callback",
                "http_version" => "HTTP/1.1",
                    "http_csp" => "active",
          "http_cache_control" => "no-cache",
                 "http_origin" => "chrome-extension://fhbjgbiflinjbdggehcddcbncdddomop",
             "http_user_agent" => "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/43.0.2357.130 Chrome/43.0.2357.130 Safari/537.36",
          "http_postman_token" => "3d7015ed-4abd-d13e-f133-cf7ac4c73096",
                 "http_accept" => "*/*",
        "http_accept_encoding" => "gzip, deflate",
        "http_accept_language" => "en-US,en;q=0.8",
             "http_connection" => "close",
                   "http_host" => "localhost:8090",
              "content_length" => "674"
    },
                 "tags" => [
        [0] "_grokparsefailure"
    ]
}

Thanks again

This looks perfectly fine. What parse error are you talking about?

Use the mutate filter to remove the fields, not the grok filter. When remove_field is set on a non-mutate filter, it is applied only if that filter completes successfully, and a grok filter with no match expression does not count as successful (it does nothing). That is why the event is tagged _grokparsefailure and the fields are left in place.
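
As a rough sketch of what that could look like for the event shown above (untested against your setup, so treat it as a starting point): the json codec on the input does the parsing, one mutate block renames fields, and a second removes the ones you don't want. The target name "user_id" is only an illustrative choice, and the use of two separate mutate blocks is my assumption to keep the order of operations obvious.

```
input {
  http {
    port => 8090
    codec => "json"   # parsing happens here; no json filter or grok needed
  }
}
filter {
  mutate {
    # Move the nested "plan" to the top level and rename a field.
    # "user_id" is a hypothetical new name, pick whatever suits you.
    rename => {
      "[properties][plan]" => "plan"
      "userId"             => "user_id"
    }
  }
  mutate {
    # Nested fields use the [field][subfield] notation.
    # "properties" should be empty after the rename above, so drop it too.
    remove_field => [ "anonymousId", "channel", "[context][library]", "properties" ]
  }
}
output {
  stdout { codec => rubydebug }
}
```

Flattening the rest of the object would follow the same pattern: one rename entry per nested field you want to lift to the top level.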

Thank you for this clarification! I am well on my way now.