Logstash CSV Filter: Define Columns from Event Data/Fields

This is related to: CSV Filter Column

Use case:

The S3 input plugin can read AWS CloudFront logs from an S3 bucket. The resulting events include the message (a tab-delimited log line) and a field called "cloudfront_fields" that names each field in the log.

It is straightforward to mutate "cloudfront_fields" into an array of column headers (the split itself is shown after the event below):

{
  "type" => "cloudfront",  
  "cloudfront_version" => "1.0",
  "@version" => "1",
  "@timestamp" => 2019-01-04T22:08:40.319Z,
  "message" => "tab\tdelimited\tlog\entry\t...",
  "cloudfront_fields" => [
     [ 0] "date",
     [ 1] "time",
     [ 2] "x-edge-location",
     [ 3] "sc-bytes",
     [ 4] "c-ip",
     [ 5] "cs-method",
     [ 6] "cs(Host)",
     [ 7] "cs-uri-stem",
     [ 8] "sc-status",
     [ 9] "cs(Referer)",
     [10] "cs(User-Agent)",
     [11] "cs-uri-query",
     [12] "cs(Cookie)",
     [13] "x-edge-result-type",
     [14] "x-edge-request-id",
     [15] "x-host-header",
     [16] "cs-protocol",
     [17] "cs-bytes",
     [18] "time-taken",
     [19] "x-forwarded-for",
     [20] "ssl-protocol",
     [21] "ssl-cipher",
     [22] "x-edge-response-result-type",
     [23] "cs-protocol-version",
     [24] "fle-status",
     [25] "fle-encrypted-fields"
  ]
}
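
For reference, the array above comes from a simple mutate split; "cloudfront_fields" arrives as a single space-delimited string. This is the same split that appears in the full config below:

filter {
  mutate {
    # turn the space-delimited field list into an array of column names
    split => { "cloudfront_fields" => " " }
  }
}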

I cannot figure out how to use the [cloudfront_fields] array as the "columns" input for the CSV filter:

input {
  s3 {
    "type" => "cloudfront"
    "bucket" => "${S3_BUCKET}"
    "prefix" => "${S3_BUCKET_PREFIX}"
    "region" => "${S3_BUCKET_REGION}"
    "additional_settings" => {
          "force_path_style" => true
          "follow_redirects" => false
     }
  }
}

filter {
  mutate {
    split => { "cloudfront_fields" => " " }
  }
  csv {
    separator => "\t"
    columns => [cloudfront_fields] # <-- everything I try here doesn't work
    target => "csv"
  }
}

output {
  stdout { codec => "rubydebug" }
}

I've read through Accessing Event Data and Fields in the Configuration, and it does not appear to address this use case (or perhaps this use case deviates from those instructions).

Yes, I could hardcode a grok pattern, but it seems more robust to map the fields from the event data the input already provides. That would also handle the addition or removal of fields more gracefully, should that ever occur.

I am also open to suggestions for other filter plugin combinations that would accomplish the same thing while still being more succinct than writing a Ruby script.

Thanks.

The CSV plugin assumes the header column is a single string. If you can convert the array into a single string, then it should work.

Are you referring to the CSV filter plugin? (not the input plugin)

The filter plugin requires an array: https://www.elastic.co/guide/en/logstash/current/plugins-filters-csv.html#plugins-filters-csv-columns

Am I misunderstanding the doc?

(I ended up just using Ruby for this, but still interested in an answer)
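
For anyone who lands here later, a minimal sketch of the ruby filter approach (assuming [cloudfront_fields] has already been split into an array as shown above; the [csv] target name just mirrors the config in the question, and this is not necessarily the exact script used):

filter {
  ruby {
    # Sketch only: zip the column names in [cloudfront_fields] with the
    # tab-delimited values in [message], writing each pair under [csv].
    code => '
      names  = event.get("cloudfront_fields")
      values = event.get("message").to_s.split("\t")
      if names.is_a?(Array)
        names.zip(values).each do |name, value|
          event.set("[csv][#{name}]", value) unless value.nil?
        end
      end
    '
  }
}

With the event shown above, this would produce fields like [csv][date], [csv][sc-status], and so on.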
