Combine Log Lines That Share A Unique ID


(Cheeseandpepper) #1

I've appended a unique id to the very end of every log line that tags each message as part of a single HTTP request. An HTTP request generally produces 5 log lines in my application, and with multiple users the lines are not necessarily in order. Here's a contrived example:

1 INFO 123.123.123.123 GET /search? params='foo' unique_id=12345ABCDE
2 INFO 111.111.111.111 POST /submit? params='bar' unique_id=ZZZZ8888
3 INFO 123.123.123.123 Completed 200 34ms unique_id=12345ABCDE

My goal is to combine lines 1 and 3 into the same Elasticsearch document by matching on the unique_id. Is there a good solution for this with Logstash? Thanks!


(Mark Walkom) #2

Looks like https://www.elastic.co/guide/en/logstash/current/plugins-filters-aggregate.html will do it.
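As a rough sketch of how that could look for the example lines above (field names like unique_id and the regex conditions are my assumptions, and the Ruby event syntax matches the Logstash 2.x era used in this thread) — note that aggregated state goes into the map variable, not the event:

filter {
  # pull the correlation id out of every line
  grok {
    match => { "message" => "unique_id=%{WORD:unique_id}" }
  }

  # first line of the request: start an aggregation map
  if [message] =~ /GET|POST/ {
    aggregate {
      task_id => "%{unique_id}"
      code => "map['request'] = event['message']"
      map_action => "create"
    }
  }

  # last line: copy the accumulated fields onto this event and close the task
  if [message] =~ /Completed/ {
    aggregate {
      task_id => "%{unique_id}"
      code => "event['request'] = map['request']"
      map_action => "update"
      end_of_task => true
      timeout => 120
    }
  }
}

With this pattern the final "Completed" event carries the fields collected from the earlier lines, so a single Elasticsearch document ends up with data from the whole request.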


(Cheeseandpepper) #3

I'm starting to use the logstash-filter-aggregate plugin and I'm having mixed results. Here's my config:

input {
  file {
    path => [
      "/Users/mbp/Desktop/dev.log"
    ]
    start_position => "beginning"
  }
}

filter {

  grok {
    match => { "message" => "(?<trace_id>(?![trace_id=])[\S][^\s]*$)"}
  }

  grok {
    match => { "message" => "(?<response_time>OK in (.*)ms)"}
  }

  grok {
    match => { "message" => "(?<status_code>Completed (\d+))"}
  }

  if [message] =~ /search.json/ {
    aggregate {
      task_id => "%{trace_id}"
      code => "event['search_request'] = event['message']; event['query_type'] = 'search'"
      map_action => "create"
    }
  }

  if [message] =~ /Headers: {/ {
    aggregate {
      task_id => "%{trace_id}"
      code => "event['headers'] = event['message']"
      map_action => "update"
    }
  }

  if [message] =~ /INFO Completed/ {
    aggregate {
      task_id => "%{trace_id}"
      code => "event['status_code'] = %{status_code}; event['response_time'] = %{response_time}"
      map_action => "update"
      end_of_task => true
      timeout => 120
    }
  }

}


output {

  stdout { codec => rubydebug }

  elasticsearch_http {
    host => ["localhost"]
    port => "9200"
    index => "logstash-%{+YYYY.MM.dd}"
  }

}

There are 2 main problems. First, I'm not seeing the aggregated records in Elasticsearch; that is, there's no single record that contains the contents of the 3 aggregations.

The 2nd is that I'm getting a ton of extra documents being stored whose messages are an empty string. I suspect that grokking the message 3 times is messing things up.

I would appreciate any advice. Thanks! [edit: apologies for the code formatting, it's hard to do in a browser]


(Mark Walkom) #4

When you specify message multiple times like that, it'll try to match all 3.
What you're supposed to do when grokking is match the entire line, not just part of it.
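For example, a single grok that matches one of the sample lines end to end might look like this (the pattern and field names are illustrative, written against the "Completed" line from the first post):

filter {
  grok {
    # matches e.g.: 3 INFO 123.123.123.123 Completed 200 34ms unique_id=12345ABCDE
    match => { "message" => "%{NUMBER:line_no} %{LOGLEVEL:level} %{IP:client_ip} Completed %{NUMBER:status_code} %{NUMBER:response_time}ms unique_id=%{WORD:trace_id}" }
  }
}

Because the whole line is consumed by one pattern, every field is captured in a single pass and you avoid the empty-message documents that come from partial matches and _grokparsefailure events.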

