Varnish client and backend in the same log file


For some time now I've been trying to create a Logstash config that places information from two log lines in the same document. As you probably know, Varnish can log through varnishncsa in both client and backend mode, so some fields carry different meanings. For example, in client mode the time-taken field measures from the moment the client initiated the request until the last byte is sent, while in backend mode it measures from the moment Varnish sent the request to the backend until it received the last byte.
What I'm trying to get is a nice index that contains information from both the backend and the client log lines, e.g.:

document_id: 8888
varnish.time.taken.client: 443
varnish.time.taken.server: 41280

The Varnish log looks like this (a client-mode line followed by its backend-mode counterpart):

- - [27/Oct/2017:15:24:00 +0300] GET "" 304 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.62 Safari/537.36" 443 hit c 8888 76
- - [27/Oct/2017:15:24:00 +0300] GET "" 200 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.62 Safari/537.36" 41280 - b 80 8888
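For context, a grok pattern along the following lines could parse these entries. This is only my sketch, not the exact pattern from my config; the field names are assumptions, and since the trailing fields differ between client and backend lines, the sketch captures the tail greedily and splits it afterwards:

grok {
  match => {
    "message" => '%{USER:ident} %{USER:auth} \[%{HTTPDATE:varnish.timestamp}\] %{WORD:varnish.verb} "%{DATA:varnish.request}" %{NUMBER:varnish.status} %{NUMBER:varnish.bytes} "%{DATA:varnish.referrer}" "%{DATA:varnish.agent}" %{NUMBER:varnish.time.taken} %{GREEDYDATA:varnish.tail}'
  }
}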

I was able to parse the log lines and identify a unique identifier: in client mode it is the first number after the "c" (which marks client mode), while in backend mode it is the last number, which is the transaction reference.
Based on the fields grok produces, the config can match the document_id number in both cases with an if statement like the one below:

if [varnish.side] == "c" {
  mutate {
    add_field => {
      "[varnish.time.taken.client]" => "%{[varnish.time.taken]}"
      "[document_id]" => "%{[]}"
    }
  }
}

...the same code follows for "b" (backend mode), and the new fields are also converted to integers.
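Spelled out, the whole conditional could look roughly like this. The reference field names (varnish.ref.client, varnish.ref.backend) are hypothetical stand-ins for whatever grok actually captured, not the real names from my config:

if [varnish.side] == "c" {
  mutate {
    add_field => {
      "[varnish.time.taken.client]" => "%{[varnish.time.taken]}"
      "[document_id]" => "%{[varnish.ref.client]}"     # assumed name for the first number after "c"
    }
  }
} else if [varnish.side] == "b" {
  mutate {
    add_field => {
      "[varnish.time.taken.server]" => "%{[varnish.time.taken]}"
      "[document_id]" => "%{[varnish.ref.backend]}"    # assumed name for the trailing transaction reference
    }
  }
}
mutate {
  # convert is silently skipped for whichever field is absent on a given line
  convert => {
    "[varnish.time.taken.client]" => "integer"
    "[varnish.time.taken.server]" => "integer"
  }
}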

At this point, basically the only thing left to do is create the aggregation and voilà, everything should work... but it doesn't.

aggregate {
  task_id => "%{document_id}"
  code => "
    map['v.t.t.c'] ||= 0
    map['v.t.t.c'] += event.get('varnish.time.taken.client')
    map['v.t.t.s'] ||= 0
    map['v.t.t.s'] += event.get('varnish.time.taken.server')
  "
  push_map_as_event_on_timeout => true
  timeout_task_id_field => "document_id"
  timeout => 10
}

It almost works, but for some reason only one of the new fields produced by aggregate ever gets a value, while the other one stays 0.
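One plausible culprit, sketched below with a minimal stand-in for the Logstash event (the real event API is richer than this): each log line carries only one of the two time-taken fields, so inside the aggregate code `event.get` returns nil for the other one, adding nil to an integer raises, and that map entry is left at 0. Guarding with `.to_i` (or `|| 0`) would sidestep it:

```ruby
# Minimal simulation of the aggregate filter's map logic.
# Event here is a hypothetical stand-in, not the real Logstash class.
Event = Struct.new(:fields) do
  def get(name)
    fields[name]
  end
end

map = Hash.new(0)

# A client-side line carries only the client field; the server field is absent.
client_event = Event.new({ 'varnish.time.taken.client' => 443 })

# Naive accumulation: event.get returns nil for the missing field,
# so Integer#+ raises TypeError and the map entry never updates.
begin
  map['v.t.t.s'] += client_event.get('varnish.time.taken.server')
rescue TypeError
  # nil cannot be added to an Integer
end

# Guarded accumulation: treat a missing field as 0 via nil.to_i.
map['v.t.t.c'] += client_event.get('varnish.time.taken.client').to_i
map['v.t.t.s'] += client_event.get('varnish.time.taken.server').to_i

puts map['v.t.t.c']  # 443
puts map['v.t.t.s']  # 0
```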

The output to elasticsearch is this:

elasticsearch {
  hosts => [ "host-els-1:9200", "host-els-2:9200" ]
  index => "testing-%{+YYYY.MM.dd}"
  document_id => "%{document_id}"
  doc_as_upsert => true
}

What am I missing here? How can I aggregate both lines into a single document?
