Logstash URL parameter as tags

Hi all,

I have this working configuration that receives logs from Heroku, processes them using grok, and adds a couple of tags from environment variables.

input {  
  http {    
    port => "${PORT}"    
    tags => ["${TAG}", "${ENV}"]  
  } 
}
filter {  
  if [headers][http_user_agent] =~/ELB-HealthChecker/ { drop {} }  
  grok {      
    match => { 
      "message" => [ 
                     "%{SYSLOG5424PRI:pri}%{NUMBER:rfc_version} %{TIMESTAMP_ISO8601:timestamp} d.%{UUID:drain_id} %{WORD:app} %{USERNAME:dyno} - - %{GREEDYDATA:message}",
                     "%{SYSLOG5424PRI:pri}%{NUMBER:rfc_version} %{TIMESTAMP_ISO8601:timestamp} %{WORD:host} %{WORD:app} %{USERNAME:dyno} - %{GREEDYDATA:message}"         
                   ]       
    }      
    overwrite => ["message"]      
    remove_field => ["pri", "rfc_version", "timestamp", "syslog5424_pri"]
  }
} 
output {  
  elasticsearch {    
    hosts => [ "xxxxx"]  
  }
}

But I would like to set the tags from query string parameters instead of environment variables. Something like:

https://mylogstashurlhost/?env=staging&tag=myservicename

This is to avoid creating a lot of Logstash instances, so that with only one or two instances we can collect logs from many sources. Is this possible at all?

Thanks
Regards
JM

You can use another grok filter to take apart the URL.


filter {
  grok {
    # Name the capture so that grok stores it in a field
    match => [ "message", "%{URIPARAM:URIPARAM}" ]
  }
}

{
  "URIPARAM": [
    [
      "?env=staging&tag=myservicename"
    ]
  ]
}

And then use the kv filter to take that string apart.

kv {
  source => "URIPARAM"
  field_split => "&"
  # URIPARAM keeps the leading "?", so strip it from the first key
  trim_key => "?"
}

Aquax,

Thanks for the reply. However, the URL I'm referring to is not in the message; it is the URL where my Logstash instance is hosted. It is the request that Logstash is receiving.

Thanks
JM

I do not think this is possible with the code as is. The http input is passed the full HTTP request object; it validates and deletes the authorization header, then builds the event from req.headers and req.content. It does nothing with req.uri.

I think it would be a really useful enhancement to be able to access that.

Badger,

Thanks for your reply. Sorry, I'm not that familiar with the Logstash code. But I was debugging it a bit and now I have a clear understanding of what I want :stuck_out_tongue:

I want to access request_path and be able to parse it, I think with the kv filter as Aquax mentioned.

logstash_1  | {
logstash_1  |        "headers" => {
logstash_1  |           "cache_control" => "no-cache",
logstash_1  |          "content_length" => "19",
logstash_1  |         "http_user_agent" => "PostmanRuntime/7.26.8",
logstash_1  |         "accept_encoding" => "gzip, deflate, br",
logstash_1  |          "request_method" => "POST",
logstash_1  |               "http_host" => "localhost:1514",
logstash_1  |              "connection" => "keep-alive",
logstash_1  |            "request_path" => "/?enviroment=staging&service=rollio-core",
logstash_1  |            "content_type" => "application/json",
logstash_1  |            "http_version" => "HTTP/1.1",
logstash_1  |             "http_accept" => "*/*",
logstash_1  |           "postman_token" => "e000afa8-a9c4-4ce1-b313-c77421be92aa"
logstash_1  |     },
logstash_1  |           "host" => "172.19.0.1",
logstash_1  |           "tags" => [
logstash_1  |         [0] "rollio-nlp-idl",
logstash_1  |         [1] "staging"
logstash_1  |     ],
logstash_1  |        "message" => "stuff stuff",
logstash_1  |       "@version" => "1",
logstash_1  |     "@timestamp" => 2021-07-22T17:31:55.733Z
logstash_1  | }

The formatHeader function is adding the URI to the header object.

You are right, I missed that.

Yeah, if you put the [headers][request_path] field into the kv filter, it should help get you what you want.

Aquax,

Thanks for the reply. I just tested it and it worked great!

logstash_1  | {
logstash_1  |           "host" => "172.19.0.1",
logstash_1  |        "message" => "as",
logstash_1  |        "headers" => {
logstash_1  |          "request_method" => "POST",
logstash_1  |               "http_host" => "localhost:1514",
logstash_1  |            "http_version" => "HTTP/1.1",
logstash_1  |           "cache_control" => "no-cache",
logstash_1  |              "connection" => "keep-alive",
logstash_1  |         "http_user_agent" => "PostmanRuntime/7.26.8",
logstash_1  |            "request_path" => "/?enviroment=staging&service=rollio-core",
logstash_1  |           "postman_token" => "d7dc5723-7f78-4519-9c8d-9d121b5a0125",
logstash_1  |             "http_accept" => "*/*",
logstash_1  |          "content_length" => "19",
logstash_1  |            "content_type" => "application/json",
logstash_1  |         "accept_encoding" => "gzip, deflate, br"
logstash_1  |     },
logstash_1  |        "service" => "rollio-core",
logstash_1  |     "@timestamp" => 2021-07-23T09:15:58.811Z,
logstash_1  |     "enviroment" => "staging",
logstash_1  |       "@version" => "1"
logstash_1  | }

My complete working configuration, so others can benefit from this :slight_smile:

input {  
  http {    
    port => "${PORT}"
  } 
}
filter {  
  if [headers][http_user_agent] =~/ELB-HealthChecker/ { drop {} }  
  grok {      
    match => { 
      "message" => [ 
                     "%{SYSLOG5424PRI:pri}%{NUMBER:rfc_version} %{TIMESTAMP_ISO8601:timestamp} d.%{UUID:drain_id} %{WORD:app} %{USERNAME:dyno} - - %{GREEDYDATA:message}",
                     "%{SYSLOG5424PRI:pri}%{NUMBER:rfc_version} %{TIMESTAMP_ISO8601:timestamp} %{WORD:host} %{WORD:app} %{USERNAME:dyno} - %{GREEDYDATA:message}"         
                   ]       
    }      
    overwrite => ["message"]      
    remove_field => ["pri", "rfc_version", "timestamp", "syslog5424_pri"]    
  }
}
filter { 
  kv { 
    source => "[headers][request_path]"    
    field_split => "&" 
    trim_key => "/?" 
  }
}
output {  
  elasticsearch {    
    hosts => [ "xxxxx"]  
  }
}

Thanks both for the help!

Regards
Juan Manuel
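
A note for anyone who wants the parsed values as tags, as in the original question, rather than as top-level fields: the configuration above creates one field per query parameter, so a small extra step is needed. The following is only a sketch, not a tested setup. It assumes the parameters are named env and tag, as in the example URL from the first post (the test requests shown above used enviroment and service instead), and it relies on the add_tag option of the mutate filter accepting %{field} references.

```
filter {
  kv {
    source      => "[headers][request_path]"
    field_split => "&"
    trim_key    => "/?"
  }

  # Only add a tag when the parameter was actually sent; otherwise the
  # unresolved reference "%{env}" would be stored as the tag text.
  if [env] {
    mutate { add_tag => ["%{env}"] }
  }
  if [tag] {
    mutate { add_tag => ["%{tag}"] }
  }
}
```

With a request to /?env=staging&tag=myservicename, the event should then carry both the env and tag fields and the corresponding entries in tags. Query values are not URL-decoded by this sketch, so it is best suited to simple names without special characters.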
