De_dot not working with certain inputs


(Stewart) #1

Hello,

I'm trying to ingest JSON from Azure via Beats, but indexing fails because some of the field names are schema URLs and therefore contain '.'. I can't exclude the lines as azlog is producing the logs as single lines.

I've tried using de_dot to rename those fields, but it doesn't seem to be applied. I've created it as a filter with no conditionals and added a tag, which I've verified shows up on other log entries. It's only these Beats-ingested entries that aren't working, and they're the only ones I actually need it for.

beats config:

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
    codec => json
  }
}

de_dot filter config:

filter {
  de_dot {
    add_tag => [ "dedotted" ]
  }
}

logstash error log (dedotted is being applied to tags but the periods aren't being stripped?)

"beat"=>{"hostname"=>"XXXXXXXX", "version"=>"5.2.2", "name"=>"XXXXXXXX"}, "tags"=>["azure_rm", "beats_input_codec_json_applied", "dedotted"], "host"=>"XXXXXXXXX"}, @metadata_accessors=#<LogStash::Util::Accessors:0x5a70edee @store={"type"=>"log", "beat"=>"filebeat"}, @lut={}>, @cancelled=false>], :response=>{"create"=>{"_index"=>"log-2017.06.01", "_type"=>"log", "_id"=>"AVxjXUjs2Kk_euCfvItU", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"Field name [http://schemas.microsoft.com/claims/authnclassreference] cannot contain '.'"}}}, :level=>:warn}

versions

logstash.noarch 1:2.2.4-1 @logstash-2.2
elasticsearch.noarch 2.4.4-1 @elasticsearch-2.x

edit: I've tried removing the field with mutate, referring to it both by the name shown in the error and with the nested prefix it would have in Elasticsearch (claims.), but it still isn't removed and I still get the error that the field name is invalid.


(Magnus Bäck) #2

The field name http://schemas.microsoft.com/claims/authnclassreference is totally unreasonable.

I can't exclude the lines as azlog is producing the logs as single lines.

This part I don't understand. How come "http://schemas.microsoft.com/claims/authnclassreference" ends up as a field name?


(Stewart) #3

The inbound logs are single-lined versions of the JSON output on https://msdn.microsoft.com/en-us/library/azure/dn931934.aspx, the URLs are part of the claims section.

I was originally importing the logs and using grok/regex to split the fields, but found that using single lines, setting the codec to JSON, and doing a split automatically separates the fields into name/value pairs. That is more accurate, since it also covers fields that don't appear in the logs I've seen so far and that I couldn't have written a regex for.

Because the URLs are part of the field names after the split, the error occurs. I don't even need the field, but because I can't exclude it when sending and still benefit from the automatic splitting, and I seemingly can't de_dot or mutate it out, I'm not sure what else to do.


(Magnus Bäck) #4

I'm not entirely following along, but one option could be to use a ruby filter for renaming or deleting the fields.
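
A minimal sketch of that ruby-filter idea, written as standalone Ruby so it can be tried outside Logstash. The helper name dedot_keys and the "_" separator are assumptions made for illustration, not part of any plugin. It walks nested hashes and arrays and replaces every '.' in a key.

```ruby
# Hypothetical helper: return a copy of `value` in which every hash key
# has its '.' characters replaced by `separator`, at any nesting depth.
def dedot_keys(value, separator = "_")
  case value
  when Hash
    value.each_with_object({}) do |(key, child), out|
      out[key.to_s.gsub(".", separator)] = dedot_keys(child, separator)
    end
  when Array
    value.map { |item| dedot_keys(item, separator) }
  else
    value # scalars are returned unchanged
  end
end

# Sample shaped like the Azure log discussed in this thread.
sample = {
  "Records" => [
    { "claims" => { "http://schemas.microsoft.com/identity/claims/scope" => "user_impersonation" } }
  ]
}

cleaned = dedot_keys(sample)
puts cleaned["Records"][0]["claims"].keys.first
# prints http://schemas_microsoft_com/identity/claims/scope
```

Inside a ruby filter on Logstash 2.x, the same logic would presumably be applied through the event's hash-style accessors (reading event['Records'] and assigning the cleaned value back); later Logstash versions use event.get and event.set instead.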


(Stewart) #6

edit: I've got it working by removing the split and starting the field names at [claims] instead of [Records]. Thanks for your help.

The trouble I seem to be having is with the field name.

The log arrives as shown below (pretty-printed here to make it easier to read; the real entry is a single line).

{
  "Records": [
    {
      "authorization": {
        ...
      },
      "claims": {
        ...
        "http://schemas.microsoft.com/identity/claims/scope": "user_impersonation"
      }
    }
  ]
}

I'm attempting to rename the field with

filter {
  if "azure_rm" in [tags] {
    # split array into own fields
    split { field => "Records" }
    mutate { rename => [ "[Records][claims][http://schemas.microsoft.com/identity/claims/scope]", "Scope" ] }
  }
}

Do I need to escape part of the field name? It's still throwing the same error, whereas renaming in the same way works elsewhere, just with non-URI field names. I imagine I'll have the same problem with the methods you suggested if I can't refer to the field by its name.


(Magnus Bäck) #7

The problem is that Records is an array. If it only has a single element (or a small, fixed number of them), you should be able to reference your URL field via [Records][0][claims][http://schemas.microsoft.com/identity/claims/scope]. If you want to bulk-rename fields for any number of elements in Records, you will have to use a ruby filter.
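
For the bulk case, the code in such a ruby filter might look roughly like the standalone sketch below. The helper name rename_claim is hypothetical, and the target name "Scope" is taken from the mutate attempt earlier in the thread. It accepts either an array of records or a single record and skips elements that have no claims hash.

```ruby
SCOPE_KEY = "http://schemas.microsoft.com/identity/claims/scope"

# Hypothetical helper: rename `old_key` to `new_key` inside the "claims"
# hash of every record. Modifies the records in place and returns them.
def rename_claim(records, old_key, new_key)
  list = records.is_a?(Array) ? records : [records]
  list.each do |record|
    next unless record.is_a?(Hash)
    claims = record["claims"]
    next unless claims.is_a?(Hash) && claims.key?(old_key)
    claims[new_key] = claims.delete(old_key)
  end
  records
end

# Sample with one record that has the field and two that do not.
records = [
  { "claims" => { SCOPE_KEY => "user_impersonation" } },
  { "claims" => {} },
  { "authorization" => {} }
]

renamed = rename_claim(records, SCOPE_KEY, "Scope")
puts renamed[0]["claims"]["Scope"]
# prints user_impersonation
```

Because the key is looked up as a plain Ruby string, the dots and slashes in the URL need no escaping here, unlike in a Logstash field reference.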

