De_dot not working with certain inputs

1stewart · June 1, 2017, 11:29am

Hello,

I'm trying to ingest JSON from Azure via Beats, but some of the field names contain schema URLs, and indexing fails because those names contain '.'. I can't exclude those lines because azlog writes each log record as a single line.

I've tried using de_dot to rename those fields, but it doesn't seem to be applied. I've set it up as a filter with no conditionals and added a tag, which I've verified appears on other log entries. It's only the entries coming in through Beats that aren't being fixed, and those are the only ones I actually need it for.

beats config:

input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
    codec => json
  }
}

de_dot filter config:

filter {
    de_dot {
        add_tag => [ "dedotted" ]
    }
}

logstash error log (the dedotted tag is being added, but the periods aren't being stripped?):

"beat"=>{"hostname"=>"XXXXXXXX", "version"=>"5.2.2", "name"=>"XXXXXXXX"}, "tags"=>["azure_rm", "beats_input_codec_json_applied", "dedotted"], "host"=>"XXXXXXXXX"}, @metadata_accessors=#<LogStash::Util::Accessors:0x5a70edee @store={"type"=>"log", "beat"=>"filebeat"}, @lut={}>, @cancelled=false>], :response=>{"create"=>{"_index"=>"log-2017.06.01", "_type"=>"log", "_id"=>"AVxjXUjs2Kk_euCfvItU", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"Field name [http://schemas.microsoft.com/claims/authnclassreference] cannot contain '.'"}}}, :level=>:warn}

versions

logstash.noarch 1:2.2.4-1 @logstash-2.2
elasticsearch.noarch 2.4.4-1 @elasticsearch-2.x

edit: I've tried removing the field with mutate, using both the name shown in the error and the name with the nested prefix it would have in Elasticsearch (claims.), but the field still isn't removed and I still get the error that the field name is invalid.

magnusbaeck · June 1, 2017, 2:28pm

The field name http://schemas.microsoft.com/claims/authnclassreference is totally unreasonable.

> I can't exclude the lines as azlog is producing the logs as single lines.

This part I don't understand. How come "http://schemas.microsoft.com/claims/authnclassreference" ends up as a field name?

1stewart · June 1, 2017, 2:52pm

The inbound logs are single-line versions of the JSON output shown at https://msdn.microsoft.com/en-us/library/azure/dn931934.aspx. The URLs are part of the claims section.

I was originally importing the logs and using grok/regex to split out the fields, but I found that keeping each record on a single line, setting the codec to JSON and doing a split automatically separates the fields into name/value pairs. That is more accurate, since it also covers fields that haven't appeared in the logs I've seen so far and that I therefore couldn't have written a regex for.

Because the URLs are part of the field names after the split, the error occurs. I don't even need the field, but I can't exclude it on the sending side and still benefit from the automatic splitting, and I seemingly can't de_dot or mutate it away, so I'm not sure what else to do.

magnusbaeck · June 1, 2017, 8:19pm

I'm not entirely following along, but one option could be to use a ruby filter for renaming or deleting the fields.
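As a rough illustration of the kind of renaming logic such a ruby filter could apply, here is a minimal plain-Ruby sketch. The helper name dedot_keys and the sample record are hypothetical, and the wiring into a Logstash ruby filter (reading the field from the event and writing it back) is deliberately left out because the event API differs between Logstash versions.

```ruby
# Hypothetical helper (not part of Logstash): recursively replace '.' in
# hash keys with a separator, descending into nested hashes and arrays.
def dedot_keys(value, separator = "_")
  case value
  when Hash
    value.each_with_object({}) do |(key, inner), result|
      # A string pattern in gsub is matched literally, so only real dots change.
      result[key.to_s.gsub(".", separator)] = dedot_keys(inner, separator)
    end
  when Array
    value.map { |item| dedot_keys(item, separator) }
  else
    value
  end
end

# Sample data modelled on the field name from the error message.
record = {
  "claims" => {
    "http://schemas.microsoft.com/claims/authnclassreference" => "1"
  }
}

cleaned = dedot_keys(record)
puts cleaned["claims"].keys.first
```

Deleting the offending keys instead of renaming them would follow the same recursive shape, dropping any key that contains a dot rather than rewriting it.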

1stewart · June 2, 2017, 5:32pm

edit: I've got it working by removing the split and starting the field names at [claims] instead of [Records]. Thanks for your help.

The trouble I seem to be having is with the field name.

The log arrives as shown below (pretty-printed here to make it easier to read; the actual input is a single line):

{
  "Records": [
    {
      "authorization": {
         ...
      },
      "claims": {
         ...
        "http://schemas.microsoft.com/identity/claims/scope": "user_impersonation",
      }
    }
  ]
}

I'm attempting to rename the field with

filter {
  if "azure_rm" in [tags] {
    # split array into own fields
    split { field => "Records" }
    mutate { rename => [ "[Records][claims][http://schemas.microsoft.com/identity/claims/scope]", "Scope" ] }
  }
}

Do I need to escape part of the field name? It's still throwing the same error, whereas renaming in the same way works elsewhere, just with non-URI field names. I imagine I'll have the same problem with the methods you suggested if I can't refer to the field by its name.

magnusbaeck · June 3, 2017, 2:56pm

The problem is that Records is an array. If it only has a single element (or a small, fixed number of them) you should be able to reference your URL field via [Records][0][claims][http://schemas.microsoft.com/identity/claims/scope]. If you want to bulk-rename fields for any number of elements in Records you will have to use a ruby filter.
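The bulk rename could look roughly like the plain-Ruby sketch below. The helper name rename_claim and the target key "scope" are hypothetical, and the sketch only shows the array handling; inside a ruby filter you would apply it to the value of the Records field and write the result back to the event.

```ruby
# Hypothetical helper: for every element of a Records-style array, move the
# value stored under old_key inside "claims" to new_key. Elements without a
# matching claim are returned unchanged, and the input is not mutated.
def rename_claim(records, old_key, new_key)
  records.map do |record|
    claims = record["claims"]
    next record unless claims.is_a?(Hash) && claims.key?(old_key)

    renamed = claims.reject { |key, _| key == old_key }
    renamed[new_key] = claims[old_key]
    record.merge("claims" => renamed)
  end
end

scope_url = "http://schemas.microsoft.com/identity/claims/scope"

# Sample data modelled on the log structure posted above.
records = [
  { "claims" => { scope_url => "user_impersonation" } },
  { "claims" => { "other" => "value" } }
]

result = rename_claim(records, scope_url, "scope")
puts result.first["claims"]["scope"]
```

The rename stays inside each element's claims hash rather than moving the value to a top-level field, since a single top-level name would be ambiguous once Records holds more than one element.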

