Field name cannot contain '.'


(Aaron Mildenstein) #9

I'm going to start work on an addition to the mutate filter right now to "de-dot" fields, but those fields will have to be named.

A de-dot, "shotgun-approach" filter will come afterwards. This will iterate through all fields in the event to catch and rename any that contain dots. This one will likely be a very expensive operation, since it has to visit every field, but I expect there will be some who don't know in advance which fields might have dots. This solution will be for them.


(Aaron Mildenstein) #10

@radu.stefanache We just published version 3.0.0 of logstash-filter-elapsed. This is a breaking change.

  • All dots in field names and tags have been replaced by _ (an underscore)
  • This means you may need to change some conditionals in your Logstash configuration
  • Some of your Kibana dashboards and/or queries may require changes
  • Any other outputs used will have to be adapted to use the new field names and tags.

You can upgrade to this version by doing:

bin/plugin update logstash-filter-elapsed

from your Logstash directory.


(juergen) #11

Hi Aaron,
We also have the problem with dots in our fields, and unfortunately we cannot change the source.
Are you still considering a shotgun approach (maybe in the mutate filter)? This would be very helpful to us.

many thanks


(Gary Hodgson) #12

I just came across this problem too as some dynamic fields are being added with a dot in the fieldname. After attempting to use the mutate filter with no luck I ended up using the ruby filter. I'll paste it below in case it's of use to others.

filter {
  ruby {
    code => "
      event.to_hash.keys.each { |k| event[ k.sub('.','_') ] = event.remove(k) if k.include?('.') }
    "
  }
}

(juergen) #13

Hi,
Thank you very much, it works.
Just a little issue with more than one dot in a field: the ruby code replaces just the first dot in a fieldname.
But that is not really a problem, because I inserted the ruby filter twice.

many thanks
juergen


(Gary Hodgson) #14

Hi,

Ah yes, sorry I don't really know ruby that well. The following will replace all dots:

ruby {
  code => "
    event.to_hash.keys.each { |k| event[ k.gsub('.','_') ] = event.remove(k) if k.include?('.') }
  "
}
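
For anyone wondering why the first version missed some dots: in plain Ruby, String#sub replaces only the first match, while String#gsub replaces every match. Here is a minimal sketch outside Logstash, with a plain Hash standing in for the event. The dedot_keys helper name is made up for illustration and is not part of any Logstash API.

```ruby
# Plain-Ruby illustration; a Hash stands in for the Logstash event.
# dedot_keys is a hypothetical helper, not part of any Logstash API.
def dedot_keys(hash)
  hash.each_with_object({}) do |(key, value), result|
    # gsub replaces every '.', whereas sub would stop after the first one
    result[key.gsub('.', '_')] = value
  end
end

first_only = 'data.0.count'.sub('.', '_')   # only the first dot is replaced
all_dots   = 'data.0.count'.gsub('.', '_')  # every dot is replaced
renamed    = dedot_keys('data.0.count' => 3, 'host' => 'web1')
```

Note that two different dotted names can collapse to the same underscored name, in which case the later value wins in this sketch.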

Best regards,
Gary


(Chris) #15

Hi,

I just came across the same problem

"type":"mapper_parsing_exception","reason":"Field name [data.0.count] cannot contain '.'"}

However, this is a HUGE show-stopper for us, since we are basically "flattening" json data before inserting them into Elasticsearch.

e.g.:
{
  "foo": {
    "bar": "something" 
  }
}

becomes

{ "foo.bar": "something" }
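
To make the flattening step concrete, here is a rough sketch in plain Ruby of the kind of transformation meant here. The flatten_keys helper and its separator parameter are assumptions for illustration only, not our actual code; passing a different separator is one way to avoid producing dotted names in the first place.

```ruby
# flatten_keys is a hypothetical helper written for this example.
# It joins nested hash keys into a single-level hash using `separator`.
def flatten_keys(hash, separator: '.', prefix: nil)
  hash.each_with_object({}) do |(key, value), flat|
    path = prefix ? "#{prefix}#{separator}#{key}" : key.to_s
    if value.is_a?(Hash)
      # Recurse into sub-objects, carrying the path built so far
      flat.merge!(flatten_keys(value, separator: separator, prefix: path))
    else
      flat[path] = value
    end
  end
end

doc        = { 'foo' => { 'bar' => 'something' } }
dotted     = flatten_keys(doc)                  # keys joined with '.'
underscore = flatten_keys(doc, separator: '_')  # keys joined with '_'
```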

This is VALID JSON. You are essentially breaking VALID JSON input.

I am really upset about this, what the heck is the reasoning behind this?
ok, I just found the responsible commit for this:

So gathering from this information: this change won't affect existing indices / field names?


(Aaron Mildenstein) #16

The new de_dot filter can turn dotted fields into nested fields.

I'm not sure how it affects existing field names, but you would not be able to re-index that data without changing the field names.
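
To illustrate what "dotted fields into nested fields" means, here is a small plain-Ruby sketch of the idea. It is not the plugin's code; nest_dotted_keys is a hypothetical helper, and a plain Hash stands in for the event.

```ruby
# nest_dotted_keys is a hypothetical helper, not the de_dot plugin's code.
# It turns keys such as 'foo.bar' into nested hashes: { 'foo' => { 'bar' => ... } }.
# It assumes no key is both a leaf and a parent (e.g. 'foo' and 'foo.bar' together).
def nest_dotted_keys(flat)
  flat.each_with_object({}) do |(key, value), nested|
    *parents, leaf = key.split('.')
    # Walk (or create) one sub-hash per path segment, then set the leaf
    target = parents.inject(nested) { |node, part| node[part] ||= {} }
    target[leaf] = value
  end
end

nested = nest_dotted_keys('foo.bar' => 'something', 'foo.baz' => 1, 'host' => 'web1')
```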


(Chris) #17

Thanks! However, it does look like this filter might have a performance impact (doing what it does).

I reckon we will be reindexing (with field name changes) after all. Unfortunately this also affects another cluster which writes about 10 GiB of data every day. ouch. :wink: (might keep a "vintage" mode cluster / parallel software for that, though).

However, the question still remains: why make this sort of breaking change when it might have been enough to state that mappings like those posted in this gist https://gist.github.com/jpountz/8c66817e00a322b81f85 cannot be mixed?

Would it not have been better to try and fix the underlying cause? :smile: (I cannot judge the feasibility of that though!)


(Aaron Mildenstein) #18

@Christopher_Blasnik, because the ability was removed in Elasticsearch 2.0.

See the breaking changes section of the Elasticsearch documentation, under the header "Field names may not contain dots".


#20

I'm assuming this impacts all the geoip items?


(Aaron Mildenstein) #21

Kibana 3.x represents object structures (sub-fields) with dotted notation, but they are still object structures within Elasticsearch.

The GeoIP filter (as in 1.5.x and 2.x) sends an object, not dotted fields.


#22

Ah...ok...that helps then..thank you...I was worried :slight_smile:


(Bob) #23

Our use case where "dots" may appear in a field name is after the kv {} filter runs. We don't always know the field names that log sources are sending us. The Ruby code works for us, but an "official" solution would be nice to have.


(Aaron Mildenstein) #24

@bblank There has been some internal discussion on how to better handle dotted fields (in the Elasticsearch team itself, not the Logstash team), but the dust has not yet settled. For the foreseeable future, the official solution is to use the aforementioned de_dot filter.


(Bob) #25

This ruby solution works "most" of the time for us, but I just found some fields which have "[ ]" in the name and the "dots" are not being replaced. Any ruby coders out there willing to help? e.g. field name = ad.key[12]="some text value"


(Aaron Mildenstein) #26

Do you need the brackets? I would think those would be undesirable. Look into the mutate filter's gsub option. It will allow you to strip square brackets.


(Loren Siebert) #27

I used this ruby filter instead of de_dot because my dotted fields are nested under params and I don't know what they are in advance:


filter {
  ruby {
    code => "
      params = event['params'] && event['params'].to_hash
      params.keys.each { |k| params[ k.gsub('.','_') ] = params.delete(k) if k.include?'.' } unless params.nil?
    "
  }
}

(Bob) #28

I am new to ELK so I may be wrong, but gsub only works on the field contents, not the name of the field. Right?


(Aaron Mildenstein) #29

@bblank you're correct. I wasn't looking too closely when I wrote that.

This is a really tricky situation as that's likely to be an undesirable field name anyway. If you know what it is, you can do a remove_field after copying the contents to a new field. If you don't know what the field name is, that makes things much harder.
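
For what it's worth, here is a rough plain-Ruby sketch of renaming keys that contain both dots and square brackets. It works on an ordinary Hash, not on the Logstash event API; as far as I understand, square brackets in a name passed to the event are read as field-reference syntax, which may be why the earlier one-liner misses these fields. The sanitize_keys helper is hypothetical.

```ruby
# sanitize_keys is a hypothetical helper operating on a plain Hash,
# not on a Logstash event. It replaces dots with underscores and
# strips square brackets from every top-level key.
def sanitize_keys(hash)
  hash.each_with_object({}) do |(key, value), clean|
    new_key = key.gsub('.', '_').gsub(/[\[\]]/, '')
    # If two keys sanitize to the same name, the later one overwrites the earlier
    clean[new_key] = value
  end
end

cleaned = sanitize_keys('ad.key[12]' => 'some text value', 'host' => 'web1')
```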