I'm going to start work right now on an addition to the mutate filter to "de-dot" fields, but those fields will have to be named explicitly.
A "shotgun-approach" de-dot filter will come afterwards. It will iterate through all fields in the event to catch and rename any that contain dots. This one will likely be a very expensive operation, since it has to visit every field, but I expect some users won't know in advance which fields might have dots. This solution will be for them.
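To make the cost of the shotgun approach concrete, here is a minimal sketch in plain Ruby. It is not the planned filter: a plain Hash stands in for the event, and `dedot_keys` is a hypothetical helper name. Every key is visited, including keys in nested hashes, which is why the work grows with the number of fields.

```ruby
# Hypothetical helper: a plain Hash stands in for the Logstash event.
# Returns a new hash in which every dot in a key is replaced.
def dedot_keys(fields, replacement = "_")
  fields.each_with_object({}) do |(name, value), out|
    new_name = name.to_s.gsub(".", replacement)
    # Recurse so dotted keys inside nested hashes are renamed too.
    out[new_name] = value.is_a?(Hash) ? dedot_keys(value, replacement) : value
  end
end

event = { "host.name" => "web1", "message" => "ok", "kv" => { "a.b.c" => 1 } }
p dedot_keys(event)
```

Note that if two keys collapse to the same name (for example "a.b" and "a_b"), the later one silently overwrites the earlier one in this sketch.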
Hi Aaron,
We also have the problem with dots in our fields, and unfortunately we cannot change the source.
Are you still considering a shotgun approach (maybe in the mutate filter)? This would be very helpful to us.
I just came across this problem too, as some dynamic fields are being added with a dot in the field name. After attempting to use the mutate filter with no luck, I ended up using the ruby filter. I'll paste it below in case it's of use to others.
Hi,
Thank you very much, it works.
Just a little issue with more than one dot in a field: the ruby code replaces just the first dot in a field name.
But that is not really a problem, because I insert the ruby filter twice.
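The snippet itself is not shown in this thread, so this is only a guess at the cause: if the rename used String#sub, only the first dot would be replaced, while String#gsub replaces all of them in one pass. A small illustration of the difference, using a made-up field name:

```ruby
name = "a.b.c"  # made-up field name with two dots

first_only = name.sub(".", "_")   # replaces only the first match
all_dots   = name.gsub(".", "_")  # replaces every match

puts first_only  # a_b.c
puts all_dots    # a_b_c
```

If that is indeed what happened, switching to gsub would make the second copy of the filter unnecessary.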
Thanks! However, it does look like this filter might have a performance impact (doing what it does).
I reckon we will be reindexing (with field name changes) after all. Unfortunately this also affects another cluster which writes about 10 GiB of data every day. Ouch. (We might keep a "vintage" mode cluster / parallel software for that, though.)
However, the question still remains: why this sort of breaking change when it might have been enough to state that mappings as posted in this gist https://gist.github.com/jpountz/8c66817e00a322b81f85 cannot be mixed?
Would it not have been better to try and fix the underlying cause? (I cannot judge the feasibility of that though!)
Our use case for where "dots" may appear in a field name is after the kv {} filter runs. We don't always know the field names that log sources are sending us. The Ruby code works for us, but an "official" solution would be nice to have.
@bblank There has been some internal discussion on how to better handle dotted fields (in the Elasticsearch team itself, not the Logstash team), but the dust has not yet settled. For the foreseeable future, the official solution is to use the aforementioned de_dot filter.
This ruby solution works "most" of the time for us, but I just found some fields which have "[ ]" in the name and the "dots" are not being replaced. Any ruby coders out there willing to help? e.g. field name = ad.key[12]="some text value"
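For a name like the one above, one possible approach is to clean up the name string itself before renaming the field. The following is a rough sketch in plain Ruby, not the snippet from this thread; `sanitize_field_name` is a hypothetical helper, and the choice of "_" as the replacement is an assumption.

```ruby
# Hypothetical helper: drop closing brackets, then turn dots and
# opening brackets into underscores.
def sanitize_field_name(name)
  name.delete("]").gsub(/[.\[]+/, "_")
end

puts sanitize_field_name("ad.key[12]")  # ad_key_12
```

Dropping the closing bracket first avoids a trailing underscore, and the `+` in the pattern collapses runs such as ".[" into a single underscore. Inside a ruby filter the rename would still have to go through the event rather than a plain string.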
Do you need the brackets? I would think those would be undesirable. Look into the mutate filter's gsub option. It will allow you to strip square braces.
@bblank you're correct. I wasn't looking too closely when I wrote that.
This is a really tricky situation, as that's likely to be an undesirable field name anyway. If you know what it is, you can do a remove_field after copying the contents to a new field. If you don't know what the field name is, that makes things much harder.