Hello.
I have a problem with a Logstash pipeline that receives events from Filebeat containers running on Kubernetes.
The problem is that some of the events have fields with dynamically generated names under kubernetes.labels (see below). I need to be able to remove these fields without removing all the subfields of kubernetes.labels. Basically, I want to remove any field that is under kubernetes.labels and whose name starts with jenkins/
Here's the problematic structure:
"_source": {
  "kubernetes": {
    "container": {
      "name": "containername"
    },
    "namespace": "default",
    "labels": {
      "jenkins/example": "true",
      "jenkins/problematic-label-037690a0-e36d-11e9-bdab-d8c497099129": "true"
    },
    "pod": {
      "uid": "037690a0-e36d-11e9-bdab-d8c497099129",
      "name": "pod-name"
    },
    "node": {
      "name": "hostname"
    }
  }
}
Here's what I have tried so far:
- Translate. This is too cumbersome to maintain.
- Ruby code. This is not a preferred option, as it requires looping over each field and is simply too expensive.
- Prune. It does not seem to be able to deal with nested fields.
The only thing that works is removing the entire kubernetes.labels field with the mutate filter, but this is not preferable:
remove_field => ["[kubernetes][labels]"]
I don't mind if the "jenkins/example" label is removed as well; that is manageable. But I would still like to keep the other kubernetes.labels subfields. Basically, I need a prune filter with support for nested fields.
Even better would be if I could somehow use a regex to keep only the part of the problematic field name that does not include the UUID, instead of dropping the entire field. At this point I will be fine with either, because this has consumed far too much development time.
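To make the two acceptable outcomes concrete, here is a minimal plain-Ruby sketch of the transformation I am after. It works on an ordinary Hash shaped like the labels above; it is not a Logstash filter and not a proposed solution, since looping in Ruby is exactly what I would like to avoid. The helper names, the UUID pattern, and the "app" label are made up for illustration.

```ruby
# Illustration only: plain Ruby on a Hash, not Logstash configuration.
# Helper names and the UUID pattern are assumptions for this sketch.

# Matches a trailing "-<uuid>" such as "-037690a0-e36d-11e9-bdab-d8c497099129".
UUID_SUFFIX = /-\h{8}-\h{4}-\h{4}-\h{4}-\h{12}\z/

# Outcome 1: drop every label whose name starts with "jenkins/".
def drop_jenkins_labels(labels)
  labels.reject { |name, _value| name.start_with?("jenkins/") }
end

# Outcome 2: keep every label, but strip a trailing UUID from its name.
def strip_uuid_suffix(labels)
  labels.transform_keys { |name| name.sub(UUID_SUFFIX, "") }
end

labels = {
  "app" => "example", # hypothetical non-Jenkins label that must survive
  "jenkins/example" => "true",
  "jenkins/problematic-label-037690a0-e36d-11e9-bdab-d8c497099129" => "true"
}

p drop_jenkins_labels(labels)
p strip_uuid_suffix(labels)
```

Either result would be fine for me, as long as the other labels under kubernetes.labels are left untouched.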
Any help would be appreciated. Thank you in advance.