Hi guys,
I have my apps set to write JSON logs, and Filebeat ships them, enriched with host metadata, to an Elasticsearch cluster.
There they get processed by an ingest pipeline which basically runs the following json processor:
"processors": [
{
"json": {
"field": "message",
"target_field": "apps"
}
},
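For reference, this is how I've been reproducing it with the simulate API (the sample message below is made up by me, just to show the shape of the data):

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "json": {
          "field": "message",
          "target_field": "apps"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "{\"payload\":{\"params\":{\"001\":\"a\",\"foo\":{\"001\":\"b\"}}}}"
      }
    }
  ]
}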
Now, the problem I'm experiencing is that some of the fields in the JSON go 10+ levels deep and there are a huge number of them, which causes a mapping explosion very soon after index creation (within the first ~10k documents).
I know the exact names of the culprit fields. For example, app.payload.params
has a bunch of subfields and levels, which get expanded into something like:
app.payload.params.001
app.payload.params.002
...
app.payload.params.100
...
app.payload.params.foo.001
...
app.payload.params.foo.002
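In the raw log line, that part of the JSON looks roughly like this (keys and values trimmed down and made up here; the real thing has hundreds of dynamic keys):

{
  "payload": {
    "params": {
      "001": "...",
      "002": "...",
      "100": "...",
      "foo": {
        "001": "...",
        "002": "..."
      }
    }
  }
}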
Now, I would like to either limit the depth to which the json processor parses the JSON, but from reading the docs that doesn't seem possible. Another option I was considering is merging all these fields back into a single text field (rough sketch of what I mean at the end of this post), but it seems Elasticsearch doesn't support a ruby processor?
So does that mean I need to run a Logstash cluster?
Are there any other options?
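To make the second idea concrete: the closest I've come up with inside the ingest pipeline is a script processor that stringifies the subtree before it gets mapped (using the field names from above). This is just a rough, untested sketch, and Map.toString() produces a Java-style string rather than proper JSON:

{
  "script": {
    "if": "ctx.app?.payload?.params != null",
    "source": "ctx.app.payload.params = ctx.app.payload.params.toString();"
  }
}

Would something along those lines be reasonable, or is there a better way?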