Hi guys,
I have my apps set to write JSON logs, and Filebeat ships them, enriched with host metadata, to an Elasticsearch cluster.
There they get processed by an ingest pipeline which basically runs the following json processor:
"processors": [
{
"json": {
"field": "message",
"target_field": "apps"
}
},
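For reference, this is how I've been reproducing it with the simulate API (the sample message below is made up by me, just to show the shape of the data):

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "json": {
          "field": "message",
          "target_field": "apps"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "{\"payload\":{\"params\":{\"001\":\"a\",\"foo\":{\"001\":\"b\"}}}}"
      }
    }
  ]
}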
Now, the problem I'm experiencing is that some of the fields in the JSON go 10+ levels deep and there are a huge number of them, which causes a mapping explosion very soon after index creation (within the first ~10k documents).
I know the exact names of the culprit fields. For example, app.payload.params
has a bunch of subfields and levels, which get expanded into something like:
app.payload.params.001
app.payload.params.002
...
app.payload.params.100
...
app.payload.params.foo.001
...
app.payload.params.foo.002
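In the raw log line, that part of the JSON looks roughly like this (keys and values trimmed down and made up here; the real thing has hundreds of dynamic keys):

{
  "payload": {
    "params": {
      "001": "...",
      "002": "...",
      "100": "...",
      "foo": {
        "001": "...",
        "002": "..."
      }
    }
  }
}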
Now, I would like to either limit the depth to which the json processor parses the JSON, but from reading the docs that doesn't seem possible. Another option I was considering is merging all these fields back into a single text field (rough sketch of what I mean at the end of this post), but it seems Elasticsearch doesn't support a ruby processor?
So does that mean I need to run a Logstash cluster?
Are there any other options?
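To make the second idea concrete: the closest I've come up with inside the ingest pipeline is a script processor that stringifies the subtree before it gets mapped (using the field names from above). This is just a rough, untested sketch, and Map.toString() produces a Java-style string rather than proper JSON:

{
  "script": {
    "if": "ctx.app?.payload?.params != null",
    "source": "ctx.app.payload.params = ctx.app.payload.params.toString();"
  }
}

Would something along those lines be reasonable, or is there a better way?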