Elastic - flattening multiple fields

Hi everybody,

In one of my Elastic indices, I have a problem with some documents.
I receive JSON from Filebeat and parse it with the json filter plugin.
But some documents contain a lot of fields.

Example: response.0.field1->field28, response.1.field1->field28, response.2.field1->field28, ...

This goes up to more than response.400.*

I increased the index.mapping.total_fields.limit setting to 5000, then to 8000, and now to 12000.

But I doubt anybody is actually using these fields.

So, to get rid of this and get back to a more normal situation, I am thinking of flattening all the response.0-999 fields.

Can I use a wildcard or regex as a field name to mark fields as flattened in a template?
Something like

"response.[0-9]*" : {
        "type": "flattened"
      }

Or must I be explicit and include all possible names in it?
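
Or would a dynamic template be a way to do it? Something like this is what I have in mind (an untested sketch; the template name response_entries is just a placeholder, and I am not sure path_match combines with the flattened type like this):

"dynamic_templates": [
  {
    "response_entries": {
      "path_match": "response.*",
      "match_mapping_type": "object",
      "mapping": {
        "type": "flattened"
      }
    }
  }
]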

Another option is to prune the response.NNN fields with NNN greater than 10, something I can do in Logstash after parsing the JSON, by blacklisting them.
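
In Logstash that would be something like this prune filter, I imagine (an untested sketch):

prune {
    blacklist_names => [ "response\.[1-9][0-9]+" ]
}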

Thank you for your attention and responses.

Hi everybody,

No answer after a week most probably means it is not possible to use a regex to flatten fields in a template.

I looked at the other solution, using the Logstash prune filter to blacklist/drop fields with names like response.[1-9][0-9]*, but according to this bug, prune fails to work on nested fields:

Any idea/suggestion for removing nested fields based on their names plus a regexp?

Can I use a wildcard or regex as a field name to mark fields as flattened in a template?

Could you define this mapping:

"response" : {
    "type": "flattened"
}

and use an ingest processor or client code to convert docs
from "response.22.field1": "foo"
to "response" : { "22.field1" : "foo" }
or "response" : { "22": {"field1" : "foo" }}

Hi Mark,

I have problems with some of the response's fields, not all of them.

I receive text and parse it using the json filter, creating a tmp_json object.
My JSON looks like this:

"tmp_json" : {
  "entity" : {
    "response" : {
      "status" : "value",
      "error" : "value2",
      "0" : { "field-0" : "value", ... }
      "1" : { "field-0" : "value", ... }
      "2" : { "field-0" : "value", ... }
      "3" : { "field-0" : "value", ... }
      ...
      "999" : { "field-0" : "value", ... }
    }
  }
}

This ends up in a mapping explosion.
I do not want to flatten the whole response field or re-parse it.
I would like to remove the nested fields with names like response.xx, where xx is greater than 9.

I tried this piece of Ruby; it runs through the JSON but does not remove the fields:

ruby {
    code => "
        begin
            keys = event.get('[tmp_json][entity][response]').to_hash.keys
            keys.each{|key|
                if ( key =~ /[1-9][0-9]/ )
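                    # note: 'key' alone names a top-level event field,
                    # not [tmp_json][entity][response][key]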
                    event.remove(key)
                end
            }
        end
        "
}

Okay.

I replaced this:

event.remove(key)

with this:

event.remove('[tmp_json][entity][response][' + key + ']')

and it works.

I'll rework the regexp to be more specific.
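
For the record, the reworked filter should look something like this (a sketch, assuming the keys I want to drop are purely numeric):

ruby {
    code => "
        begin
            keys = event.get('[tmp_json][entity][response]').to_hash.keys
            keys.each{|key|
                # whole-key match: digits only, and only indices above 9
                if ( key =~ /\A[0-9]+\z/ && key.to_i > 9 )
                    event.remove('[tmp_json][entity][response][' + key + ']')
                end
            }
        end
        "
}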

Ah ok. I thought from your first example that you had field names with dots in them.
You have proper objects, but they hold a mix of fields you want mapped and ones you don't at the same level.
Pretty sure that will require some custom code in your client or an ingest processor to tidy up.

Not a Ruby expert, I'm afraid. Probably a question for another forum.
