ECS Fields Not Mapping Correctly Automatically

scottdfedorov · March 18, 2020, 9:36pm

When I index documents into Elasticsearch, the fields which conform to ECS are not being mapped correctly and all are being made into "text" fields, with "keyword" sub-fields.

What steps are needed to utilize ECS? I was under the impression from the docs that simply using a field named like fields in ECS, without other explicit mapping, would result in the fields being mapped as defined in ECS automatically. No need to define the ECS fields in your index template mapping definition...

This isn't happening, even for fields which are coming directly from Beats without any modifications, like the agent.* fields.

Here's some more detailed info:

We have a fairly typical Beats > Logstash > Elastic use case.
Filebeat is grabbing files from a directory. It doesn't do much except add a field with an environment variable. Output from Filebeat goes to Logstash.

In Logstash, we grok the message to get some custom fields, and then mutate some fields coming from Beats to match ECS and leave others untouched.

Here's a selection of the filter stage from our Logstash pipeline.

    filter {
      grok {
        match => { "message" => "[grok pattern here]" }
      }
      date {
        match => ["timestamp", "ISO8601"]
        remove_field => ["timestamp"]
      }
      mutate {
        add_field => {
          "[log][original]" => "%{[message]}"
          "[file][path]" => "%{[log][file][path]}"
        }
        remove_field => ["[log][file]"]
      }
    }

Here is an example document which is being exported by Logstash:

    {
    "utility": {"name": "name"},
    "file": {"path": "somepath"},
    "ecs": {"version": "1.4.0"},
    "call_in": "1F 09",
    "meter": {"irn": "12345"},
    "input": {"type": "log"},
    "host": {"name": "HOSTNAME"},
    "@timestamp": "2020-03-14T17:52:34.045Z",
    "agent": {
    	"id": "4b9f6ace-b9d0-453e-9939-95e02ec13e74",
    	"ephemeral_id": "94e9da7b-3983-492d-aecd-5f60468044fb",
    	"type": "filebeat",
    	"hostname": "HOSTNAME",
    	"version": "7.6.1"
    },
    "log": {
    	"offset": 0,
    	"flags": ["multiline"],
    	"original": "original message here"
    },
    "message": "original message here",
    "tags": ["call-in"],
    "@version": "1"
    }

This is being fed into Elasticsearch.

    output {
      elasticsearch {
        hosts => ["IPADDRESSHERE:9200"]
        ilm_rollover_alias => "ls-call-in-alias"
        ilm_pattern => "{now/d}-000001"
        ilm_policy => "ilm-logs"
        template_name => "template-ls-call-in"
      }
    }

In Elastic, we have an index template with mappings defined for our custom fields. None of the ECS fields are included. Here is the entire index template (named template-ls-call-in as shown above):

    {
      "version": 0,
      "index_patterns": ["ls-call-in-*"], 
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas":0,
        "index.lifecycle.name": "ilm-logs",
        "index.lifecycle.rollover_alias": "ls-call-in-alias"
      },
      "mappings": {
        "properties": {
          "call-in":{"type": "keyword"},
          "meter": {
            "properties": {
              "name":{"type": "keyword"},
              "irn":{"type": "keyword"},
              "type":{"type": "keyword"}
            }
          },
          "utility": {
            "properties": {
              "name":{"type": "keyword"}
    }}}}}

We have no other processing/pipelines being performed in elastic. We do all pipeline/processing via Logstash.

Our assumption was that these mappings would be applied as shown, and the ECS fields would be mapped as defined in the docs. However, after indexing the document into Elasticsearch, the mapping for all ECS fields is text with a keyword sub-field.

Here is a partial mapping, showing the agent.* fields, which were left entirely untouched by our pipeline in Logstash:

    {
      "mapping": {
        "_doc": {
          "properties": {
            "@timestamp": {
              "type": "date"
            },
            "@version": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "agent": {
              "properties": {
                "ephemeral_id": {
                  "type": "text",
                  "fields": {
                    "keyword": {
                      "type": "keyword",
                      "ignore_above": 256
                    }
                  }
                },
                "hostname": {
                  "type": "text",
                  "fields": {
                    "keyword": {
                      "type": "keyword",
                      "ignore_above": 256
                    }
                  }
                },
                "id": {
                  "type": "text",
                  "fields": {
                    "keyword": {
                      "type": "keyword",
                      "ignore_above": 256
                    }
                  }
                },
                "type": {
                  "type": "text",
                  "fields": {
                    "keyword": {
                      "type": "keyword",
                      "ignore_above": 256
                    }
                  }
                },

Any ideas why the fields aren't using ECS mapping? Did we miss a configuration step somewhere? Does ECS not work with custom defined fields (contrary to what the docs say)?

This is a brand new install of 7.6.1 on Windows. Single node "cluster" for testing/dev purposes. (We're doing a rip and replace upgrade from 2.3.2.)

Let me know if I can provide any additional clarity on the issue. This is the first time we're using ECS. I can easily just define the fields manually in our mapping, but seems like that defeats the purpose?

Thanks all!

scottdfedorov · March 18, 2020, 10:59pm

Questions above still apply, but I suddenly remembered you can have multiple index templates match, so I'm guessing I could just take the mappings from the ECS and put in a template that matches all indexes and use that in addition to the specific index template.

If that's the way we're supposed to be doing this, I failed to see it included in the documentation anywhere. If anyone knows where it would be, let me know.

webmat · March 19, 2020, 5:38pm

Hi @scottdfedorov,

Elasticsearch will not auto-detect ECS fields based on their name. Proper template(s) or mapping should be in place at all times, to ensure the correct data types are used.
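To see which fields fell back to dynamic mapping, one option is to walk the index mapping and list every field that ended up as text with a keyword sub-field. The following is a minimal Python sketch, assuming you pass it the "properties" object from the index mapping; `find_dynamic_text_fields` and the sample data are hypothetical and not part of any Elasticsearch client.

```python
# Sketch: list fields that look like they were mapped dynamically
# (type "text" with a "keyword" sub-field). Helper name and sample data
# are illustrative assumptions, not an Elasticsearch API.

def find_dynamic_text_fields(properties, prefix=""):
    """Return sorted dotted names of fields mapped as text with a keyword sub-field."""
    found = []
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if "properties" in spec:
            # Object field: recurse into its children.
            found.extend(find_dynamic_text_fields(spec["properties"], path + "."))
        elif spec.get("type") == "text" and "keyword" in spec.get("fields", {}):
            found.append(path)
    return sorted(found)


# The shape Elasticsearch gives a string field when no template covers it.
DYNAMIC_STRING = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}

sample_properties = {
    "@timestamp": {"type": "date"},
    "agent": {"properties": {"id": DYNAMIC_STRING, "type": DYNAMIC_STRING}},
    "meter": {"properties": {"irn": {"type": "keyword"}}},
}

print(find_dynamic_text_fields(sample_properties))
```

Any ECS field name that shows up in this list is a field that no template covered when the index was created.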

"I was under the impression from the docs that simply using a field named like fields in ECS, without other explicit mapping, would result in the fields being mapped as defined in ECS automatically."

If you're able to point me to the place that gave you this impression in the ECS docs, I would like to adjust this and clarify that section.

Now the proper way to send Filebeat events via Logstash depends on what you're doing.

1- If you're using Filebeat modules, you should follow the Filebeat documentation for this. The Filebeat modules come with extensive and precise templates that must be used, to ensure the correct data types are in place for all fields, and this is true even if you're sending events via Logstash first.

2- If you're using Filebeat to tail custom logs and you're doing all of the parsing yourself, you should still (IMO) send this to a Filebeat index with the proper templates in place, as many metadata fields (e.g. agent.*, host.*) added by the metadata processors need the proper template for these fields.

As you've noted, the multiple template support is a very good way to specify how your custom fields should be mapped, in addition to the template provided by Filebeat. I think it's the simplest approach.

To loop back on point 2 above, if you're doing custom logs only, and you really don't want to use the Filebeat template & index, you may try your hand starting from the sample ECS template we provide here: https://github.com/elastic/ecs/tree/master/generated/elasticsearch. This will lead to a lot of trial and error and maintenance of that custom template, however, so this should be considered as a last resort.

webmat · March 19, 2020, 5:42pm

Please also note that while this sample template gives you all the correct field definitions, the settings are not optimized for production; they're optimized for experimentation. So you'll want to adjust the template settings to your needs.
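One way to act on this is to keep only the mappings from the sample template and supply your own settings elsewhere. Here is a rough Python sketch, assuming the sample has been loaded into a dict with top-level "mappings" and "settings" keys. The `ecs_mappings_template` helper is hypothetical, and the `sample` dict is a tiny stand-in with made-up values, not the real file contents.

```python
import copy
import json


def ecs_mappings_template(ecs_template, order=-2, index_patterns=("*",)):
    """Build a legacy (7.x _template) body carrying only the ECS mappings."""
    return {
        "order": order,
        "index_patterns": list(index_patterns),
        # Settings from the sample are intentionally dropped here;
        # define production settings in your own templates.
        "mappings": copy.deepcopy(ecs_template["mappings"]),
    }


# Illustrative stand-in for the downloaded sample template (assumed shape).
sample = {
    "index_patterns": ["try-ecs-*"],
    "order": 1,
    "settings": {"index": {"refresh_interval": "5s"}},
    "mappings": {
        "properties": {
            "agent": {"properties": {"id": {"type": "keyword", "ignore_above": 1024}}}
        }
    },
}

body = ecs_mappings_template(sample)
print(json.dumps(body, indent=2))
```

The resulting body has no "settings" key at all, so nothing tuned for experimentation leaks into your indices.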

scottdfedorov · March 19, 2020, 7:37pm

Thanks Mat, appreciate the speedy reply.

It wasn't so much an explicit mention that it was automatic, but rather the overall lack of info on implementing ECS.

In the Using ECS section, for example, there are no explicit instructions that say you need to define the mapping according to the spec. A simple section including that detail would've saved me a lot of time.

Additionally, when you look at the field reference, it does mention the types of the fields and gives some information, but when you compare it against the template, it doesn't include additional information like "ignore_above" being set. Perhaps that's not part of the schema, but is just part of the example template, but that's still worth including in the field reference.

Speaking of the template, it was not linked anywhere in the docs. A link to that would've saved me a LOT of time as well. I didn't find that until MUCH later. Linking that page as part of a section on implementing ECS should be included, or at least include the same info.

There are also just little things, such as the statement that "Field names must be lower case" on the Guidelines and Best Practices page, which threw me into thinking that ECS was already a built-in part of Elastic. The "must" implies it works if you conform, but not if you don't. In fact, if you implement everything manually, you can do whatever you like and still conform to the spec. Make it all caps instead of all lower case. So long as it's uniform it works, right?

Anyway. I appreciate the time. I'm happy to put in a PR to add some info from a user perspective for Elastic to review to improve the docs. Pretty easy to forget how outside users see things differently.

Everything is working for us well now. Thanks for the help. Here's how we decided to implement. I like it and think it would help other users with adoption of ECS as well.

We have just the mappings from the ECS schema in a match-all template with an order of -2.
We also have a match-all template with all our custom fields and typical index settings, with an order of -1. If anything conflicts here, it'll override the ECS schema.
Then we have typical index templates which don't need an order number, but further define any special fields not part of our company common schema, plus settings for individual indexes.
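The layering above can be modeled in a few lines of Python. This is a simplified sketch of how legacy templates combine, assuming templates are applied from lowest to highest order, a missing order counts as 0, and later mappings override earlier ones on conflict. `deep_merge`, `effective_mappings`, and the three sample templates are hypothetical illustrations, not our real templates.

```python
import copy


def deep_merge(base, override):
    """Recursively merge override into a copy of base; override wins on conflict."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def effective_mappings(templates):
    """Apply template mappings from lowest to highest order (missing order = 0)."""
    result = {}
    for template in sorted(templates, key=lambda t: t.get("order", 0)):
        result = deep_merge(result, template.get("mappings", {}))
    return result


ecs_layer = {  # match-all template holding the ECS mappings
    "order": -2,
    "mappings": {"properties": {"host": {"properties": {
        "name": {"type": "keyword", "ignore_above": 1024}}}}},
}
company_layer = {  # match-all template holding company-wide custom fields
    "order": -1,
    "mappings": {"properties": {
        "host": {"properties": {"name": {"type": "keyword", "ignore_above": 256}}},
        "meter": {"properties": {"irn": {"type": "keyword"}}},
    }},
}
index_layer = {  # per-index template, no explicit order
    "mappings": {"properties": {"call_in": {"type": "keyword"}}},
}

merged = effective_mappings([index_layer, ecs_layer, company_layer])
print(merged["properties"]["host"]["properties"]["name"])
```

In this model the company layer's `ignore_above` of 256 replaces the ECS layer's value for `host.name`, while fields defined in only one layer pass through untouched.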

Pretty versatile. Thanks again.

webmat · March 19, 2020, 7:50pm

Yep, another thing on our mind is to improve the "getting started" experience by considering the different personas learning about ECS:

end users just looking for a field definition of what they're seeing in Kibana
implementers like you, working on pipelines (custom or not)
developers building products that are meant to adopt ECS out of the box (Elasticians, partners, open source developers)

But I hear you. We have a lot of ideas on how to improve the docs. Feel free to drop additional ideas (or potential solutions) in this public brainstorm doc: https://docs.google.com/document/d/1zIpQG8Flq1pVpV4TOGYolGfqP2lTGeVrxCbZkZsfmeo

Of course, GitHub issues and PRs are welcome as well.

system · April 16, 2020, 7:50pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
