Custom filebeat template for JSON log lines

Greetings,

I'm trying to use filebeat to ingest a log file full of JSON objects. I've gotten it to work and it will ingest the data and I can discover the data in Kibana almost correctly. My problem is that the data that's shown is not of the right data types.

For example, some of the JSON elements are IP addresses. When filebeat parses the JSON records using - type: log with json.keys_under_root: true (in filebeat.yml), they are indexed as strings.

I have a JSON template file that I used to create a template via the Elasticsearch API (PUT /_template/elk-dev), but filebeat seems to ignore it.

Does anyone have any insight into this?

Thank you in advance!

Edit: more info
Relevant bits of my filebeat.yml:

filebeat.prospectors:

- type: log
  paths:
    - /var/log/elk-dev/*
  json.keys_under_root: false
  json.add_error_key: true
  document_type: json
  fields:
    type: custom_json
    codec: json

Relevant bits of my fields.yml:

- key: log
  title: Log file content
  description: >
    Contains log file lines.
  fields:
    - name: log.source
      type: keyword
      required: true
      description: >
        The file from which the line was read. This field contains the absolute path to the file.
        For example: `/var/log/system.log`.

    - name: log.offset
      type: long
      required: false
      description: >
        The file offset the reported line starts at.

    - name: log.message
      type: text
      ignore_above: 0
      required: true
      description: >
        The content of the line read from the log file.
      fields:
        - name: destination-ip
          type: ip
          required: false
          description: >
            IP address
    - name: destination-ip
      type: ip
      required: false
      description: >
           IP address

(The destination-ip field appears at two levels because I didn't know which one would work, so I tested both at the same time. I would have deleted one or the other had it worked, but it did not.)

Here's a log line example:

{"event-second": 1321132924, "event-microsecond": 383996, "signature-id": 1, "priority": 1, "sport-itype": 52378, "dport-icode": 80, "protocol": 6, "vlan-id": 0, "source-ip": "10.0.1.6", "destination-ip": "8.8.8.8", "length": 382}
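
A note on why the addresses show up as strings: JSON has no IP type, so "source-ip" and "destination-ip" arrive as ordinary strings, and only the index mapping can tell Elasticsearch to treat them as ip. As a rough sketch (plain Python, not part of Filebeat; the helper name ip_like_fields is made up for illustration), this lists which string fields in the sample line would be candidates for an ip mapping:

```python
import ipaddress
import json

# Sample log line copied from the post above.
LINE = (
    '{"event-second": 1321132924, "event-microsecond": 383996, '
    '"signature-id": 1, "priority": 1, "sport-itype": 52378, '
    '"dport-icode": 80, "protocol": 6, "vlan-id": 0, '
    '"source-ip": "10.0.1.6", "destination-ip": "8.8.8.8", "length": 382}'
)


def ip_like_fields(raw_line):
    """Return the sorted names of string fields whose values parse as IP addresses."""
    event = json.loads(raw_line)
    found = []
    for name, value in event.items():
        if not isinstance(value, str):
            continue  # numbers such as "protocol" are never candidates
        try:
            ipaddress.ip_address(value)
        except ValueError:
            continue  # a string, but not an IPv4/IPv6 address
        found.append(name)
    return sorted(found)


print(ip_like_fields(LINE))  # ['destination-ip', 'source-ip']
```

Anything this reports still has to be declared as type ip in the template; the check only identifies candidates.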

Filebeat uses fields.yml (fields definition in YAML used to build docs, kibana index mapping and Elasticsearch template mappings).

With 6.0, templates are versioned. You can find some more information on template loading here and instructions for generating a JSON template + manual installation here.

I see. That's what my research had led me to believe.

To clarify: that would mean having to convert my template.json to
fields.yml, correct? (It isn't the only way but it's the most direct, yes?)

Is there any official documentation on how to write my own fields.yml? I
can sort of gather how to do it based on the included fields.yml but I'd
like to be sure about how it interacts with filebeat.yml.

Thank you for your response!

You can still use a json template. Using filebeat export template --es.version ..., you can extract the template filebeat will generate. But then you will have to disable template loading from filebeat and install the json based template using curl.

I can't find fields.yml being documented. Related github ticket.

It's funny: I did try to export the template using filebeat export template > filebeat.template.json. I changed the appropriate field types then I imported it to a new index and loaded a copy of my JSON log data. The data was still being written as a "text" type instead of an "ip" type. I wasn't sure why.

In the end, I found a way to make it work. Here is my procedure in case someone else runs into this issue before there is any official documentation on the subject:

  1. For filebeat.yml, set up the JSON fields but do not put the JSON event under root:
filebeat.prospectors:

- type: log

  enabled: true

  paths:
    - /var/log/elk-dev/*
  json.add_error_key: true

{...}

setup.template.name: "elk-dev"
setup.template.pattern: "elk-dev-*"

output.elasticsearch:
  index: "elk-dev"

  2. Create a standard JSON template. You can base it on the output of filebeat export template. Here's an example of what it would look like:
{  
   "index_patterns":[  
      "elk-dev*"
   ],
   "mappings":{  
      "doc":{  
         "properties":{  
            "@timestamp":{ ... },
            "beat":{ ... },
            # BELOW IS WHERE YOU WILL DEFINE YOUR JSON FIELDS AND TYPES
            "json":{
               "properties":{
                  "source-ip":{
                     "type":"ip",
                     "fields":{
                        "keyword":{
                           "type":"keyword",
                           "ignore_above":256
                        }
                     }
                  },
                  ...
                  ...etc...
                  ...
               }
            },
            "error":{ ... },
            "prospector":{ ... }
         }
      }
   }
}
  3. Use curl to install this JSON template in Elasticsearch:
    curl -H "Content-Type: application/json" -XPUT "http://$ELASTICSEARCH_IP:9200/_template/elk-dev?pretty" -d @filebeat.template.json

  4. Make sure that your filebeat.yml is outputting to Elasticsearch and that you've set the appropriate index (in this example, it's "elk-dev", as shown in the last line of step 1).

  5. You can now ship data to this new index by running sudo service filebeat restart.

  6. In my scenario, I had a timestamp in my JSON data which is different from the @timestamp that filebeat adds automatically. When creating my index, I chose to use my own timestamp instead of filebeat's. I'm not sure whether this is required to make the types work.
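
Since the template example in step 2 contains elisions and a comment, it cannot be pasted as-is. One way to get a syntactically valid starting point is to generate the JSON programmatically. This is only a sketch under assumptions: build_template is a made-up helper, it emits only the index_patterns and the json.* ip fields (none of the @timestamp, beat, error or prospector sections that filebeat export template produces), and it uses the "doc" mapping type from the example above.

```python
import json


def build_template(index_pattern, ip_fields):
    """Build a minimal template dict that maps the given json.* fields as ip."""
    properties = {name: {"type": "ip"} for name in ip_fields}
    return {
        "index_patterns": [index_pattern],
        "mappings": {
            # "doc" is the mapping type used in the example template above.
            "doc": {
                "properties": {
                    # Fields live under "json" because keys_under_root is off.
                    "json": {"properties": properties}
                }
            }
        },
    }


template = build_template("elk-dev*", ["source-ip", "destination-ip"])
# Print valid JSON; redirect to a file to use it with the curl command in step 3.
print(json.dumps(template, indent=2))
```

In practice you would merge these properties into the exported filebeat template rather than replace it.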


For me, the methods and suggestions I've found online did not work or I could not get them to work.
I could not find a way to modify fields.yml in a way that preserved custom field types although this seems like the most direct way of doing this.
Until then, this is my janky fix.

A template is not the actual mapping the index uses. A template defines a mapping template. When a new index is created, matching templates are looked up and applied to generate the mapping of the new index. Changing a template does not update the mapping of existing indices. With dynamic mapping, one can still add fields to an index that are not specified in the original template. The new fields and their types exist only in the current index.

That is, when modifying fields.yml or the original template, you will need to delete the index or make sure you create a new index on changes (e.g. by adding a 'template' version to the index name).
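
To make the point concrete, here is a toy model in plain Python (not Elasticsearch code; ToyCluster and its methods are invented for illustration, and template ordering and dynamic mapping are ignored). It only shows that a template is consulted when an index is created, not afterwards:

```python
from fnmatch import fnmatchcase


class ToyCluster:
    """Toy stand-in for a cluster: templates apply only at index creation."""

    def __init__(self):
        self.templates = {}  # template name -> (index pattern, field types)
        self.indices = {}    # index name -> field types copied at creation

    def put_template(self, name, pattern, mapping):
        self.templates[name] = (pattern, dict(mapping))

    def create_index(self, index):
        mapping = {}
        for pattern, template_mapping in self.templates.values():
            if fnmatchcase(index, pattern):
                mapping.update(template_mapping)
        self.indices[index] = mapping

    def delete_index(self, index):
        del self.indices[index]


cluster = ToyCluster()
cluster.put_template("elk-dev", "elk-dev*", {"source-ip": "text"})
cluster.create_index("elk-dev")

# Fixing the template afterwards does not touch the existing index.
cluster.put_template("elk-dev", "elk-dev*", {"source-ip": "ip"})
print(cluster.indices["elk-dev"])  # {'source-ip': 'text'}

# Only a newly created index picks up the corrected template.
cluster.delete_index("elk-dev")
cluster.create_index("elk-dev")
print(cluster.indices["elk-dev"])  # {'source-ip': 'ip'}
```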

Thank you Vincent! This was exactly what I needed to get started sending structured JSON logs directly to elasticsearch with filebeat. One side effect of the above procedure is that fields will be referenced as json.field in Kibana (e.g. json.source-ip following the example in your answer).

I took this a little further and figured out how to get the fields to display natively. To do this, set json.keys_under_root: true in filebeat.yml, and in your mapping leave out the json section; instead, put the fields directly under the "properties" section where @timestamp and beat are shown. Updated, your example mapping would look like this:

"mappings":{  
      "doc":{  
         "properties":{  
            "@timestamp":{ ... },
            "beat":{ ... },
            # NOW DEFINE YOUR JSON FIELDS AND TYPES ANYWHERE IN THE TOP PROPERTIES LIST
            "source-ip":{
               "type":"ip",
               "fields":{
                  "keyword":{
                     "type":"keyword",
                     "ignore_above":256
                  }
               }
            },
            ...
            ...etc...
            ...
            "error":{ ... },
            "prospector":{ ... }
         }
      }
   }
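
For anyone unsure why the mapping has to move, here is a simplified model of where the decoded keys end up in each mode. This is an illustrative sketch, not Filebeat's implementation: place_decoded_keys and the metadata values are made up, and key conflicts are not handled specially (decoded keys simply overwrite metadata keys here, which may differ from what Filebeat does).

```python
def place_decoded_keys(decoded, metadata, keys_under_root):
    """Return the event as it would be indexed under this simplified model."""
    event = dict(metadata)
    if keys_under_root:
        # Fields sit at the top level, so the mapping needs "source-ip".
        event.update(decoded)
    else:
        # Fields are nested, so the mapping needs "json" -> "source-ip".
        event["json"] = dict(decoded)
    return event


decoded = {"source-ip": "10.0.1.6", "destination-ip": "8.8.8.8"}
metadata = {"source": "/var/log/elk-dev/example.log"}  # hypothetical path

print(place_decoded_keys(decoded, metadata, keys_under_root=False))
print(place_decoded_keys(decoded, metadata, keys_under_root=True))
```

The first call nests the addresses under "json" (hence json.source-ip in Kibana); the second puts them next to the metadata fields.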

FWIW, I also couldn't get the curl command to work for me. Instead, I saved the template, went to the Kibana console (click Dev Tools), did a PUT elkdev, and then pasted the whole template after a newline (the template included the original index_patterns setting from your answer and other index settings from the output of filebeat export template > filebeat.template.json). I'm not sure this would have worked if I hadn't deleted and recreated the index, given what Steffen noted, so do some research if you want to update in other ways (I saw something about reindexing while looking into this).

Thanks again, hope this helps you or someone else!
