Configure index template with custom fields

Hello Team,

I am trying to configure an index template using custom fields with Filebeat 8.2.0. I have tried several combinations and read through various topics on the forum and in the docs, but I couldn't make it work.

The error I am getting is: "Connection marked as failed because the onConnect callback failed: error loading template: error creating template instance: key not found". One weird thing is that I get this even with fields that are available by default, for example azure.partition_id and azure.consumer_group.

Below is the configuration I am trying to use:

filebeat.inputs:
- type: filestream
  id: agent-logs
  paths:
  - /var/lib/docker/overlay2/*/diff/app/logs-*
  processors:
  - include_fields:
      fields: ["azure.consumer_group","azure.partition_id"]
  parsers:
  - multiline:
      type: pattern
      pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
      negate: false
      match: after
  fields:
    type: "MKS-tasklist-report"
    azure.partition_id: tasklist-report
    azure.consumer_group: MKS
- type: filestream
  id: agent-MKS-cpu-logs
  paths:
  - /var/lib/docker/overlay2/*/diff/app/logs-cpu-*
  processors:
  - include_fields:
      fields: ["azure.consumer_group","azure.partition_id"]
  parsers:
  - multiline:
      type: pattern
      pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
      negate: false
      match: after
  fields:
    type: "MKS-tasklist-cpu"
    azure.partition_id: tasklist-cpu
    azure.consumer_group: MKS
processors:
  - include_fields:
      fields: ["azure.consumer_group","azure.partition_id"]
output.elasticsearch:
  protocol: http
  hosts: ["http://172.20.118.191:9200"]
  ssl.enabled: "false"
  ssl.verification_mode: "none"
  index: agent-log-template
filebeat.config.inputs:
  enabled: true
  path: etc/*.yml
  reload.enabled: true
  reload.period: 30s
setup:
  template:
    enabled: true
    name: agent-log-template
    pattern: "agent-%{[fields.type]}" # Both these patterns has same issue
    # pattern: "agent-%{[azure.partition_id]}-%{[azure.consumer_group]}"
    ilm.enabled: false
    dashboards.index: true
    append_fields:
    - name: azure.consumer_group
      type: keyword
    - name: azure.partition_id
      type: keyword

Thank you.

This is not correct... you are putting keys under setup.template that should not be there...

Here is my fully functional version

filebeat.inputs:
- type: log
  enabled: true
  paths:
   -  /Users/sbrown/workspace/sample-data/discuss/container-mixed/*.log
  # json.keys_under_root: true
  # json.add_error_key: true
  # json.message_key: log

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

name: "apigee-dev-beat"
tags: ["apigee-dev-beat"]

# Important if you want the dashboards to work
setup.dashboards.enabled: false ## <- Set this to true once, then take it out or set it back to false.
setup.dashboards.index: "api-log-*"

# Use ILM Policy
# setup.ilm.enabled: true
# setup.ilm.rollover_alias: "api-log"
# setup.ilm.pattern: "{now/d}-000001"

setup.template.settings:
  index.number_of_shards: 1
  index.number_of_replicas: 0
  index.codec: best_compression

setup.template.enabled: true  
setup.template.name: "api-log-%{[agent.version]}" 
setup.template.pattern: "api-log-%{[agent.version]}" 
setup.template.overwrite: false 
setup.ilm.policy_name: api-policy 
setup.template.append_fields:
  - name: apitimestamp
    type: date
# #    format: "strict_date_optional_time||epoch_millis||yyyy-MM-dd'T'HH:mm:ss.SSS||yyyy-MM-dd'T'HH:mm:ssZZ"

output.elasticsearch:
  hosts: ["localhost:9200"]
  index: "api-log-%{[agent.version]}"   
  # ssl.verification_mode: none
  # username: [username]
  # password: [password]
  # pipeline: "geoip"


processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~

Hi @NicholasBohner Are you a bot? :slight_smile:

cc @warkolm

I have noticed a number of "bot-like" responses from you today... in several threads.

If you are not a bot... apologies...

This entire forum is about the Elastic Stack,

and these configurations point to Elasticsearch, which is what most people on this forum are interested in.

output.elasticsearch:
  hosts: ["localhost:9200"]
  index: "api-log-%{[agent.version]}" 

@NicholasBohner Welcome to the Community!

Ok, awesome... I just saw that you posted basically the same reply in several topics... we sometimes see that from bots!

This community is about the Elastic Stack: one of the world's most popular search technologies, which powers

  • Search (as you might think of it): Elasticsearch powers many of the web's most popular searches
  • Observability (logs, metrics, APM) -> the ELK Stack (Elasticsearch + Logstash + Kibana) is probably the de facto open source logging solution and has been downloaded more than 1 billion times
  • Security (cyber security): SIEM and endpoint security at scale

We have a hugely popular community, the Basic/Free version of the software is incredibly powerful and popular, and we have tons of open content on lots of topics!

I would take a look at our site... or just spin up a cloud trial and ask some specific questions!

Looking forward to hearing from you!

Hi Stephen,

Aren't these two the same in YAML?

setup.template:
  append_fields:
  - name: apitimestamp
    type: date

and

setup.template.append_fields:
  - name: apitimestamp
    type: date

Also, in your example the field apitimestamp is not used in the template pattern. Could it be used like this: api-log-%{[fields.apitimestamp]}-%{[agent.version]}?

Thank you.

It just adds fields to the template... That's all.

setup.template.append_fields
A list of fields to be added to the template and Kibana index pattern. This setting adds new fields. It does not overwrite or change existing fields.

This setting is useful when your data contains fields that Filebeat doesn’t know about in advance.

Those settings do not belong there.

There is a limited number of settings allowed under setup.template...

The dashboard and ILM settings are under their own setup sections.
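
For example, taking the snippet you posted, those keys would move out to their own top-level setup sections, roughly like this (an untested sketch based on your config; the template name and pattern are kept static here):

setup.template.enabled: true
setup.template.name: agent-log-template
setup.template.pattern: "agent-log-template"
setup.template.append_fields:
- name: azure.consumer_group
  type: keyword
- name: azure.partition_id
  type: keyword
# ILM and dashboard options are separate setup sections, not children of setup.template
setup.ilm.enabled: false
setup.dashboards.enabled: false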

Hi Stephen,

Okay, I have corrected the ILM and dashboard settings, and this is the latest configuration I am trying. My main goal is to use custom fields in the template and have Elasticsearch create the indices automatically based on the field values, similar to the default filebeat-%{[agent.version]}. Is that possible, or do I have to do additional steps on the Elasticsearch side as well?

filebeat.inputs:
- type: filestream
  id: agent-cls-logs
  paths:
  - /var/lib/docker/overlay2/*/diff/app/agent-logs-*
  processors:
  - include_fields:
      fields: ["hrc.agent","hrc.product"]
  fields_under_root: true
  parsers:
  - multiline:
      type: pattern
      pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
      negate: false
      match: after
  fields:
    type: "MKS-worklist-report"
    hrc:
      agent: worklist-report
      product: MKS
processors:
  - include_fields:
      fields: ["hrc.agent","hrc.product"]
output.elasticsearch:
  protocol: http
  hosts: ["http://172.20.118.191:9200"]
  ssl.enabled: "false"
  ssl.verification_mode: "none"
  index: "agent-%{[hrc.product]}-%{[hrc.agent]}"
filebeat.config.inputs:
  enabled: true
  path: etc/*.yml
  reload.enabled: true
  reload.period: 30s
setup.template.enabled: true
setup.template.name: "agent-%{[hrc.product]}-%{[hrc.agent]}"
setup.template.pattern: "agent-%{[hrc.product]}-%{[hrc.agent]}"
setup.template.append_fields:
  - name: hrc.agent
    type: keyword
  - name: hrc.product
    type: keyword
setup.ilm.enabled: false
setup.dashboards.index: true
setup.dashboards.enabled: true

Thank you for helping out.

Hi Stephen,

Somehow I got it working, but I don't understand what is happening. Maybe I need to get a deeper understanding of Filebeat and Elasticsearch.

Changes I have made:

  • Use the custom fields only in the output.elasticsearch.index
  • setup.template.name: "agent-logs"
  • setup.template.pattern: "agent-logs-*"

With these changes, a hidden backing index named .ds-agent-logs-xxxx is created along with the data stream agent-logs-xxxx. I am unsure whether this is a good or bad thing, but I will stop here and spend some time understanding these concepts.

Thank you for helping out.

That is exactly how it should work. I suggest you read about data streams if you want to understand this better.

This is not going to work. Adding fields to the template just adds their mappings to the template so those mappings are available when the log documents are actually ingested; it has nothing to do with the template name or the index name.

And BTW I hope you are running
filebeat setup -e

Setting up the template happens at setup time; there are no actual fields from your data available at that time...

[agent.version] works because it is part of the Filebeat metadata that IS available at setup time... fields from the actual logs are NOT available at setup time, so the template name is effectively static, apart from a few very specific fields that are available at setup time.
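
In other words (a rough sketch contrasting the two; agent.version is Beat metadata, hrc.* only exists in the event data):

# Resolvable at setup time -- agent.version is known to the Beat itself:
setup.template.name: "agent-%{[agent.version]}"
setup.template.pattern: "agent-%{[agent.version]}"

# NOT resolvable at setup time -- hrc.* only exists once events are read:
# setup.template.name: "agent-%{[hrc.product]}-%{[hrc.agent]}"
# setup.template.pattern: "agent-%{[hrc.product]}-%{[hrc.agent]}"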

Index names CAN be driven by fields in the data, but then you are going to break things / not be writing to the data stream that you created at setup time.

Think of the data stream name as static, not dynamic.

Say you wanted to do this? That will not work... those fields are not available at setup time:

setup.template.name: "agent-%{[hrc.product]}-%{[hrc.agent]}"
setup.template.pattern: "agent-%{[hrc.product]}-%{[hrc.agent]}"

So say you had 2 products and 2 agents: you would have to set up 4 data streams manually, outside of the filebeat.yml, knowing those values, in order to use the index setting below dynamically. You would set them up through the Kibana UI (or you could set them up by fixing the filebeat.yml and running setup 4 times, then commenting out all the stuff you created) and then set some other settings so Filebeat skips the template setup...
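
Something roughly like this (a sketch, assuming the four templates/data streams already exist in Elasticsearch, so Filebeat does not try to manage them itself):

setup.template.enabled: false
setup.ilm.enabled: false
output.elasticsearch:
  hosts: ["http://172.20.118.191:9200"]
  index: "agent-%{[hrc.product]}-%{[hrc.agent]}"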

But I am confused: these are all static values, so you could just use static names; I am not sure what you are trying to accomplish.

  fields_under_root: true
  fields:
    type: "mks-worklist-report"
    hrc:
      agent: worklist-report
      product: mks

Then you could use this IF these fields are available at runtime... which it looks like you are trying to do with the include_fields processor... but they are static, so:

index: "agent-%{[hrc.product]}-%{[hrc.agent]}"

BTW, index / data stream names can only be lowercase...

Here is another version... with just your static fields

# https://discuss.elastic.co/t/filebeat-8-3/314630/8?u=stephenb
filebeat.inputs:
- type: log
  enabled: true
  paths:
   -  /Users/sbrown/workspace/sample-data/discuss/container-mixed/sample.log
  # json.keys_under_root: true
  # json.add_error_key: true
  # json.message_key: log

  fields_under_root: true
  fields:
    type: "mks-worklist-report"
    hrc:
      agent: worklist-report
      product: mks

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

name: "apigee-dev-beat"
tags: ["apigee-dev-beat"]

# Important if you want the dashboards to work
setup.dashboards.enabled: false ## <- Set this to true once, then take it out or set it back to false.
setup.dashboards.index: "api-log-*"

# Use ILM Policy
# setup.ilm.enabled: true
# setup.ilm.rollover_alias: "api-log"
# setup.ilm.pattern: "{now/d}-000001"

setup.template.settings:
  index.number_of_shards: 1
  index.number_of_replicas: 0
  index.codec: best_compression

setup.template.enabled: true  
setup.template.name: "api-log-mks-worklist-report" 
setup.template.pattern: "api-log-mks-worklist-report" 
setup.template.overwrite: false 
setup.ilm.policy_name: api-policy 
setup.template.append_fields:
  - name: apitimestamp
    type: date
# #    format: "strict_date_optional_time||epoch_millis||yyyy-MM-dd'T'HH:mm:ss.SSS||yyyy-MM-dd'T'HH:mm:ssZZ"

output.elasticsearch:
  hosts: ["localhost:9200"]
  index: "api-log-mks-worklist-report"   
  # ssl.verification_mode: none
  # username: [username]
  # password: [password]
  # pipeline: "geoip"


processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - include_fields:
      fields: ["hrc.agent","hrc.product"]

So if you give us a very explicit description of what you want to do, we might be able to help you move forward. It seems like you want to route the data by product and agent... that makes some sense, BUT what many people do is just send everything to a common data stream and filter on the query side.

There are valid reasons ... but when you are just learning you might be creating more pain than gain :slight_smile:
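
For comparison, the "common data stream" approach is basically just your input with the static fields and no custom index/template/ILM settings at all (a rough sketch reusing the paths and fields from your config):

filebeat.inputs:
- type: filestream
  id: agent-cls-logs
  paths:
  - /var/lib/docker/overlay2/*/diff/app/agent-logs-*
  fields_under_root: true
  fields:
    hrc:
      agent: worklist-report
      product: mks

output.elasticsearch:
  hosts: ["http://172.20.118.191:9200"]
# With no index/template/ILM overrides, events go to the default
# filebeat-%{[agent.version]} data stream and you filter on hrc.product / hrc.agent in Kibana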

You are 90% of the way there, just a little unclear on the concepts.
