Custom access_log and ECS

I have a custom access_log and want to use the power of ECS.
So Filebeat simply reads the log file (type: log) and forwards the lines to Logstash.
Logstash groks each line and fills the "user_agent.original" field. So far so good.

I made a simple pipeline in Elasticsearch with:

[
  {
    "user_agent": {
      "field": "user_agent.original",
      "target_field": "user_agent"
    }
  }
]

I expected the "user_agent.original" field to be parsed by the "user_agent" processor.
Through pain and source browsing I learned that the "user_agent" processor treats "user_agent.original" as {"user_agent": {"original": "...."}}, which is not documented (or did I miss something?).

The question is: how do I create {user_agent: {original: VALUE}} in Logstash? It seems like grok can't handle this.

A field called original within a user_agent object would be referred to as [user_agent][original] in Logstash. This notation allows Logstash to support periods in field names and still distinguish between an object called a.b that contains a field called c ([a.b][c]) and an object called a that contains a field called b.c ([a][b.c]).
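
For example, a grok filter can capture straight into the nested field by using that bracket notation as the capture name. This is only a sketch, not part of the original reply: the surrounding pattern and the [source][address] capture are made up, only the [user_agent][original] target matters, and recent Logstash versions accept the bracketed reference directly in grok.

filter {
  grok {
    # The bracketed capture name writes into a nested object, so the event
    # ends up as { "user_agent": { "original": "..." } } rather than a single
    # field literally named "user_agent.original".
    # Everything else in this pattern is illustrative; adapt it to your access_log format.
    match => {
      "message" => '%{IPORHOST:[source][address]} %{NOTSPACE} %{NOTSPACE} "%{GREEDYDATA:[user_agent][original]}"'
    }
  }
}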


@Badger is right, thanks for answering!

To add context to this, for the past few years, the Elastic Stack has largely been moving away from having dots in field names.

When you look at processors in Beats and the Elasticsearch ingest node, they treat a dotted name like user_agent.original as nesting. Since the dots already represent nesting there, Beats and Elasticsearch do not support square bracket notation.

Logstash is the only one that doesn't do this, because of backwards compatibility. For a long time, periods were used in field names, and at that time, Logstash was the only one of the 3 that could do this kind of processing.

So in short, always use nesting (a quick sketch follows the list):

  • Nesting in Logstash is with square brackets: [user_agent][original]
  • Nesting in Beats & Elasticsearch is with dots: user_agent.original
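
As a sketch of the same rule (assuming the dotted field already exists on the event as a literal top-level name, which is not shown in the thread), a mutate rename in Logstash converts it into the nested object that the Elasticsearch user_agent processor expects:

filter {
  # Turn a top-level field literally named "user_agent.original" into the
  # nested object { "user_agent": { "original": ... } }; brackets on the
  # target side are the Logstash way of spelling that nesting.
  mutate {
    rename => { "user_agent.original" => "[user_agent][original]" }
  }
}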
