Using an index template to read a field as a date

Hi,

I'm attempting to take a JSON-based log and use Filebeat to send it directly into Elasticsearch. At this point I'm trying not to include Logstash, and I'm also attempting to get a better understanding of what Filebeat is capable of.

Whilst I can currently get the log loaded into Elasticsearch, I can't seem to get my timestamp field to be recognised as a date. I'm aware that Filebeat will add a @timestamp property that reflects when it read the log item. I don't mind that, but I also want my own timestamp property to be usable for time series analysis.

Some sample log items. I'm looking to use the eventTime property as my timestamp field.

{"eventType":"search","searchTerm":"sauce","user":175362,"eventTime":"2018-08-21T15:42:29+1000"}
{"eventType":"search","searchTerm":"goats milk","user":138297,"eventTime":"2018-08-21T15:42:29+1000"}
{"eventType":"search","searchTerm":"potatoes","user":140003,"eventTime":"2018-08-21T15:42:29+1000"}

This is what I've done; none of these attempts have succeeded in getting the eventTime field registered as a date type.

A partial extract of the filebeat.yml file:

output.elasticsearch:
  hosts: ['elasticsearch:9200']
  index: "search-metrics-filebeat-%{[beat.version]}-%{+yyyy.MM.dd}"

setup.template.name: "search-metrics-filebeat"
setup.template.pattern: "search-metrics-filebeat-*"
setup.template.fields: "/usr/share/filebeat/events-fields.yml"
setup.template.overwrite: true

For the events-fields.yml file, I extracted the mapping from my index and ran it through a JSON -> YAML converter (I've got NO idea if this is the right thing to do or not...), and then I made the following modifications:

  • set date_detection: true
  • set dynamic_date_formats: ["YYYY-MM-DDTHH:mm:ssZZ"]
  • (and finally, when I decided to try the brute force approach) I added an eventTimeField entry under dynamic_templates that matched on the property name of eventTime:
  doc:
    _meta:
      version: 6.3.2
    date_detection: true
    dynamic_date_formats: ["YYYY-MM-DDTHH:mm:ssZZ"]
    dynamic_templates:
    - eventTimeField:
        match_mapping_type: string
        match: "eventTime"
        mapping:
          type: "date"
    - fields:
        mapping:
          type: keyword
        match_mapping_type: string
        path_match: fields.*

I also tried, in the properties section, at the same level as the apache2 and docker entries, adding a date property like this:

      date:
        type: "date"
        format: "YYYY-MM-DDTHH:mm:ssZZ||yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"

As you might guess, I'm basically taking bits and pieces from wherever and trying to figure out whether it works or not...

Any chance of some help please?

Thank you

@Jonathan_Tan as you have noted, Filebeat automatically adds a new @timestamp to every line that it reads. By default this is the time when the line was read, which is usually not what you want.

Since your original data is already a structured JSON document, I think you should use an ingest pipeline to configure some logic on the ES side to parse the date correctly. To do that you will need the date processor, and you will have to change your elasticsearch output configuration to point at the pipeline.

This approach is better because it will make the @timestamp field reflect the real time of the event, and that field is the default time field for Kibana.

If you want to keep the read time as well, you can use the rename processor before the date processor.
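Something along these lines, as a minimal sketch (the pipeline id parse_event_time and the read_time field name are just illustrations, and you would adjust the date format to your data):

PUT _ingest/pipeline/parse_event_time

{
    "description": "Keep the time Filebeat read the line, then parse eventTime into @timestamp",
    "processors": [
        {
            "rename": {
                "field": "@timestamp",
                "target_field": "read_time"
            }
        },
        {
            "date": {
                "field": "eventTime",
                "target_field": "@timestamp",
                "formats": ["yyyy-MM-dd'T'HH:mm:ssZZ"]
            }
        }
    ]
}

Then the elasticsearch output in filebeat.yml needs a pipeline option pointing at it:

output.elasticsearch:
  hosts: ['elasticsearch:9200']
  pipeline: "parse_event_time"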

Hi Pier,

Thank you for your suggestion. So just to confirm, you're essentially saying "let Filebeat stream, and give the transformation to Elasticsearch"?

I'll take a look at the ingest stuff in more detail. It does look straightforward too.

Thank you!

Hi @pierhugues,

I took your advice, stripped out my existing filebeat config attempts, and used an Elasticsearch ingest pipeline instead. That was easier, and it did the trick.

That said, I've also come to realise that Elasticsearch isn't logging pipeline errors; Filebeat is. And Filebeat doesn't normally log useful errors either, only in verbose mode. I see that there's already another discussion regarding Elasticsearch's error handling, so I might go make my two cents known there. :wink:

For reference for anybody else, this is what I did using an Elasticsearch ingest pipeline to read an existing field, parse it as a timestamp, and then put it into the @timestamp field:

PUT http://localhost:9200/_ingest/pipeline/search_event_pipeline, with the following request body:

{
    "description": "Takes the eventTime field and turns it into a date field",
    "processors": [
        {
            "date": {
                "field": "eventTime",
                "target_field": "@timestamp",
                "formats": [
                    "YYYY-MM-DD'T'HH:mm:ssZZ",
                    "YYYY/MM/DD HH:mm:ssZZ"
                ]
            }
        }
    ],
    "on_failure": [
        {
            "set": {
                "field": "_index",
                "value": "failed-{{_index}}"
            }
        },
        {
            "set": {
                "field": "error",
                "value": "{{_ingest.on_failure_message}}"
            }
        }
    ]
}

The on_failure property above is there to eventually handle the errors and, more importantly, to pipe the failed documents into a separate index for me to assess.
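As an aside, the _simulate API is handy for sanity-checking the pipeline before pointing Filebeat at it. Here's what that looks like against one of my sample log lines:

POST http://localhost:9200/_ingest/pipeline/search_event_pipeline/_simulate, with the following request body:

{
    "docs": [
        {
            "_source": {
                "eventType": "search",
                "searchTerm": "sauce",
                "user": 175362,
                "eventTime": "2018-08-21T15:42:29+1000"
            }
        }
    ]
}

The response shows the transformed document, so you can confirm that @timestamp gets populated before any real data goes through the pipeline.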

In the future, I might remove the original eventTime field as well, but I kept it there this time so I could make sure it was working... :wink:
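If I do, my understanding is that it should just be a matter of appending Elasticsearch's remove processor after the date processor, something like this (untested on my end):

        {
            "remove": {
                "field": "eventTime"
            }
        }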

In the filebeat.yml file, my Elasticsearch output was turned into:

output.elasticsearch:
  hosts: ['elasticsearch:9200']
  index: "search-metrics-filebeat-%{[beat.version]}-%{+yyyy.MM.dd}"
  pipeline: "search_event_pipeline"

Again, Pier, thanks for your help :smile:


@Jonathan_Tan glad that it worked! Yeah, we are trying to improve the errors coming from that :slight_smile:
