Why does every log use a different format for the Timestamp?

I am using the Elastic Stack and Filebeat version 7.13.4 on Windows 10.
I have a new log to process and it has a new way to represent the timestamp.
As soon as I figure one format out, they throw me another.

This one starts as:

2021/01/31,23:59:38:00,VALUE101

I simply tried:

{
  "description" : "csv pattern",
  "processors" : [
    {
      "csv" : {
        "field" : "message",
        "target_fields" : [
          "date",
          "time",
          "another_field"
        ]
      }
    }
  ]
}

This works in a simulation, but not in reality: the index is never created.

The filebeat log shows:
........."type":"illegal_argument_exception","reason":"Illegal character inside unquoted field at 217"

I am pretty sure it has to do with the date and time, which leads me to my question: how do I parse this date,time format into a timestamp?

OK, so I do know what I'm doing (sort of). My pipeline is working in Dev Tools.
I created a temp index and POSTed a message to it. When I search the index I get the data back, and the timestamp is correctly mapped!

So the problem is not with Elasticsearch; it is in the Filebeat configuration, which only calls out a pipeline:

output.elasticsearch:
  hosts: ["localhost:9200"]
  pipeline: history_h00_test_pipeline
  index: "fb-hist-%{+yyyy.MM.dd}"

So, what is going wrong with my Filebeat?

Thanks!

Michael

This is the error flooding the Filebeat logs:

2021-09-04T11:31:06.295-0500	WARN	[elasticsearch]	elasticsearch/client.go:408	Cannot index event publisher.

And when I say flooding: in just a few seconds I get 7+ files with 3500 rows of this warning.

The other end of the message:

 {"type":"illegal_argument_exception","reason":"Illegal character inside unquoted field at 213"}

Illegal character in unquoted field.. at 213

What is 213?

Most likely the character position in the line / message.
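One quick way to check is to slice the raw bytes of the line around that offset. A sketch (`show_context` is just an illustrative helper, not a Filebeat or Elasticsearch feature; the sample path is hypothetical):

```python
def show_context(raw: bytes, pos: int, width: int = 8) -> bytes:
    """Return the raw bytes surrounding a reported error offset."""
    return raw[max(0, pos - width):pos + width]

# e.g. raw = open(r"C:\LocalLogs\History\sample.H00", "rb").read()
# print(show_context(raw, 213))
```

Printing the `repr` of those bytes usually makes the offending character obvious.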

If you want help, I would suggest posting:

Your full mapping
A full example of a log line
A full example of your ingest pipeline
Your full filebeat.yml

By the way, many of us feel your pain of every log line and every date-time stamp being formatted differently. The joy of snowflakes :slight_smile:

Thanks for the help..

  1. full mapping : sorry, not sure what this is

  2. Log Line:

2021/01/31,00:00:02:00,0-0,0,0,0,0,0,0,05338000,78000109,D2880027,4C15262E,00010000,7002C080,9,2,0,0,2.850
  3. Pipeline:
PUT /_ingest/pipeline/history_h00_test_pipeline
{
  "description" : "History H00 csv pattern",
  "processors" : [
    {
      "csv" : {
        "field" : "message",
        "target_fields" : [
          "date",
          "time",
          "box_running",
          "task_id",
          "task_bool",
          "BDispatchAbortReq",
          "nDispatchState",
          "nDispatchTaskStartedBy",
          "nDispatchTaskEndCode",
          "ModuleA32BitInput",
          "ModuleA32BitOutput",
          "ModuleB32BitInput",
          "ModuleB32BitOutput",
          "ModuleBHighDIO",
          "ModuleBLowDIO",
          "nErMonErrorNumber",
          "nTurboComPhase",
          "RecoveryTriggersActive",
          "RecoveryRunning",
          "task_timer"
        ]
      }
    },
    {
      "set": {
        "field": "datetime",
        "value": "{{date}} {{time}}"
      }
    },
    {
      "date": {
        "field": "datetime",
        "formats": ["yyyy/MM/dd HH:mm:ss:SS"]
      }
    }
  ]
}

By the way, this pipeline accepts input from Dev Tools:

DELETE myindex

PUT myindex 

POST myindex/_doc?pipeline=history_h00_test_pipeline
{
  "message":"2021/02/20,23:59:37:53,0-0,0,0,0,0,0,0,05338000,78000109,D288002,4C15262E,00010000,7002C080,47,0,0,0,2.850"
}

GET myindex/_search
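The set + date steps in the pipeline can also be sanity-checked outside Elasticsearch. A rough Python equivalent, assuming the trailing `:SS` is centiseconds (Python's `%f` accepts one to six digits and right-pads them to microseconds, so `:53` parses as 0.53 s):

```python
from datetime import datetime

# one date,time pair from the log
date_part, time_part = "2021/01/31", "23:59:38:53"

# mirror the pipeline's set processor: "{{date}} {{time}}"
combined = f"{date_part} {time_part}"

# "53" centiseconds -> 530000 microseconds via %f right-padding
ts = datetime.strptime(combined, "%Y/%m/%d %H:%M:%S:%f")
print(ts.isoformat())
```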


  4. Filebeat yaml:
#=========================== Filebeat inputs =============================
filebeat.inputs:

- type: log

  enabled: true

  fields:
    type: H00

  paths:
    - C:\LocalLogs\History\*.H00

  exclude_lines: ['^Date']


#============================= Filebeat modules ===============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

#==================== Elasticsearch template setting ==========================
setup.template:
  name: "fb-hist"
  pattern: "fb-hist-*"

setup.ilm.enabled: false

setup.template.settings:
  index.number_of_shards: 1

#================================ General =====================================

name: "T0031"

path.data: /ELK/filebeat-data
path.logs: /ELK/filebeat-logs

#============================== Kibana =====================================

setup.kibana:

#================================ Outputs =====================================

#-------------------------- Elasticsearch output ------------------------------

output.elasticsearch:

  # Array of hosts to connect to.
  hosts: ["localhost:9200"]
  pipeline: history_h00_test_pipeline
  
  index: "fb-hist-%{[fields.type]:other}-%{+yyyy.MM.dd}"


# Set to debug to get as much info as possible
logging.level: debug

Every software job I've ever had, for the past 30 years, eventually comes back to some problem involving time and dates.

Thanks again for helping with this

Michael

I will add, from the debug log, that my message is full of Unicode escapes. Is this normal?

  "message": "\u00002\u00000\u00002\u00001\u0000/\u00000\u00001\u0000/\u00003\u00001\u0000,\u00000\u00000\u0000:\u00000\u00000\u0000:\u00000\u00001\u0000:\u00000\u00000\u0000,\u00000\u0000-\u00000\u0000,\u00000\u0000,\u00000\u0000,\u00000\u0000,\u00000\u0000,\u00000\u0000,\u00000\u0000,\u00000\u00005\u00003\u00003\u00008\u00000\u00000\u00000\u0000,\u00007\u00008\u00000\u00000\u00000\u00001\u00000\u00009\u0000,\u0000D\u00002\u00008\u00008\u00000\u00000\u00002\u00007\u0000,\u00004\u0000C\u00001\u00005\u00002\u00006\u00002\u0000E\u0000,\u00000\u00000\u00000\u00001\u00000\u00000\u00000\u00000\u0000,\u00007\u00000\u00000\u00002\u0000C\u00000\u00008\u00000\u0000,\u00002\u00007\u0000,\u00000\u0000,\u00000\u0000,\u00000\u0000,\u00002\u0000.\u00008\u00005\u00000\u0000\r\u0000",

First: mappings are the "schema". If you do not create one with a template, a default will be created for you. If you are going to do anything at scale, you should learn about them.

In your sample ingest, if you want to see the mapping:

GET myindex/

You are using a default template right now, which is OK but not very efficient; we can come back to that later.

So I just ran your complete setup above with this log file:

2021/01/31,00:00:02:00,0-0,0,0,0,0,0,0,05338000,78000109,D2880027,4C15262E,00010000,7002C080,9,2,0,0,2.850
2021/01/31,00:00:03:00,0-0,0,0,0,0,0,0,05338001,78000109,D2880027,4C15262E,00010001,7002C080,9,2,0,0,2.850
2021/01/31,00:00:04:00,0-0,0,0,0,0,0,0,05338002,78000109,D2880027,4C15262E,00010002,7002C080,9,2,0,0,2.850
2021/01/31,00:00:05:00,0-0,0,0,0,0,0,0,05338003,78000109,D2880027,4C15262E,00010003,7002C080,9,2,0,0,2.850
2021/01/31,00:00:06:00,0-0,0,0,0,0,0,0,05338004,78000109,D2880027,4C15262E,00010004,7002C080,9,2,0,0,2.850

Everything worked great: ingested with no errors.

GET fb-hist-h00-2021.09.04/_search
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "fb-hist-h00-2021.09.04",
        "_type" : "_doc",
        "_id" : "TlQssnsBLtlbFaz6r6fI",
        "_score" : 1.0,
        "_source" : {
          "date" : "2021/01/31",
          "nDispatchTaskStartedBy" : "0",
          "agent" : {
            "hostname" : "ceres",
            "name" : "T0031",
            "id" : "a19f917f-52b5-4121-b287-5828e596c1b1",
            "ephemeral_id" : "f2db4956-8af4-4b9d-9a61-d9499a1f49f8",
            "type" : "filebeat",
            "version" : "7.14.0"
          },
          "log" : {
            "file" : {
              "path" : "/Users/sbrown/workspace/sample-data/discuss/sample-csv-log-mhare.log"
            },
            "offset" : 0
          },
          "ModuleBLowDIO" : "7002C080",
          "task_id" : "0",
          "ModuleA32BitInput" : "05338000",
          "task_timer" : "2.850",
          "ModuleB32BitInput" : "D2880027",
          "RecoveryTriggersActive" : "0",
          "datetime" : "2021/01/31 00:00:02:00",
          "ModuleB32BitOutput" : "4C15262E",
          "ecs" : {
            "version" : "1.10.0"
          },
          "host" : {
            "name" : "T0031"
          },
          "BDispatchAbortReq" : "0",
          "nTurboComPhase" : "2",
          "nDispatchTaskEndCode" : "0",
          "ModuleA32BitOutput" : "78000109",
          "task_bool" : "0",
          "RecoveryRunning" : "0",
          "message" : "2021/01/31,00:00:02:00,0-0,0,0,0,0,0,0,05338000,78000109,D2880027,4C15262E,00010000,7002C080,9,2,0,0,2.850",
          "ModuleBHighDIO" : "00010000",
          "input" : {
            "type" : "log"
          },
          "@timestamp" : "2021-01-31T00:00:02.000Z",
          "nErMonErrorNumber" : "9",
          "time" : "00:00:02:00",
          "fields" : {
            "type" : "H00"
          },
          "box_running" : "0-0",
          "nDispatchState" : "0"
        }
      },
....

NOW I just saw you say your logs are in Unicode heheheh... if your logs are in a non-default encoding, you need to figure out the right one and set it in the log input.

See the list of supported encodings, but I would try this first:

encoding: utf-8
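If you are not sure which encoding a file uses, checking for a byte-order mark is a quick first test. A hypothetical helper (not part of Filebeat; a BOM-less file will simply fall through to the default):

```python
import codecs

def sniff_bom(path: str) -> str:
    """Guess a file's encoding from its BOM; fall back to utf-8."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(codecs.BOM_UTF16_LE):
        return "utf-16le"
    if head.startswith(codecs.BOM_UTF16_BE):
        return "utf-16be"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return "utf-8"
```

Opening the file in an editor that shows the encoding (VS Code does, in the status bar) works just as well.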

I have probably spent accumulated months, if not years, on date-time programming :slight_smile:


OK, so VS Code tells me the log files are UTF-16, so I used
encoding: utf-16 in the input section, and now the message looks fine:

  "message": "2021/01/31,00:00:00:00,0-0,0,0,0,0,0,0,05338000,78000109,D2880027,4C15262E,00010000,7002C080,48,0,0,0,2.850",

and.. BINGO!
Magically, the logs are ingested


Nicely done!
Look into templates and mappings if you are going to do these at scale.

I am looking at the index templates documentation. I'll probably need to read it a few times to understand what it's trying to tell me.

Thanks, so much for the help!