Because you are using it in read mode: when mode is set to read, the file input uses the setting file_completed_action, which by default deletes the file after processing.
Also, in read mode, the setting start_position is ignored, as described in the documentation.
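For reference, a minimal sketch of a read-mode input that keeps the files instead of deleting them (the paths here are just examples):

input {
  file {
    path => "/usr/share/logstash/ingest_data/**/**/*.json"
    mode => "read"
    # keep the source files and only record the completed paths
    file_completed_action => "log"
    file_completed_log_path => "/usr/share/logstash/completed.log"
    sincedb_path => "/dev/null"
  }
}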
Did you check the Logstash logs? Is there anything in them? Please share the logs.
@leandrojmp - yes, you are right about read mode - I have changed it to tail now. It is not deleting anything.
In the logs I have:
It can connect to Elasticsearch:
2023-11-15 18:14:24 [2023-11-15T18:14:24,863][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@es01:9200/"}
2023-11-15 18:14:24 [2023-11-15T18:14:24,874][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch version determined (8.11.0) {:es_version=>8}
2023-11-15 18:14:24 [2023-11-15T18:14:24,874][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>8}
2023-11-15 18:14:25 [2023-11-15T18:14:25,026][INFO ][logstash.codecs.jsonlines][main][98e3ef207a4ae9d98a067a1a59b68f9c428884fa2315628551b7969ffef13295] ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
2023-11-15 18:14:25 [2023-11-15T18:14:25,713][INFO ][logstash.codecs.jsonlines][main][98e3ef207a4ae9d98a067a1a59b68f9c428884fa2315628551b7969ffef13295] ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
I also have this, and nothing more:
2023-11-15 18:14:40 [2023-11-15T18:14:40,394][INFO ][logstash.outputs.elasticsearch][main] Using a default mapping template {:es_version=>8, :ecs_compatibility=>:v8}
2023-11-15 18:16:02 [2023-11-15T18:16:02,119][INFO ][logstash.codecs.jsonlines][main][98e3ef207a4ae9d98a067a1a59b68f9c428884fa2315628551b7969ffef13295] ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
And what is the result of the following requests in Kibana Dev Tools:
GET _cat/indices?v
and
GET logstash-*/_search
Also, a couple of things about your Logstash configuration.
If your source file is composed of line-delimited json, i.e. each line is a json document, you should not use the json_lines codec, but the json codec; this is in the documentation as well.
NOTE: Do not use this codec if your source input is line-oriented JSON, for example, redis or file inputs. Rather, use the json codec.
Another thing: if you are using the codec in the input, you do not need the json filter, as your message will already be parsed, as in the sketch below.
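A minimal sketch of that variant, with the codec set directly on the input and no json filter (reusing the path from this thread):

input {
  file {
    path => "/usr/share/logstash/ingest_data/**/**/*.json"
    # the codec parses each line as a json document, no json filter needed
    codec => json
  }
}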
I would recommend not using a codec in the input and relying on the json filter instead. If you choose to do that, you need to add the top-level [parsed_data] to your field references, since you are parsing the json into a target field: instead of [fields][anything] you need to use [parsed_data][fields][anything].
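For illustration, a minimal sketch of that recommended layout, reusing the path and fields from this thread (the copy entry is just an example):

input {
  file {
    path => "/usr/share/logstash/ingest_data/**/**/*.json"
    mode => "tail"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    # no codec here; the json filter below does the parsing
  }
}
filter {
  json {
    source => "message"
    target => "parsed_data"
  }
  mutate {
    # note the [parsed_data] prefix on every field reference
    copy => { "[parsed_data][fields][source]" => "source" }
  }
}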
@leandrojmp - You are a star - all is working now.
I have changed logstash.conf as per your recommendation and it is working perfectly.
I have one more question for you, if you don't mind.
I wanted to parse the timestamp, which in my case is "@timestamp":["1690888439226"] - this is why I have included a date filter for it:
date {
  match => ["[fields][@timestamp]", "UNIX_MS"]
  target => "@timestamp"
}
However, it just ends up as parsed_data.fields.@timestamp with the raw value 1690978392640.
It doesn't decode the timestamp, so it is kind of hard to search through it. Could you point me in the right direction of what I can do?
input {
  file {
    mode => "tail"
    path => "/usr/share/logstash/ingest_data/**/**/*.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    file_chunk_size => 1048576
  }
}
filter {
  json {
    source => "message"
    target => "parsed_data" # Using 'parsed_data' as a namespace to avoid conflicts
  }
  date {
    match => ["[fields][@timestamp]", "UNIX_MS"]
    target => "@timestamp"
  }
  mutate {
    copy => { "[parsed_data][fields][source]" => "source" }
    copy => { "[parsed_data][fields][eventType]" => "eventType" }
    copy => { "[parsed_data][fields][category]" => "category" }
    # remove_field => ["fields"] # Optional: remove the original 'fields' object if it's no longer needed
  }
}
@Badger I have tried it. I have also added a target pointing to a different field:
But the new field event_date is not created after that. If I don't specify a target, or set target => @timestamp, it doesn't parse the date. Not sure what is wrong with it.
filter {
  json {
    source => "message"
    target => "parsed_data" # Using 'parsed_data' as a namespace to avoid conflicts
  }
  date {
    match => ["[parsed_data][fields][@timestamp]", "UNIX_MS"]
    target => "[parsed_data][fields][event_date]"
  }
  mutate {
    copy => { "[parsed_data][fields][source]" => "source" }
    copy => { "[parsed_data][fields][eventType]" => "eventType" }
    copy => { "[parsed_data][fields][category]" => "category" }
    # remove_field => ["fields"] # Optional: remove the original 'fields' object if it's no longer needed
  }
}
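One hedged guess, given that the sample value is "@timestamp":["1690888439226"]: the field holds an array, and the date filter cannot parse an array value, so the match likely fails (check the events for a _dateparsefailure tag). Pointing the match at the first element of the array is one possible workaround, assuming the value really arrives as a single-element array:

date {
  # assumes the epoch-millis string is the first element of the array
  match => ["[parsed_data][fields][@timestamp][0]", "UNIX_MS"]
  target => "[parsed_data][fields][event_date]"
}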