Because you are using it in read mode: when mode is set to read, the file input uses the setting file_completed_action, which by default deletes the file after processing.
Also, in read mode, the setting start_position is ignored, as described in the documentation.
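For reference, a minimal sketch of a read-mode input that keeps the files instead of deleting them (the paths here are just examples):

input {
  file {
    path => "/usr/share/logstash/ingest_data/**/**/*.json"
    mode => "read"
    # keep the source files and only record the completed paths
    file_completed_action => "log"
    file_completed_log_path => "/usr/share/logstash/completed.log"
    sincedb_path => "/dev/null"
  }
}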
Did you check the Logstash logs? Is there anything in them? Please share the logs.
@leandrojmp - yes, you are right about read mode - I have changed it to tail now. It is not deleting anything.
In the logs I have:
It can connect to Elasticsearch:
2023-11-15 18:14:24 [2023-11-15T18:14:24,863][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://elastic:xxxxxx@es01:9200/"}
2023-11-15 18:14:24 [2023-11-15T18:14:24,874][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch version determined (8.11.0) {:es_version=>8}
2023-11-15 18:14:24 [2023-11-15T18:14:24,874][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>8}
2023-11-15 18:14:25 [2023-11-15T18:14:25,026][INFO ][logstash.codecs.jsonlines][main][98e3ef207a4ae9d98a067a1a59b68f9c428884fa2315628551b7969ffef13295] ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
2023-11-15 18:14:25 [2023-11-15T18:14:25,713][INFO ][logstash.codecs.jsonlines][main][98e3ef207a4ae9d98a067a1a59b68f9c428884fa2315628551b7969ffef13295] ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
I also have this, and nothing more:
2023-11-15 18:14:40 [2023-11-15T18:14:40,394][INFO ][logstash.outputs.elasticsearch][main] Using a default mapping template {:es_version=>8, :ecs_compatibility=>:v8}
2023-11-15 18:16:02 [2023-11-15T18:16:02,119][INFO ][logstash.codecs.jsonlines][main][98e3ef207a4ae9d98a067a1a59b68f9c428884fa2315628551b7969ffef13295] ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
And what is the result of the following requests in Kibana Dev Tools:
GET _cat/indices?v
and
GET logstash-*/_search
Also, a couple of things about your Logstash configuration.
If your source file is composed of line-delimited json, i.e. each line is a json document, you should not use the json_lines codec, but the json codec; this is in the documentation as well.
NOTE: Do not use this codec if your source input is line-oriented JSON, for example, redis or file inputs. Rather, use the json codec.
Another thing: if you are using the codec in the input, you do not need the json filter, as your message will already be parsed, as in the sketch below.
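A minimal sketch of that variant, with the codec set directly on the input and no json filter (reusing the path from this thread):

input {
  file {
    path => "/usr/share/logstash/ingest_data/**/**/*.json"
    # the codec parses each line as a json document, no json filter needed
    codec => json
  }
}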
I would recommend not using a codec in the input and relying on the json filter instead. If you choose to do that, you need to add the top-level [parsed_data] to your field references, since you are parsing the json into a target field: instead of [fields][anything] you need to use [parsed_data][fields][anything].
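For illustration, a minimal sketch of that recommended layout, reusing the path and fields from this thread (the copy entry is just an example):

input {
  file {
    path => "/usr/share/logstash/ingest_data/**/**/*.json"
    mode => "tail"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    # no codec here; the json filter below does the parsing
  }
}
filter {
  json {
    source => "message"
    target => "parsed_data"
  }
  mutate {
    # note the [parsed_data] prefix on every field reference
    copy => { "[parsed_data][fields][source]" => "source" }
  }
}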
@leandrojmp - You are a star - all is working now.
I have changed logstash.conf as per your recommendation and it is working perfectly.
I have one more question for you, if you don't mind.
I wanted to parse the timestamp, which in my case is "@timestamp":["1690888439226"] - this is why I have included a date filter for it:
date {
  match => ["[fields][@timestamp]", "UNIX_MS"]
  target => "@timestamp"
}
However, it just ends up as parsed_data.fields.@timestamp with the raw value 1690978392640.
It doesn't decode the timestamp, so it is kind of hard to search through it. Could you point me in the right direction of what I can do?
input {
  file {
    mode => "tail"
    path => "/usr/share/logstash/ingest_data/**/**/*.json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    file_chunk_size => 1048576
  }
}
filter {
  json {
    source => "message"
    target => "parsed_data" # Using 'parsed_data' as a namespace to avoid conflicts
  }
  date {
    match => ["[fields][@timestamp]", "UNIX_MS"]
    target => "@timestamp"
  }
  mutate {
    copy => { "[parsed_data][fields][source]" => "source" }
    copy => { "[parsed_data][fields][eventType]" => "eventType" }
    copy => { "[parsed_data][fields][category]" => "category" }
    # remove_field => ["fields"] # Optional: remove the original 'fields' object if it's no longer needed
  }
}
@Badger I have tried it. I have also added a target pointing to a different field:
But the new field event_date is not created after that. If I don't specify a target, or set target => @timestamp, it doesn't parse the date. Not sure what is wrong with it.
filter {
  json {
    source => "message"
    target => "parsed_data" # Using 'parsed_data' as a namespace to avoid conflicts
  }
  date {
    match => ["[parsed_data][fields][@timestamp]", "UNIX_MS"]
    target => "[parsed_data][fields][event_date]"
  }
  mutate {
    copy => { "[parsed_data][fields][source]" => "source" }
    copy => { "[parsed_data][fields][eventType]" => "eventType" }
    copy => { "[parsed_data][fields][category]" => "category" }
    # remove_field => ["fields"] # Optional: remove the original 'fields' object if it's no longer needed
  }
}
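One hedged guess, given that the sample value is "@timestamp":["1690888439226"]: the field holds an array, and the date filter cannot parse an array value, so the match likely fails (check the events for a _dateparsefailure tag). Pointing the match at the first element of the array is one possible workaround, assuming the value really arrives as a single-element array:

date {
  # assumes the epoch-millis string is the first element of the array
  match => ["[parsed_data][fields][@timestamp][0]", "UNIX_MS"]
  target => "[parsed_data][fields][event_date]"
}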