Hello.
The doc notes that a record in a Kafka topic will be deleted immediately:
it’s going to be immediately discarded since the timestamp value is before the last 7 days.
However, Kafka deletes segments according to log.retention.check.interval.ms, so I thought records are not deleted immediately.
I verified this by comparing the timestamps of the Filebeat output and kafka-console-consumer.sh. I started consuming immediately after the topic was created and confirmed that the consumer was able to consume messages.
# Filebeat
2022-03-10T11:33:36.159Z INFO [publisher_pipeline_output] pipeline/output.go:151 Connection to kafka(b-1.hoge.uui4gf.c2.kafka.ap-northeast-1.amazonaws.com:9092,b-2.hoge.uui4gf.c2.kafka.ap-northeast-1.amazonaws.com:9092,b-3.hoge.uui4gf.c2.kafka.ap-northeast-1.amazonaws.com:9092) established
# kafka-console-consumer
[ssm-user@ip-10-0-10-48 kafka_2.13-2.7.0]$ bin/kafka-console-consumer.sh --bootstrap-server b-1.hoge.uui4gf.c2.kafka.ap-northeast-1.amazonaws.com:9092 --topic i-02abf41b2811cc402-google-workspace --from-beginning --property print.timestamp=true &> messages.log
[ssm-user@ip-10-0-10-48 kafka_2.13-2.7.0]$ ls -l
total 2972
-rw-r--r-- 1 ssm-user ssm-user 29975 Dec 16 2020 LICENSE
-rw-r--r-- 1 ssm-user ssm-user 337 Dec 16 2020 NOTICE
drwxr-xr-x 3 ssm-user ssm-user 4096 Dec 16 2020 bin
drwxr-xr-x 2 ssm-user ssm-user 4096 Dec 16 2020 config
drwxr-xr-x 2 ssm-user ssm-user 8192 Mar 3 01:35 libs
-rw-rw-r-- 1 ssm-user ssm-user 2984784 Mar 10 11:34 messages.log
drwxr-xr-x 2 ssm-user ssm-user 44 Dec 16 2020 site-docs
For the brokers, I used AWS MSK with Kafka version 2.7.0, changing only the broker settings below.
auto.create.topics.enable = true
delete.topic.enable = true
num.partitions = 1
default.replication.factor = 3
So my understanding is something like this:
depending on the log.retention.hours setting (default 7 days), such a record will be discarded as soon as the next log retention check runs, not at the moment it is written.
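To make this understanding concrete, here is a minimal Python sketch of the timing I have in mind. It is my own simplified model, not Kafka's code: the function names are hypothetical, it assumes the default values log.retention.hours = 168 and log.retention.check.interval.ms = 300000, and it assumes a segment becomes eligible once its largest record timestamp is older than the retention period. It ignores segment rolling and the delay before files are physically removed.

```python
DAY_MS = 24 * 60 * 60 * 1000
RETENTION_MS = 7 * DAY_MS          # assumed: log.retention.hours = 168 (default)
CHECK_INTERVAL_MS = 5 * 60 * 1000  # assumed: log.retention.check.interval.ms = 300000 (default)


def is_expired(largest_timestamp_ms, now_ms, retention_ms=RETENTION_MS):
    """True if a segment whose newest record has this timestamp is past retention."""
    return now_ms - largest_timestamp_ms > retention_ms


def next_check_after(now_ms, first_check_ms=0, interval_ms=CHECK_INTERVAL_MS):
    """Time of the first periodic retention check at or after now_ms.

    Hypothetical model: checks run at first_check_ms, then every interval_ms.
    """
    if now_ms <= first_check_ms:
        return first_check_ms
    elapsed = now_ms - first_check_ms
    ticks = -(-elapsed // interval_ms)  # ceiling division
    return first_check_ms + ticks * interval_ms


if __name__ == "__main__":
    # A record with an 8-day-old timestamp is produced one minute after a check.
    now_ms = 10 * DAY_MS + 60_000
    record_ts_ms = now_ms - 8 * DAY_MS

    print("already past retention:", is_expired(record_ts_ms, now_ms))
    window_ms = next_check_after(now_ms) - now_ms
    print("still readable for about", window_ms / 1000, "seconds")
```

In this model the record is already past retention when it arrives, but a consumer that reads before the next check tick can still see it, which would match what I observed with kafka-console-consumer.sh.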
Could you review whether my understanding is correct?
Best Regards,
Yu Watanabe