I am using Logstash 5.x with Elasticsearch and Kibana 7.x. The issue I am facing is that even after setting the sincedb path, Logstash does not persist the position up to the point where reading completed. Every time it starts, it re-reads the whole data set.
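For the jdbc input (which this thread is about), the counterpart of the file input's sincedb is `last_run_metadata_path` together with a tracking column. A minimal sketch, assuming a MySQL source and a `stamp` timestamp column (connection details, table, and paths here are placeholders, not taken from the original posts):

```
input {
  jdbc {
    # Placeholder connection settings
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "user"
    jdbc_driver_class => "com.mysql.jdbc.Driver"

    # Persist the last-seen value between runs; if this file is
    # missing or not writable, every run starts from scratch.
    last_run_metadata_path => "/var/lib/logstash/.jdbc_last_run"

    # Track progress by the stamp column instead of the run time
    use_column_value => true
    tracking_column => "stamp"
    tracking_column_type => "timestamp"

    # Strict > avoids re-selecting the boundary row on each run
    statement => "SELECT * FROM my_table WHERE stamp > :sql_last_value"
    schedule => "* * * * *"
  }
}
```

If the path is left unset, Logstash writes the metadata file under the home directory of the user it runs as, which can silently fail under systemd.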
I'm here on this thread because my 7.0.1 Logstash jdbc "since" tracking is not working. It's creating duplicate Elasticsearch documents from a single SQL row, even after incorporating the date filters suggested above. Logstash keeps pulling records it already pulled; Kibana's Discover pane shows the same row arriving repeatedly, but it exists only once in the DB.
More info. The journal shows repeated instances of this line:
Jul 09 09:34:04 amos.nexus-tech.local logstash[27024]: [2019-07-09T09:34:04,552][INFO ][logstash.inputs.jdbc ] (4.378876s) SELECT * FROM vw_ts_sno_events WHERE stamp >= '2019-07-09 08:54:02'
Only one record was created in the SQL database on July 9, 2019 (at precisely 08:54:02), yet Kibana's Discover panel shows 39 records received that day. Since starting this jdbc job 24 hours ago, it has inflated 16,514 SQL rows into 17,702 Elasticsearch documents.
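The logged query explains the inflation: `WHERE stamp >= '2019-07-09 08:54:02'` uses `>=`, so the boundary row is re-selected on every scheduled run. Besides switching to a strict `>` against `:sql_last_value`, a common safeguard is to make the load idempotent by deriving the Elasticsearch `document_id` from the row's primary key, so a re-pulled row overwrites its existing document instead of creating a new one. A sketch, where the index name and the `event_id` key field are assumptions:

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "sno-events"
    # Reuse the row's primary key as the document ID so a
    # re-selected row updates the existing document rather
    # than adding a duplicate. "event_id" is a placeholder
    # for whatever uniquely identifies a row in the view.
    document_id => "%{event_id}"
  }
}
```

With this in place, repeated pulls of the same row become harmless updates, and the document count stays equal to the row count.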
@Badger This also made my life easy and resolved my issue as well. The filter part made everything align, and the repetitions are gone too. THANKS EVERYONE !!!
@bayardk I was also facing the repetition issue, but now I am getting completely accurate data.