Pull new entries from distributed Cassandra database


(Ayush Garg) #1

Hi all,

I'm trying to set up Logstash to continuously pull data from Cassandra, in the same fashion it works with file inputs. I'm using the Cassandra JDBC Driver and I can successfully pull the table contents. However, unlike with a SQL database, I can't use the last-value parameter to fetch only the latest rows, so after running the query once I'd like to pull only new entries.
Here is my current config:

input {
  jdbc {
    jdbc_connection_string => "jdbc:cassandra://hostname:9160"
    schedule => "* * * * *"
    jdbc_user => "cassandra"
    jdbc_password => "cassandra"
    jdbc_driver_library => "$PATH/cassandra_driver.jar"
    jdbc_driver_class => "com.dbschema.CassandraJdbcDriver"
    statement => "SELECT * FROM table_decision"
  }
}

output {
  elasticsearch {
    hosts => ["localhost"]
    document_id => "%{index_pk}"
    index => "logstash-2017-01-03"
  }

  stdout {
    codec => json_lines
  }
}

I am using the ELK stack to view, in near real time, the logs that are published to the Cassandra database. This works, but as the tables grow to millions of rows I run into latency. Setting document_id lets me avoid duplicate documents, but whenever I delete all the documents in that index, Logstash pulls the entire table again. Is there any way to pull only the new entries?
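For comparison, this is roughly how the incremental pull is configured with the jdbc input against a relational SQL source, using use_column_value / tracking_column and the :sql_last_value placeholder. This is only a sketch: it assumes a hypothetical timestamp column event_time on table_decision, and I don't know whether the Cassandra JDBC driver and CQL accept the same WHERE pattern.

input {
  jdbc {
    jdbc_connection_string => "jdbc:cassandra://hostname:9160"
    jdbc_user => "cassandra"
    jdbc_password => "cassandra"
    jdbc_driver_library => "$PATH/cassandra_driver.jar"
    jdbc_driver_class => "com.dbschema.CassandraJdbcDriver"
    schedule => "* * * * *"
    # Track the highest value seen so far and fetch only newer rows.
    use_column_value => true
    tracking_column => "event_time"          # hypothetical timestamp column
    tracking_column_type => "timestamp"
    # Persist the last seen value between runs and restarts.
    last_run_metadata_path => "/var/lib/logstash/.table_decision_last_run"
    # :sql_last_value is substituted by the plugin on each scheduled run.
    statement => "SELECT * FROM table_decision WHERE event_time > :sql_last_value"
  }
}

With a SQL source, last_run_metadata_path keeps the last seen value across restarts, so the whole table would not be re-read even after the Elasticsearch index is wiped. I'm looking for an equivalent that works against Cassandra.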


(system) #2

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.