Unable to push incremental data from Cassandra DB to a daily index using Logstash

Nice!

It looks like it is working to me.

The time_stamp value from the previous run is being inserted into the next statement by substitution.

Now you can increase the limit to 5 (LIMIT 5) and verify that the fifth time_stamp is used in the statement on the next scheduled run.

Please verify that the contents of /opt/elk/.logstash_jdbc_last_run are being updated after each scheduled run.

At this point you should also check that the results are ordered by time_stamp ascending; this is important to guarantee a continually increasing time_stamp value (the assumption behind the > condition).
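
For reference, the relevant parts of the input would look something like this (a minimal sketch; the driver, connection string, keyspace, and table name are assumptions, not taken from your config):

```
input {
  jdbc {
    # Assumed driver/connection details; substitute your own
    jdbc_driver_library => "/opt/elk/cassandra-jdbc-wrapper.jar"
    jdbc_driver_class => "com.github.adejanovski.cassandra.jdbc.CassandraDriver"
    jdbc_connection_string => "jdbc:cassandra://localhost:9042/my_keyspace"
    schedule => "* * * * *"    # once per minute
    # :sql_last_value is replaced with the tracked value from the previous run.
    # Ascending order matters so that value only ever increases; Cassandra's
    # ORDER BY restrictions are discussed further down this thread.
    statement => "SELECT * FROM my_table WHERE time_stamp > :sql_last_value ORDER BY time_stamp ASC LIMIT 5"
    use_column_value => true
    tracking_column => "time_stamp"
    tracking_column_type => "timestamp"
    # The value carried between scheduled runs is persisted here
    last_run_metadata_path => "/opt/elk/.logstash_jdbc_last_run"
  }
}
```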

Hello,

Do we have any option to sort by date/time in the same Logstash conf file?
Cassandra sorting does not work for me; it fails with the error "ORDER BY and IN cannot work together":

```
select * from table where host_name IN ('','*') AND time_stamp > :sql_last_value ORDER BY time_stamp ASC LIMIT 5 ALLOW FILTERING;
```

If I don't use the host_name WHERE condition, it throws an error:
Order By only supported when partition key is restricted by EQ or IN.
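
That error message itself points at a workaround: restrict the partition key with EQ rather than IN. A sketch, assuming host_name is the partition key and time_stamp a clustering column, with one jdbc input per host:

```
input {
  # ORDER BY is accepted once the partition key is restricted by EQ,
  # at the cost of one jdbc input (and one state file) per host.
  jdbc {
    jdbc_driver_class => "com.github.adejanovski.cassandra.jdbc.CassandraDriver"
    jdbc_connection_string => "jdbc:cassandra://localhost:9042/my_keyspace"
    schedule => "* * * * *"
    statement => "SELECT * FROM my_table WHERE host_name = 'host-1' AND time_stamp > :sql_last_value ORDER BY time_stamp ASC LIMIT 5"
    use_column_value => true
    tracking_column => "time_stamp"
    tracking_column_type => "timestamp"
    last_run_metadata_path => "/opt/elk/.logstash_jdbc_last_run_host1"
  }
}
```

Whether that scales depends on how many distinct host_name values you need to cover.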

Ordering within Logstash is not possible, for two reasons:

  1. By design, each event is treated as autonomous. The pipeline does not care whether it sees 1 event or 600 billion events, and in particular it does not attempt to reason about (hold state across) what happened before any single event.
  2. The pipeline is multi-threaded; out-of-order execution is the norm, and one worker (thread) may finish its work sooner or later than another worker that started at about the same time.

Thanks.

In case my ORDER BY issue doesn't get fixed, will it push the same records again even though they are already indexed?
Or will it remember which records have already been processed?

There is no inherent remembering of IDs.

This blog post explains your options regarding duplicates.
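
One option along those lines (a sketch; the field names are assumptions): derive a deterministic _id from the row itself, so a re-read row overwrites the existing document instead of creating a duplicate:

```
filter {
  fingerprint {
    # Assumed: host_name + time_stamp uniquely identify a Cassandra row
    source => ["host_name", "time_stamp"]
    concatenate_sources => true
    method => "SHA256"
    key => "any-static-key"
    target => "[@metadata][fingerprint]"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "my-index-%{+YYYY.MM.dd}"    # daily index, as in the title
    # The same row always hashes to the same _id, so a re-push becomes
    # an overwrite rather than a duplicate document.
    document_id => "%{[@metadata][fingerprint]}"
  }
}
```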
