Multiple tables as jdbc input in logstash pipeline

I have 4 tables in oracle database and have a SQL query beginning with CTS (Common Table Expressions) which combines and creates a new single table.

I want to create a pipeline using jdbc input and read date once in a day (at night).

Only one of my tables has tracking_column which is timestamp. The others have neither a timestamp nor a sequential increasing column (like unique id).

My questions are:
1- What is thr most conventional way of reading multiple tables in this case?
2- If i use an SQL statement in input, it says it should start with "Select" but i have joined single table which starts with "With CTE..." How can i solve this?
3- How can i overcome tracking_column issue?

Thanks.

Hello Murat,

I didn't have the usecase yet, but you could try if statement_filepath instead of statement solves the issue of requiring the Select keyword at the start (although I doupt it).

I think the best way to split yoour jdbc input with 4 tables into 4 inputs with a single table each. This way, you can (hopefully) rewrite your SQL to not use CTS and the tracking column can be set differently for each input.

Best regards
Wolfram

Thank you Wolfram for your suggestion.
I also used statement_filepath. But the main problem is here, 3 tables do not have a tracking column. I combined 4 of them with CTE and used the only tracking column of first table.
At least pipelined worked well but I need to check if a table (without tracking column) is changed, the others also are changed or not.

Best regards
Murat

Marut,
this is quite simple.

statement_filepath => "/etc/logstash/conf.d/sql/your_combine_sql.sql"

now on this sql file you do all your join etc..

select t1.x, t2.y, t3.z 
    from table1 t1, table2 t2, table3 t3
   where t1.id=t2.id and t2.id=t3.id and timestamp > sysdate-interval X minute

and you run this logstash every X min and you get all the data that was change in X minute. you won't miss anything

you don't have to use tracking column because you have timestamp. if anything changes in table1 you will get it because last condition.

I have these kind of setup at few place and combining many tables.

Thank you Sachin for your answer.

I used my .sql like this:

-- Combination of two tables. Only Table 1 has Timestamp

-- TABLE 1
WITH 
CTE_TABLE1 AS (SELECT T1ID, T1COL2, T1COL3, T1COLTIMESTAMP, FROM TABLE1),
-- TABLE 2
CTE_TABLE2 AS (SELECT T2ID,T2COL2, FROM TABLE2),

SELECT *
FROM (SELECT *
      FROM CTE_TABLE1 TABLE1 LEFT JOIN CTE_TABLE2 TABLE2 ON T1ID = T2ID 
      WHERE T1COLTIMESTAMP > :sql_last_value
        AND T1COLTIMESTAMP IS NOT NULL
        AND T1COLTIMESTAMP >= sysdate - 20
      ORDER BY T1COLTIMESTAMP ASC)

I am still not sure if somebody changes T2COL2, if it changes T1COLTIMESTAMP automatically. Because they are not related and TABLE 2 has no tracking column.

Best regards
Murat

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.