I'm importing a large Postgres database hosted on Amazon RDS with the Logstash JDBC input, and I'm seeing behavior I can't explain. My config sets statement => "SELECT * from mytable", jdbc_paging_enabled => true, and jdbc_page_size => 10000.
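For reference, the input section looks roughly like the sketch below. Only the last three settings are the ones described above; the driver path, connection string, user, and password are placeholder values for illustration, not my real ones.

```
input {
  jdbc {
    # Placeholder connection settings (hypothetical values)
    jdbc_driver_library => "/path/to/postgresql.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_connection_string => "jdbc:postgresql://example-host:5432/mydb"
    jdbc_user => "myuser"
    jdbc_password => "mypassword"

    # The settings in question
    statement => "SELECT * from mytable"
    jdbc_paging_enabled => true
    jdbc_page_size => 10000
  }
}
```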
The problem is that when I run Logstash with this config, it takes a very long time before it does anything. I can see it connect to the database, then it hangs for about 20 minutes, and only then does it begin to import. If I tail /var/log/logstash/logstash-plain.log while it hangs, this is all it shows:
[2017-12-14T16:48:10,396][INFO ][logstash.pipeline        ] Pipeline started {"pipeline.id"=>"main"}
[2017-12-14T16:48:10,421][INFO ][logstash.agent           ] Pipelines running {:count=>1, :pipelines=>["main"]}
[2017-12-14T16:48:15,897][INFO ][logstash.inputs.jdbc     ] (0.013302s) SELECT version()
The odd thing is that if I comment out jdbc_paging_enabled => true and jdbc_page_size => 10000 and set statement => "SELECT * from mytable LIMIT 500", it starts importing instantly with no issues. Almost any LIMIT I put in the statement itself makes it start instantly.
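In other words, the fast variant differs only in these lines inside the jdbc block (a sketch of the change; the connection settings, which I've left out here, stay the same):

```
input {
  jdbc {
    # connection settings omitted (unchanged)

    # Paging disabled; the limit is written into the statement instead
    statement => "SELECT * from mytable LIMIT 500"
    # jdbc_paging_enabled => true
    # jdbc_page_size => 10000
  }
}
```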
Can someone weigh in on what is causing this? I thought enabling jdbc_paging_enabled with jdbc_page_size => 10000 was equivalent to putting LIMIT 10000 in the statement. If that's true, why do the two behave so differently?