I am not sure why this is happening, but I have a large Postgres database I'm importing from Amazon RDS. I have set statement => "SELECT * from mytable" along with jdbc_paging_enabled => true and jdbc_page_size => 10000.
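A minimal sketch of the input block described above. Only the statement and the two paging options come from the actual setup; the connection string, user, and driver path are placeholder assumptions added so the block is complete.

```
input {
  jdbc {
    # Placeholder connection settings (assumed for illustration, not the real values)
    jdbc_connection_string => "jdbc:postgresql://HOST:5432/DBNAME"
    jdbc_user => "USER"
    jdbc_driver_library => "/path/to/postgresql.jar"
    jdbc_driver_class => "org.postgresql.Driver"

    # Settings described in the question
    statement => "SELECT * from mytable"
    jdbc_paging_enabled => true
    jdbc_page_size => 10000
  }
}
```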
The weird issue is that when I run Logstash with this config, it takes a long time to do anything. I can see it connect to the database, then it hangs for about 20 minutes before it begins to import. If I tail /var/log/logstash/logstash-plain.log while it hangs, it shows this:
[2017-12-14T16:48:10,396][INFO ][logstash.pipeline ] Pipeline started {"pipeline.id"=>"main"}
[2017-12-14T16:48:10,421][INFO ][logstash.agent ] Pipelines running {:count=>1, :pipelines=>["main"]}
[2017-12-14T16:48:15,897][INFO ][logstash.inputs.jdbc ] (0.013302s) SELECT version()
The odd thing is, if I comment out jdbc_paging_enabled => true and jdbc_page_size => 10000 and set statement => "SELECT * from mytable LIMIT 500", it runs instantly without any issues. Almost any limit I set in the statement itself makes it run instantly.
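The fast variant, sketched for comparison. The connection settings are left out here and would need to be filled in for the block to run; the paging options are commented out and the limit is written into the statement itself.

```
input {
  jdbc {
    # Connection settings omitted (jdbc_connection_string, jdbc_user,
    # jdbc_driver_library, jdbc_driver_class); fill in before running

    # Paging disabled; the row limit is part of the SQL instead
    statement => "SELECT * from mytable LIMIT 500"
    # jdbc_paging_enabled => true
    # jdbc_page_size => 10000
  }
}
```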
Can someone weigh in on what is causing this? I thought jdbc_paging_enabled was equivalent to setting LIMIT 10000 (the page size) in the statement. If that's true, why would the two produce such different results?