I am not sure if there is another issue causing this, but I have done some testing and found an odd problem with my Logstash instance. I am trying to import a very large PostgreSQL database from a remote server. If I set a LIMIT in the statement of the Logstash pipeline config file, it works: I can set LIMIT 10 or LIMIT 1000000 and both start importing right away. If I don't set a LIMIT, it just hangs forever. LIMIT ALL also hangs forever, and so does LIMIT 100000000. I'm not sure what is causing this.
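To summarize, the only thing I change between runs is the statement:

-- starts importing right away
SELECT * FROM tracking LIMIT 10
SELECT * FROM tracking LIMIT 1000000

-- hangs forever
SELECT * FROM tracking
SELECT * FROM tracking LIMIT ALL
SELECT * FROM tracking LIMIT 100000000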
I set the logging level to debug and looked at the logs. When LIMIT is not set, it only gets as far as:
[2017-12-01T23:42:03,221][INFO ][logstash.pipeline ] Pipeline started {"pipeline.id"=>"main"}
[2017-12-01T23:42:03,231][INFO ][logstash.agent ] Pipelines running {:count=>1, :pipelines=>["main"]}
[2017-12-01T23:42:03,651][INFO ][logstash.inputs.jdbc ] (0.049546s) SELECT version()
and then every log line after that just says "Pushing flush onto pipeline" and repeats endlessly.
If I set LIMIT 1000000 I get:
[2017-12-01T23:42:03,221][INFO ][logstash.pipeline ] Pipeline started {"pipeline.id"=>"main"}
[2017-12-01T23:42:03,231][INFO ][logstash.agent ] Pipelines running {:count=>1, :pipelines=>["main"]}
[2017-12-01T23:42:03,651][INFO ][logstash.inputs.jdbc ] (0.049546s) SELECT version()
[2017-12-01T23:42:04,501][INFO ][logstash.inputs.jdbc ] (0.842951s) SELECT count(*) AS "count" FROM (SELECT * FROM tracking LIMIT 1000000) AS "t1" LIMIT 1
and it proceeds to work correctly.
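Based on that count(*) query in the log, it looks like the jdbc input wraps my statement in a subquery when jdbc_paging_enabled is on, and then (as far as I understand it) pages through the result with LIMIT/OFFSET. The queries would look roughly like this; this is my guess at the shape, not something copied from the logs:

SELECT * FROM (SELECT * FROM tracking LIMIT 1000000) AS "t1" LIMIT 1000 OFFSET 0
SELECT * FROM (SELECT * FROM tracking LIMIT 1000000) AS "t1" LIMIT 1000 OFFSET 1000
-- ...and so on, jdbc_page_size (1000) rows at a time

That would explain why the count(*) query shows up before any rows are fetched.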
Here is my config file:
input {
  jdbc {
    jdbc_driver_library => "/home/ubuntu/postgresql-42.1.4.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_connection_string => "url"
    jdbc_user => "user"
    jdbc_password => "password"
    statement => "SELECT * FROM tracking LIMIT 1000000"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "1000"
  }
}
output {
  elasticsearch {
    hosts => ["localhost"]
    index => "index-name"
  }
}
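The only line I change for the runs that hang is the statement, e.g.:

    statement => "SELECT * FROM tracking"

Everything else, including jdbc_paging_enabled and jdbc_page_size, stays the same.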