Logstash on Docker using JDBC input plugin does not fetch all rows from SQL Server

I'm trying to index data into Elasticsearch 7.8.0 from SQL Server through Logstash 7.8.0. Everything works fine when I use the non-Docker version of Logstash.

I have more than 6 million rows to fetch, and with the non-Docker version the count matches exactly. But when I run the same configuration with Logstash on Docker (the official image), it only fetches a few thousand rows (~3500) and then stops. There are no errors in the logs.

I have tried running various SQL queries, and the number of rows imported varies with the number of columns I select. For example, if I select only one column, e.g. ID, more rows are imported (~8000), but it still never fetches all of them. Also, the non-Docker version of Logstash shuts down once it has finished fetching all 6 million+ rows, whereas the Docker version just stops without shutting down. I waited for 30 minutes, but no more data was fetched.
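For context, the container is started roughly like this; the image tag, host paths, and mount points are placeholders for my actual setup, and the SQL Server JDBC driver jar has to be reachable inside the container (e.g. via jdbc_driver_library):

# placeholders: host paths and mount points are whatever your setup uses
docker run --rm -it \
  -v /path/to/pipeline/:/usr/share/logstash/pipeline/ \
  -v /path/to/mssql-jdbc.jar:/usr/share/logstash/drivers/mssql-jdbc.jar \
  docker.elastic.co/logstash/logstash:7.8.0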

Below is my sample logstash.conf

input {
  jdbc {
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://<ip>:<port>;databaseName=<dbname>;"
    jdbc_user => "<user>"
    jdbc_password => "<password>" 
    statement => "SELECT id from <table> WHERE id is NOT NULL"
  }
}
output {
    elasticsearch {
        hosts => ["https://<elastichost>:9200"] 
        index => "my_index"
        document_id => "%{id}"
    }
    stdout {
      codec => rubydebug
    }
}
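For reference, the jdbc input also supports paging through the result set; a minimal sketch of what that would look like with this config (the page and fetch sizes are arbitrary values, not something I have tuned) is:

input {
  jdbc {
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://<ip>:<port>;databaseName=<dbname>;"
    jdbc_user => "<user>"
    jdbc_password => "<password>"
    # run the statement in pages instead of one large result set
    jdbc_paging_enabled => true
    jdbc_page_size => 100000
    # hint to the driver how many rows to pull per round trip
    jdbc_fetch_size => 10000
    statement => "SELECT id from <table> WHERE id is NOT NULL"
  }
}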

and here is the log

OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.headius.backport9.modules.Modules (file:/usr/share/logstash/logstash-core/lib/jars/jruby-complete-9.2.11.1.jar) to method sun.nio.ch.NativeThread.signal(long)
WARNING: Please consider reporting this to the maintainers of com.headius.backport9.modules.Modules
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Sending Logstash logs to /usr/share/logstash/logs which is now configured via log4j2.properties
[2020-07-30T23:40:13,897][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.queue", :path=>"/usr/share/logstash/data/queue"}
[2020-07-30T23:40:13,906][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.dead_letter_queue", :path=>"/usr/share/logstash/data/dead_letter_queue"}
[2020-07-30T23:40:14,110][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.8.0", "jruby.version"=>"jruby 9.2.11.1 (2.5.7) 2020-03-25 b1f55b1a40 OpenJDK 64-Bit Server VM 11.0.7+10-LTS on 11.0.7+10-LTS +indy +jit [linux-x86_64]"}
[2020-07-30T23:40:14,123][INFO ][logstash.agent           ] No persistent UUID file found. Generating new UUID {:uuid=>"d96a0289-4426-425c-a17f-81de075598cb", :path=>"/usr/share/logstash/data/uuid"}
[2020-07-30T23:40:15,306][INFO ][org.reflections.Reflections] Reflections took 23 ms to scan 1 urls, producing 21 keys and 41 values 
[2020-07-30T23:40:15,791][WARN ][logstash.outputs.elasticsearch][main] ** WARNING ** Detected UNSAFE options in elasticsearch output configuration!
** WARNING ** You have enabled encryption but DISABLED certificate verification.
** WARNING ** To make sure your data is secure change :ssl_certificate_verification to true
[2020-07-30T23:40:16,062][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[https://user:xxxxxx@elastichost:9200/]}}
[2020-07-30T23:40:16,341][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"https://user:xxxxxx@elastichost:9200/"}
[2020-07-30T23:40:16,370][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
[2020-07-30T23:40:16,372][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-07-30T23:40:16,386][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["https://elastichost:9200"]}
[2020-07-30T23:40:16,440][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>16, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>2000, "pipeline.sources"=>["/usr/share/logstash/pipeline/logstash.conf"], :thread=>"#<Thread:0x4f6117d run>"}
[2020-07-30T23:40:16,449][INFO ][logstash.outputs.elasticsearch][main] Using default mapping template
[2020-07-30T23:40:16,468][INFO ][logstash.outputs.elasticsearch][main] Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
[2020-07-30T23:40:17,184][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-07-30T23:40:17,229][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-07-30T23:40:17,382][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-07-30T23:40:19,279][INFO ][logstash.inputs.jdbc     ][main][598a8fa14ff28545975bda9c2337080827010b1648613a0fb1dca648ce4fd53a] (0.401349s) SELECT ID from table WHERE ID is NOT NULL
/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
            "id"[0;37m => [0m[0;33m"A7A5F8F0-7873-4786-84AC-000023B5A705"[0m,
    "@timestamp"[0;37m => [0m2020-07-30T23:40:19.317Z,
      "@version"[0;37m => [0m[0;33m"1"[0m
}
{
            "id"[0;37m => [0m[0;33m"142FEDDF-49C7-41B3-978E-000048A8C2FC"[0m,
    "@timestamp"[0;37m => [0m2020-07-30T23:40:19.319Z,
      "@version"[0;37m => [0m[0;33m"1"[0m
}

Sounds like an output is blocked and so back-pressure is causing the pipeline to stop. Can you increase the log.level and see if there is anything informative?
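For what it's worth, the level can be raised in logstash.yml (log.level: debug) or with --log.level debug on the command line, or at runtime through the logging API if port 9600 is reachable from outside the container, e.g.:

# raise logging for the jdbc input only, without restarting the container
curl -XPUT 'localhost:9600/_node/logging?pretty' \
  -H 'Content-Type: application/json' \
  -d '{"logger.logstash.inputs.jdbc" : "DEBUG"}'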

I tried bumping up the log level but couldn't find anything interesting there.
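One thing I can still check is the node stats API, assuming port 9600 is published from the container; the call below shows per-plugin in/out event counters, which should reveal whether the jdbc input stops emitting or the elasticsearch output stops accepting events:

curl -s 'localhost:9600/_node/stats/pipelines?pretty'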
