Logstash is not getting whole table data

Hi All,

I've been working with Elasticsearch recently. The Logstash pipeline pulls only 10k records from the DB table.

How can I make it pull the whole table's data?

I tried both jdbc_page_size and jdbc_fetch_size, but neither works.

input {
    jdbc {
        jdbc_driver_library => "/usr/local/Cellar/logstash/8.9.0/libexec/logstash-core/lib/jars/postgresql-jdbc.jar"
        jdbc_connection_string => "jdbc:postgresql://localhost:5432/pl_itsm_stg"
        jdbc_user => "postgres"
        jdbc_password => "root"
        jdbc_driver_class => "org.postgresql.Driver"
        tracking_column => "incident_number"
        schedule => "0 * * * *" # cronjob schedule format (see "Helpful Links")
        statement => "SELECT incident_number, site_id, sub_site_id, account_id, sub_account_id, closed_at, state, short_description, description, incident_parent_id from customerdata_incident"
        jdbc_fetch_size => 100000
    }
}

filter {
    if [incident_parent_id] {
        # Look up the parent incident and replace incident_parent_id with the matching row(s)
        jdbc_streaming {
            jdbc_driver_library => "/usr/local/Cellar/logstash/8.9.0/libexec/logstash-core/lib/jars/postgresql-jdbc.jar"
            jdbc_connection_string => "jdbc:postgresql://localhost:5432/pl_itsm_stg"
            jdbc_user => "postgres"
            jdbc_password => "root"
            jdbc_driver_class => "org.postgresql.Driver"
            statement => "select incident_number, company, sub_site_id, site_id, account_id, sub_account_id, issue_type, incident_state, resolved_at, resolved_by, impact, state, urgency from customerdata_incident WHERE incident_number = :parent_id"
            parameters => { "parent_id" => "incident_parent_id" }
            target => "incident_parent_id"
        }
    }
    ruby {
        # jdbc_streaming writes an array of result rows; keep only the first match
        code => '
            incident_parent = event.get("incident_parent_id")
            if incident_parent.is_a? Array
                event.set("parent_id", incident_parent[0])
            end
        '
    }
    if [parent_id] {
        mutate {
            rename => { "parent_id" => "incident_parent_id" }
        }
    }
    mutate {
        copy => { "id" => "[@metadata][_id]" }
    }
}
output {
    stdout { codec => "json" }
    elasticsearch {
        hosts => ["https://localhost:9200"]
        ssl => true
        ssl_certificate_verification => false
        cacert => "/Users/umaparvathykaliappan/ca_logstash.cer"
        user => "elastic"
        password => "+JYA1pqFMlbl+ZjH9WaK"
        index => "logstash_itsm_incidents_parent"
        ilm_enabled => true
    }
}

  1. My DB table has 26K records. How do I get all 26K?
  2. If I have to paginate my SQL query to get all the records, how can I achieve it? (See the paging sketch below.)
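
As far as I understand, jdbc_fetch_size only controls how many rows the driver fetches per round trip, not how many rows the query returns overall, and jdbc_page_size only takes effect when jdbc_paging_enabled is set. A sketch of the input with paging turned on (same connection details and statement as above; the page size of 5000 is just an example value):

input {
    jdbc {
        jdbc_driver_library => "/usr/local/Cellar/logstash/8.9.0/libexec/logstash-core/lib/jars/postgresql-jdbc.jar"
        jdbc_connection_string => "jdbc:postgresql://localhost:5432/pl_itsm_stg"
        jdbc_user => "postgres"
        jdbc_password => "root"
        jdbc_driver_class => "org.postgresql.Driver"
        schedule => "0 * * * *"
        # With paging enabled, Logstash wraps the statement in LIMIT/OFFSET queries of jdbc_page_size rows
        jdbc_paging_enabled => true
        jdbc_page_size => 5000
        statement => "SELECT incident_number, site_id, sub_site_id, account_id, sub_account_id, closed_at, state, short_description, description, incident_parent_id FROM customerdata_incident"
    }
}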

If I use the django-elasticsearch-dsl search_build feature, it uses the Elasticsearch bulk API and pulls all 26K records.
Only when I use Logstash do I see a count of just 10K records. What is missing in my configuration above?

Please guide me.

Thanks in advance.

My bad.

GET /_search
shows the count as 10,000,

but when I queried with track_total_hits: true, I got the whole table count.
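
For anyone hitting the same thing: by default a search response caps hits.total at 10,000 (reported with "relation": "gte"), so the data was indexed all along and only the reported count was capped. A request like the following (index name taken from the output section above) returns the exact total:

GET /logstash_itsm_incidents_parent/_search
{
  "size": 0,
  "track_total_hits": true
}

GET /logstash_itsm_incidents_parent/_count also returns the exact document count.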
