Elastic document count mismatch - SQL to ES

Hi,

I am trying to bulk index from MS-SQL data to ES (5.6.3). The SQL query in SSMS gives 16284 as total count but after getting indexed, Kibana gets the total count as 15768.However the stats in API gets shows both count in two differnt places. Please see below and let me what is the issue and how do I get the data count correctly in Kibana?

Also read some template mapping before indexing, still didnt help.

https://discuss.elastic.co/t/bulk-index-cannot-change-type-for-index-with-more-than-one-shard/68560

{
"_shards": {
"total": 10,
"successful": 5,
"failed": 0
},
"_all": {
"primaries": {
"docs": {
"count":15768,
"deleted": 0
},
"store": {
"size_in_bytes": 69894322,
"throttle_time_in_millis": 0
},
"indexing": {
"index_total": 16284,
"index_time_in_millis": 28164,
"index_current": 0,
"index_failed": 0,
"delete_total": 0,
"delete_time_in_millis": 0,
"delete_current": 0,
"noop_update_total": 0,
"is_throttled": false,
"throttle_time_in_millis": 0
},

Strange. I’d say that may be some documents have been rejected or some documents are actually updates (using the same id)?

Thanks for the reply. Something weird I noticed. With document_id matched with a column in my SQL table, the count is mismatching, whereas if I leave it to ES with its own uuid, the count matches. I have attached the LS file here for your reference. Also i verified the mtxnid is a PK on the sql side, so there couldnt be a duplicate value.

input {
    jdbc {
        jdbc_connection_string => "jdbc:sqlserver://xxxx:1111;databaseName=udsihfuid"
        jdbc_user => "user"
	    jdbc_password => "passwd"
        jdbc_validate_connection => true
        jdbc_driver_library => "c:/users/downloads/sqljdbc4.jar"
        jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
        statement_filepath => "C:\ELK\5.6.3\logstash-5.6.3\config\jdbc_query7"
	#jdbc_page_size => 1000
    }
}
output {
    elasticsearch {
	hosts => "esserver:9200"
        index => "txns_qa_sample1"
        document_id => "%{mtxnid}"
    }

    stdout {
	codec => dots {}
    }

}

The query file is
SELECT Txn.TransactionID mtxnid FROM Transactions Txn INNER JOIN table1 (nolock) ON table1.TransactionID= Txn.TransactionID INNER JOIN table2 (nolock) ON table2.TransactionID = Txn.TransactionID

Also i verified the mtxnid is a PK on the sql side, so there couldnt be a duplicate value.

I think you need to double-check what your actual query is returning:

SELECT Txn.TransactionID mtxnid FROM Transactions Txn INNER JOIN table1 (nolock) ON table1.TransactionID= Txn.TransactionID INNER JOIN table2 (nolock) ON table2.TransactionID = Txn.TransactionID

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.