Sql Server to Logstash


(Kartheek Gummaluri) #1

I'm trying to index records that are fetched from sql server and inserted into elasticsearch.Actually now I have 25k records and while I'm running logstash.conf file it is indexing upto 2000 records and logstash is terminating. So, can anyone tell what might be the reason so I need to index 1 miilion records in future.

Thanks in advance
Kartheek


(Magnus Bäck) #2

Have you looked in the Logstash log for clues about the shutdown? What does your configuration look like?


(Kartheek Gummaluri) #3

Thanks @magnusbaeck for your reply and I will post the configuration file.

input {
jdbc {
jdbc_driver_library => "c:\drivers\sqljdbc4"
jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
jdbc_connection_string => "jdbc:sqlserver://localhost\SAILS-DM29:1433;databasename=ePrimeRx_New"
jdbc_paging_enabled => "true"
jdbc_pool_timeout => "300"
jdbc_paging_enabled => "true"
jdbc_fetch_size => 10000
connection_retry_attempts_wait_time => 5
connection_retry_attempts => 10

jdbc_user => "sa"
jdbc_password => "sails123"
statement => " SELECT  DRG.[TenantId] AS [TENANTID],DRG.[Id] AS [id],DRG.[DrugCode] AS [drugcode],DRG.[BRANDNAME] AS [DrugBandName], DRG.[QuickCode] AS [quickcode], 
              DRG.[Name] AS [drugname],DRG.[BillingNDC] AS [ndc],[DRGTYPE].Remarks AS [DRUGTYPE],
              DRG.[GenericName] AS [GenericName], DRG.[Strength] AS [Strength], DRG.[ManufactId] AS [ManufactId], DRG.[AWPPack] AS [AWPPack],
              DRG.[DirectUnitPrice] AS [DirectUnitPrice],DRG.[CostPack] AS [CostPack],DRG.[UnitPriceCost] AS [UnitPriceCost],DRG.[QtyPack] AS [QUANTYPACK],
              [DRG].[DrugFormId] AS [DrugFormId], [DRG].[DrugUnitId] AS [DrugUnitId], [DRG].[DrugClass] AS [DrugClass],
              [DRGCAT].[Name] AS [DrugCategory],
              [DRG].[IsDeleted] AS [IsDeleted], [DRG].[IsActive] AS [IsActive],
              [DRG].[CreatedDtTm] AS [DrugCreatedDate],[DRG].[ModifiedDtTm] AS [DrugModifiedDate]
              FROM  [Drug] AS [DRG]
              LEFT OUTER JOIN [dbo].[DrugForm] AS [DRGFROM] ON [DRG].[DrugFormId] = [DRGFROM].[Id]
              LEFT OUTER JOIN [dbo].[DrugUnit] AS [DRGUNIT] ON [DRG].[DrugUnitId] = [DRGUNIT].[Id]
              LEFT OUTER JOIN [dbo].[DrugType] AS [DRGTYPE] ON [DRG].[DrugTypeId] = [DRGTYPE].[Id]
              LEFT OUTER JOIN [DBO].[Drug_DrugCat] as [DRGCATEGORY] on [DRG].[ID] = [DRGCATEGORY].[DrugId]
              LEFT OUTER JOIN [DBO].[DrugCat] as [DRGCAT] on [DRGCATEGORY].[DrugCatId] = [DRGCAT].[Id] where [DRG].[ModifiedDtTm] >= :sql_last_value"
  use_column_value => false
tracking_column => "ModifiedDtTm"
	tracking_column_type => "timestamp"
record_last_run => true
clean_run => true

}
}
filter {
mutate {
add_field => { "ndcsuggest" => "%{ndc}"
"drugnamesuggest" => "%{drugname}"
"quickcodesuggest" => "%{quickcode}" }
}
}

The filter part of this file is commented out to indicate that it is

optional.

output {
elasticsearch{
index => "micromerchant1"
document_type => "drug"
action =>"update" #if want to update existing index data based on ID column
#ssl=>true # if node is on SSL
hosts => ["localhost:9200"]
doc_as_upsert => true
action => "update"
document_id => "%{id}"
}
stdout { codec => json_lines }
}


(Magnus Bäck) #4

How many unique DRG.[Id] values are there in the database?


(Kartheek Gummaluri) #5

23026


(Christian Dahlqvist) #6

Is the id field unique across all 25000 documents?


(Kartheek Gummaluri) #7

yes this is the query
SELECT COUNT(DRG.[Id] )
FROM [Drug] AS [DRG]
LEFT OUTER JOIN [dbo].[DrugForm] AS [DRGFROM] ON [DRG].[DrugFormId] = [DRGFROM].[Id]
LEFT OUTER JOIN [dbo].[DrugUnit] AS [DRGUNIT] ON [DRG].[DrugUnitId] = [DRGUNIT].[Id]
LEFT OUTER JOIN [dbo].[DrugType] AS [DRGTYPE] ON [DRG].[DrugTypeId] = [DRGTYPE].[Id]
LEFT OUTER JOIN [DBO].[Drug_DrugCat] as [DRGCATEGORY] on [DRG].[ID] = [DRGCATEGORY].[DrugId]
LEFT OUTER JOIN [DBO].[DrugCat] as [DRGCAT] on [DRGCATEGORY].[DrugCatId] = [DRGCAT].[Id]

and the result count is 23026


(Kartheek Gummaluri) #8

yes this is the query
SELECT COUNT(DRG.[Id] )
FROM [Drug] AS [DRG]
LEFT OUTER JOIN [dbo].[DrugForm] AS [DRGFROM] ON [DRG].[DrugFormId] = [DRGFROM].[Id]
LEFT OUTER JOIN [dbo].[DrugUnit] AS [DRGUNIT] ON [DRG].[DrugUnitId] = [DRGUNIT].[Id]
LEFT OUTER JOIN [dbo].[DrugType] AS [DRGTYPE] ON [DRG].[DrugTypeId] = [DRGTYPE].[Id]
LEFT OUTER JOIN [DBO].[Drug_DrugCat] as [DRGCATEGORY] on [DRG].[ID] = [DRGCATEGORY].[DrugId]
LEFT OUTER JOIN [DBO].[DrugCat] as [DRGCAT] on [DRGCATEGORY].[DrugCatId] = [DRGCAT].[Id]

and the result count is 23026


(Christian Dahlqvist) #9

If you look at a few of the documents that were indexed, what is the value of _version?


(Kartheek Gummaluri) #10

Actually this is first run and I'm dropping index everytime and I'm checking whether the document count and rows count are equal


(Christian Dahlqvist) #11

What is the value? Does it vary?


(Kartheek Gummaluri) #12

This is the count


(Christian Dahlqvist) #13

That was not what I asked. Please look at a few documents and check what the value _version is.

If the id field is unique across all records, this value should never really be greater than 1, as a higher value indicates that documents have been updated.


(Kartheek Gummaluri) #14

"@version": "1" So all indexed with @version :1 i.e total count is 2522


(Magnus Bäck) #15

yes this is the query
SELECT COUNT(DRG.[Id] )
FROM [Drug] AS [DRG]
LEFT OUTER JOIN [dbo].[DrugForm] AS [DRGFROM] ON [DRG].[DrugFormId] = [DRGFROM].[Id]
LEFT OUTER JOIN [dbo].[DrugUnit] AS [DRGUNIT] ON [DRG].[DrugUnitId] = [DRGUNIT].[Id]
LEFT OUTER JOIN [dbo].[DrugType] AS [DRGTYPE] ON [DRG].[DrugTypeId] = [DRGTYPE].[Id]
LEFT OUTER JOIN [DBO].[Drug_DrugCat] as [DRGCATEGORY] on [DRG].[ID] = [DRGCATEGORY].[DrugId]
LEFT OUTER JOIN [DBO].[DrugCat] as [DRGCAT] on [DRGCATEGORY].[DrugCatId] = [DRGCAT].[Id]

and the result count is 23026

No, that only counts the number of rows. To count the number of unique DRG.[Id] values you need:

SELECT DISTINCT COUNT(DRG.[Id]) ...

(Kartheek Gummaluri) #16

SELECT DISTINCT(COUNT(DRG.[Id]) )
FROM [Drug] AS [DRG]
LEFT OUTER JOIN [dbo].[DrugForm] AS [DRGFROM] ON [DRG].[DrugFormId] = [DRGFROM].[Id]
LEFT OUTER JOIN [dbo].[DrugUnit] AS [DRGUNIT] ON [DRG].[DrugUnitId] = [DRGUNIT].[Id]
LEFT OUTER JOIN [dbo].[DrugType] AS [DRGTYPE] ON [DRG].[DrugTypeId] = [DRGTYPE].[Id]
LEFT OUTER JOIN [DBO].[Drug_DrugCat] as [DRGCATEGORY] on [DRG].[ID] = [DRGCATEGORY].[DrugId]
LEFT OUTER JOIN [DBO].[DrugCat] as [DRGCAT] on [DRGCATEGORY].[DrugCatId] = [DRGCAT].[Id]

and the count is same as above i.e. 23026


(system) #17

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.