I'm trying to index records that are fetched from SQL Server into Elasticsearch. I currently have 25k records, but when I run my logstash.conf file, Logstash indexes up to 2000 records and then terminates. Can anyone tell me what the reason might be? I will need to index 1 million records in the future.
jdbc_user => "sa"
jdbc_password => "sails123"
statement => " SELECT DRG.[TenantId] AS [TENANTID],DRG.[Id] AS [id],DRG.[DrugCode] AS [drugcode],DRG.[BRANDNAME] AS [DrugBandName], DRG.[QuickCode] AS [quickcode],
DRG.[Name] AS [drugname],DRG.[BillingNDC] AS [ndc],[DRGTYPE].Remarks AS [DRUGTYPE],
DRG.[GenericName] AS [GenericName], DRG.[Strength] AS [Strength], DRG.[ManufactId] AS [ManufactId], DRG.[AWPPack] AS [AWPPack],
DRG.[DirectUnitPrice] AS [DirectUnitPrice],DRG.[CostPack] AS [CostPack],DRG.[UnitPriceCost] AS [UnitPriceCost],DRG.[QtyPack] AS [QUANTYPACK],
[DRG].[DrugFormId] AS [DrugFormId], [DRG].[DrugUnitId] AS [DrugUnitId], [DRG].[DrugClass] AS [DrugClass],
[DRGCAT].[Name] AS [DrugCategory],
[DRG].[IsDeleted] AS [IsDeleted], [DRG].[IsActive] AS [IsActive],
[DRG].[CreatedDtTm] AS [DrugCreatedDate],[DRG].[ModifiedDtTm] AS [DrugModifiedDate]
FROM [Drug] AS [DRG]
LEFT OUTER JOIN [dbo].[DrugForm] AS [DRGFROM] ON [DRG].[DrugFormId] = [DRGFROM].[Id]
LEFT OUTER JOIN [dbo].[DrugUnit] AS [DRGUNIT] ON [DRG].[DrugUnitId] = [DRGUNIT].[Id]
LEFT OUTER JOIN [dbo].[DrugType] AS [DRGTYPE] ON [DRG].[DrugTypeId] = [DRGTYPE].[Id]
LEFT OUTER JOIN [DBO].[Drug_DrugCat] as [DRGCATEGORY] on [DRG].[ID] = [DRGCATEGORY].[DrugId]
LEFT OUTER JOIN [DBO].[DrugCat] as [DRGCAT] on [DRGCATEGORY].[DrugCatId] = [DRGCAT].[Id] where [DRG].[ModifiedDtTm] >= :sql_last_value"
use_column_value => false
tracking_column => "ModifiedDtTm"
tracking_column_type => "timestamp"
record_last_run => true
clean_run => true
The filter part of this file is commented out to indicate that it is
optional.
output {
  elasticsearch {
    index => "micromerchant1"
    document_type => "drug"
    action => "update" # if want to update existing index data based on ID column
    #ssl => true # if node is on SSL
    hosts => ["localhost:9200"]
    doc_as_upsert => true
    action => "update"
    document_id => "%{id}"
  }
  stdout { codec => json_lines }
}
Yes, this is the query:
SELECT COUNT(DRG.[Id] )
FROM [Drug] AS [DRG]
LEFT OUTER JOIN [dbo].[DrugForm] AS [DRGFROM] ON [DRG].[DrugFormId] = [DRGFROM].[Id]
LEFT OUTER JOIN [dbo].[DrugUnit] AS [DRGUNIT] ON [DRG].[DrugUnitId] = [DRGUNIT].[Id]
LEFT OUTER JOIN [dbo].[DrugType] AS [DRGTYPE] ON [DRG].[DrugTypeId] = [DRGTYPE].[Id]
LEFT OUTER JOIN [DBO].[Drug_DrugCat] as [DRGCATEGORY] on [DRG].[ID] = [DRGCATEGORY].[DrugId]
LEFT OUTER JOIN [DBO].[DrugCat] as [DRGCAT] on [DRGCATEGORY].[DrugCatId] = [DRGCAT].[Id]
That was not what I asked. Please look at a few documents and check what the value of _version is.
If the id field is unique across all records, this value should never really be greater than 1, as a higher value indicates that documents have been updated.
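One way to do this check is sketched below in Python. It assumes the host and index name from the output section above (localhost:9200 and micromerchant1) and asks the search API to include _version by setting "version": true in the request body. The helper names fetch_hits and updated_ids are made up for illustration.

```python
import json
import urllib.request

# Assumption: host and index name match the Logstash output section above.
ES_URL = "http://localhost:9200/micromerchant1/_search"


def fetch_hits(size=5):
    """Fetch a few documents and ask Elasticsearch to include _version."""
    body = json.dumps({"size": size, "version": True}).encode("utf-8")
    req = urllib.request.Request(
        ES_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["hits"]["hits"]


def updated_ids(hits):
    """Return the _id of every hit whose _version is greater than 1."""
    return [h["_id"] for h in hits if h.get("_version", 1) > 1]
```

Calling updated_ids(fetch_hits()) against a running node lists the sampled documents that have been written more than once. If that list is not empty, several rows from the query are probably mapping to the same document_id.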
Yes, this is the query:
SELECT COUNT(DRG.[Id] )
FROM [Drug] AS [DRG]
LEFT OUTER JOIN [dbo].[DrugForm] AS [DRGFROM] ON [DRG].[DrugFormId] = [DRGFROM].[Id]
LEFT OUTER JOIN [dbo].[DrugUnit] AS [DRGUNIT] ON [DRG].[DrugUnitId] = [DRGUNIT].[Id]
LEFT OUTER JOIN [dbo].[DrugType] AS [DRGTYPE] ON [DRG].[DrugTypeId] = [DRGTYPE].[Id]
LEFT OUTER JOIN [DBO].[Drug_DrugCat] as [DRGCATEGORY] on [DRG].[ID] = [DRGCATEGORY].[DrugId]
LEFT OUTER JOIN [DBO].[DrugCat] as [DRGCAT] on [DRGCATEGORY].[DrugCatId] = [DRGCAT].[Id]
The result count is 23026.
No, that only counts the number of rows. To count the number of unique DRG.[Id] values you need:
SELECT COUNT(DISTINCT DRG.[Id])
FROM [Drug] AS [DRG]
LEFT OUTER JOIN [dbo].[DrugForm] AS [DRGFROM] ON [DRG].[DrugFormId] = [DRGFROM].[Id]
LEFT OUTER JOIN [dbo].[DrugUnit] AS [DRGUNIT] ON [DRG].[DrugUnitId] = [DRGUNIT].[Id]
LEFT OUTER JOIN [dbo].[DrugType] AS [DRGTYPE] ON [DRG].[DrugTypeId] = [DRGTYPE].[Id]
LEFT OUTER JOIN [DBO].[Drug_DrugCat] as [DRGCATEGORY] on [DRG].[ID] = [DRGCATEGORY].[DrugId]
LEFT OUTER JOIN [DBO].[DrugCat] as [DRGCAT] on [DRGCATEGORY].[DrugCatId] = [DRGCAT].[Id]
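To see why the row count and the number of unique ids can differ, here is a small self-contained sketch using Python's built-in sqlite3 module rather than SQL Server. The two tables are simplified stand-ins for Drug and Drug_DrugCat, and the sample rows are invented. A drug that belongs to two categories comes back as two rows from the join, so COUNT(Id) is larger than COUNT(DISTINCT Id).

```python
import sqlite3

# In-memory database with simplified, hypothetical versions of the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Drug (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Drug_DrugCat (DrugId INTEGER, DrugCatId INTEGER);
    INSERT INTO Drug VALUES (1, 'A'), (2, 'B');
    -- Drug 1 belongs to two categories, drug 2 to none.
    INSERT INTO Drug_DrugCat VALUES (1, 10), (1, 11);
""")

# Same join shape as the query above, reduced to the many-to-many table.
rows, unique_ids = conn.execute("""
    SELECT COUNT(DRG.Id), COUNT(DISTINCT DRG.Id)
    FROM Drug AS DRG
    LEFT OUTER JOIN Drug_DrugCat AS DRGCATEGORY
        ON DRG.Id = DRGCATEGORY.DrugId
""").fetchone()

print(rows, unique_ids)  # prints "3 2": three joined rows, two unique ids
```

With document_id => "%{id}", every extra row for the same id becomes an update of an existing document instead of a new one, so the index can end up holding fewer documents than the query returns rows.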