Hello All,
I have a Logstash configuration that uses the following output block in an attempt to mitigate duplicates in an Elasticsearch index. The data Logstash ingests comes from a Perl script that runs on the server every X minutes (for example, every 10 minutes).
This works when Logstash sees the same document in the same index, but the command that generates the input data does not emit documents at a reliable rate, so Logstash sometimes inserts duplicate documents into a different daily index after rollover. In other words, when the day's date stamp rolls over and the same document appears again, Elasticsearch/Logstash treats it as a new document.
Ideally I need only one document entry across all the daily indices for a given unique id. The reason: if there is any update to that document, the existing document with that id should be updated in place, rather than a new document being created with the same id.
Can this be achieved? Any suggestion would be helpful.
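To make the goal concrete: as far as I understand, Elasticsearch only enforces _id uniqueness within a single index, so an upsert can never deduplicate across daily rollover indices. I suspect the only way to get one document per id is to write to a single index (or a write alias whose pattern contains no date math). This sketch is what I have in mind; using the rollover alias name as a plain static index is my own assumption:

```
output {
  elasticsearch {
    hosts         => "http://abc09appl008.dev.dm01.group.arg:9200"
    index         => "tis-monitor-viewvolumes"   # single write target, no {now/d} date math
    action        => "update"                    # update action so doc_as_upsert takes effect
    doc_as_upsert => true
    document_id   => "%{[volume][volumeName]}"   # stable id, same as in my config
  }
}
```

Is something like this the right direction, or is there a way to keep daily rollover and still deduplicate?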
input {
  exec {
    command  => '. ../scripts/viewvolumes/run_viewvolumes.sh'
    schedule => "0 */10 * * * *"
  }
}

filter {
  if [message] =~ "^\{.*\}[\s\S]*$" {
    json {
      source       => "message"
      target       => "parsed_json"
      remove_field => "message"
    }
    split {
      field        => "[parsed_json][viewVolumesResponse]"
      target       => "volume"
      remove_field => [ "parsed_json" ]
    }
  }
  else {
    drop { }
  }
}

output {
  elasticsearch {
    hosts               => "http://abc09appl008.dev.dm01.group.arg:9200"
    ilm_pattern         => "{now/d}-000001"
    ilm_rollover_alias  => "tis-monitor-viewvolumes"
    ilm_policy          => "tis-monitor-viewvolumes-policy"
    doc_as_upsert       => true
    document_id         => "%{[volume][volumeName]}"
  }
}
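One variation I have been considering (not in my current config) is deriving the document id from a hash of volumeName instead of interpolating the field directly, so that events where the field is missing don't all collapse onto the literal id "%{[volume][volumeName]}". A sketch, assuming the standard logstash-filter-fingerprint plugin is available:

```
filter {
  fingerprint {
    source => ["[volume][volumeName]"]
    method => "SHA256"
    target => "[@metadata][doc_id]"   # @metadata fields are not written to the index
  }
}
output {
  elasticsearch {
    # ...same connection/ILM options as in my config above...
    document_id => "%{[@metadata][doc_id]}"
  }
}
```

Would that change anything with respect to the cross-index duplicates, or does the same per-index _id limitation apply?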
@ChinigamiHunter Can you help with this?
Thanks,