Building ETL from ES to CSV, last run date needed


(Martin Cover 42 Group GmbH) #1

Hello,

I'm using Logstash to extract data from Elasticsearch and store it in a CSV file (eventually it should end up in an S3 bucket, but the output part is not so important here). I set up the following configuration:

input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "trip-logs"
    docinfo => true
    query => '{
      "_source": ["asset", "recorded_at_ms", "fields.BATT"],
      "sort": [
        { "recorded_at": { "order": "desc" } }
      ],
      "size": 10000,
      "query": {
        "constant_score": {
          "filter": {
            "terms": {
              "fields.BATT": [0, 1]
            }
          }
        }
      }
    }'
  }
}

output {
  csv {
    fields => ["asset", "recorded_at_ms", "[fields][BATT]"]
    path => "C:/Elasticsearch/output.csv"
  }
}

My problem is that I'm really missing a last run date. I want to include it in the query so that I only get the new results. What I'm trying to achieve is a more or less classic ETL.

So far I have figured out that, e.g., the JDBC input plugin and the file input plugin keep a kind of last run state (sql_last_value and the sincedb file, respectively), but the Elasticsearch input plugin does not.

Could anyone give me a solution or a hint on how to solve this problem?
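To make the question concrete: one workaround I have been considering is to keep the last run date outside of Logstash myself and inject it into the query via Logstash's environment-variable substitution (${VAR:default}). The LAST_RUN variable, the surrounding wrapper script, and the 1970 default are my own invention, not a plugin feature; this is just a sketch of the idea:

input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "trip-logs"
    docinfo => true
    query => '{
      "_source": ["asset", "recorded_at_ms", "fields.BATT"],
      "query": {
        "bool": {
          "filter": [
            { "terms": { "fields.BATT": [0, 1] } },
            { "range": { "recorded_at": { "gt": "${LAST_RUN:1970-01-01T00:00:00Z}" } } }
          ]
        }
      }
    }'
  }
}

A wrapper script would then read the previous timestamp from a file, export it as LAST_RUN, record the current time, run Logstash, and write the recorded time back to the file for the next run. That gives the incremental behavior by hand, but it feels fragile, which is why I'm asking whether there is a built-in way.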

Best

Martin

