Hi,
I'm trying to update existing fields with new data, and I'm not clear whether the upsert or the update API is the way to go (I'm also a little confused about how exactly to implement either one for my use case).
Every week, I run the following: logstash -f C:\ELK\LOGCONF\Statistics.conf
The Statistics.conf file points to a .csv file that is written to a local path:
path => ["C:\cygwin64\home\Admin\Statistics.csv"]
I am using the Date filter in Logstash, since my data changes week to week, and I simply want each following week's values to "update" and fill in the prior entries.
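For context, here is a minimal sketch of the kind of pipeline I mean (not my exact file). The column names are taken from the example table below; the date pattern, start_position, and sincedb_path settings are assumptions for illustration only.

```
input {
  file {
    path => ["C:\cygwin64\home\Admin\Statistics.csv"]
    # Assumption: read the file from the top rather than only newly appended lines
    start_position => "beginning"
    # Assumption: "NUL" (the Windows null device) keeps no read position,
    # so the whole file is read again on every run
    sincedb_path => "NUL"
  }
}

filter {
  csv {
    separator => ","
    # Column names as in the example table
    columns => ["Date", "Score", "Conditions", "Week"]
  }
  date {
    # Assumption: values look like "Sep-17"; the data carries no year
    match => ["Date", "MMM-d"]
  }
}
```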
In other words:
On September 17th (call it Week 1), I run logstash -f C:\ELK\LOGCONF\Statistics.conf
Some values are populated from the .csv, shown here in the following 4 columns.
Date Score Conditions Week
Sep-17 10 Good Week-1
Sep-24
Oct-1
On September 24th (call it Week 2), I run logstash -f C:\ELK\LOGCONF\Statistics.conf again.
This is what I would like to see afterwards. Basically, I just want to add values to the column fields that were already established from the .csv.
Date Score Conditions Week
Sep-17 10 Good Week-1
Sep-24 20 Better Week-2 (new data pulled from the .csv)
Oct-1 30 Excellent Week-3 (on this date, poll the .csv again to add the new values)
Another point to note is that my Elasticsearch is temporarily mobile, so I press Ctrl-C to stop each weekly logstash -f run, and only resume the following week once the new weekly set of statistics has been generated and the .csv is available.
Again, in this example, on October 1st I would run the logstash -f command again and hope to continue adding data incrementally as I go. The problem I am trying to solve is that currently, when I import a new week's worth of data, I get duplicate documents in the Discover tab, irrespective of whether I use the Date or the timestamp.
In other words, I get the new data, but I also still have the old entries, so I am left with duplicates everywhere.
Is there a way to do this for my use case? Basically, I want to poll a .csv and pull new data into the column fields that were established earlier.
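From what I have read so far, one possible approach is to give each row a fixed document ID, so that a re-import updates the existing document instead of creating a new one. Below is a rough, untested sketch of what I imagine the output section would look like. It assumes the Date column is unique per row; the host and the index name "statistics" are placeholders, not my real settings.

```
output {
  elasticsearch {
    # Placeholder host and index name
    hosts => ["localhost:9200"]
    index => "statistics"
    # Assumption: Date is unique per row, so it can serve as the document ID
    document_id => "%{Date}"
    # Update the document if the ID already exists, otherwise create it
    action => "update"
    doc_as_upsert => true
  }
}
```

Since the Date values carry no year, I assume these IDs would collide from one year to the next, so a different key (or a hash of several columns) might be needed. Is this the right direction?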