Updating existing fields with new data pulled from .csv

Hi,

I'm trying to update existing fields with new data, and I'm not clear whether upsert or the update API is the way to go (I'm also a little confused about how exactly to implement either one for my use case).

Every week, I run the following: logstash -f C:\ELK\LOGCONF\Statistics.conf

The Statistics.conf file points to a .csv file, populated onto a local path
path => ["C:\cygwin64\home\Admin\Statistics.csv"]

I am using the Date filter in Logstash, since my data changes week to week, and I simply want each following week's values to "update" and fill in the prior entries.

In other words:

On September 17th (call it Week 1), I run logstash -f C:\ELK\LOGCONF\Statistics.conf
Some values are populated from the .csv, as in the following four columns.

Date Score Conditions Week

Sep-17 10 Good Week-1
Sep-24
Oct-1

On September 24th (call it Week 2), I run logstash -f C:\ELK\LOGCONF\Statistics.conf again.

Below is what I would like to see. Basically, I just want to add values to the fields already established from the .csv columns.

Date Score Conditions Week

Sep-17 10 Good Week-1
Sep-24 20 Better Week-2 (add new data pulled from .csv)

And on October 1st (Week 3), I would poll the .csv again to add the new values:

Oct-1 30 Excellent Week-3

Another point to note is that my Elasticsearch is temporarily mobile, so I press Ctrl-C after each weekly logstash -f run, only to resume the following week once the new weekly set of statistics has been generated and the .csv is made available.

Again, in this example, on October 1st I would run the logstash -f command again, and hope to continue incrementally adding data as I go. The problem I am trying to solve is that currently, when I import a new week's worth of data, I get duplicate documents in the Discover tab, irrespective of whether I use the Date or the timestamp.

In other words, I get the new data, but I also still have the old entries, so I am left with duplicates everywhere.

Is there a way to do this for my use case? Basically, I want to poll a .csv and pull new data into previously established fields.
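
To make the question concrete, this is roughly the shape of pipeline I have in mind, as an untested sketch rather than my actual Statistics.conf. It assumes the Date column is unique per row, so it can serve as the document ID, which should make a re-imported row overwrite the existing document instead of creating a duplicate. The host, the index name "statistics", the comma separator, and the column list are assumptions for illustration.

```
input {
  file {
    # Forward slashes are commonly recommended for the file input on Windows
    path => ["C:/cygwin64/home/Admin/Statistics.csv"]
    start_position => "beginning"
    # Assumption: "NUL" (the Windows null device) discards the read position,
    # so the whole file is read again on every weekly run
    sincedb_path => "NUL"
  }
}

filter {
  csv {
    separator => ","                                    # assumed delimiter
    columns => ["Date", "Score", "Conditions", "Week"]  # assumed column order
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]   # assumed host
    index => "statistics"         # hypothetical index name
    # Same Date value => same document ID => same document is updated
    document_id => "%{Date}"
    action => "update"
    # Create the document if it does not exist yet, otherwise merge the new fields in
    doc_as_upsert => true
  }
}
```

With something like this, I would expect the Week 2 run to fill in the Sep-24 document and leave a single Sep-17 document, but I have not confirmed that this is the right use of update versus upsert.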
