I have a pipeline set up to "refresh" a CSV file from a database; the CSV is consumed later by a "translate" step.
What I'm seeing is that every time this pipeline runs, the entire CSV output gets appended to the named file.
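For reference, here is roughly what the setup looks like (connection details, paths, field names, and the schedule are illustrative placeholders, not my real values):

```
input {
  jdbc {
    # Driver, connection string, and query are placeholders.
    jdbc_driver_library    => "/opt/drivers/postgresql.jar"
    jdbc_driver_class      => "org.postgresql.Driver"
    jdbc_connection_string => "jdbc:postgresql://db:5432/app"
    jdbc_user              => "logstash"
    statement              => "SELECT key, value FROM lookup_table"
    schedule               => "*/5 * * * *"   # re-run every five minutes
  }
}

output {
  csv {
    path   => "/etc/logstash/dictionaries/lookup.csv"
    fields => ["key", "value"]
    # Default behavior: each scheduled run appends, so the file keeps growing.
  }
}
```

The other pipeline then uses that file as a dictionary, something like:

```
filter {
  translate {
    source           => "key"   # "field" in older versions of the plugin
    target           => "value_resolved"
    dictionary_path  => "/etc/logstash/dictionaries/lookup.csv"
    refresh_interval => 300     # seconds between dictionary re-reads
  }
}
```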
From reading the Ruby source, the CSV output is derived from the file output, so it is possible (though undocumented) to pass the file output's "write_behavior" option to the CSV output (see the sketch after this list). However, "write_behavior" appears to support exactly two values:
- "append" => append every event to the file (the file will keep growing)
- "overwrite" => overwrite the file each time an event is processed (the file will always contain at most one line)
What I need is for the CSV file to be replaced, each time the pipeline runs, with the most recent data obtained from the JDBC source.
If you are wondering why I don't just use jdbc_static: I would like to, but I am still trying to figure out why the jdbc_static filter seems unable to process more than about 200 events per second. I need a single Logstash instance to handle about 7,500 events per second.