I have a Filebeat pipeline ingesting data from an end-user machine, where the logs may be stored for up to 30 days. My Logstash pipeline has the following settings:
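Roughly, the relevant output block looks like this (the index pattern and the fingerprint field are simplified placeholders, not my exact values):

```
output {
  elasticsearch {
    hosts       => ["https://es01:9200"]
    index       => "filebeat-%{+YYYY.MM.dd}"      # placeholder index pattern
    action      => "create"                        # fail instead of overwriting an existing document
    document_id => "%{[@metadata][fingerprint]}"   # unique ID computed earlier in the pipeline
  }
}
```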
Because I wanted to make sure the same log is never written into the database twice, I set the action to "create" and generated my own unique document_id. That setup works fine for me, but I recently had a situation where Filebeat was uninstalled and its registry folder on that machine was wiped clean. Now that it has been re-installed, it is trying to re-ingest 30 days' worth of logs and write them all again.
That results in errors, because the "create" action doesn't allow existing documents to be overwritten. I could change it to "update", but then another issue pops up: some of the indexes it would need to update are already in the warm tier and are no longer flagged as write indexes.
Any idea how I should handle this situation? Logstash is returning a lot of errors because "create" fails on existing documents, and that slows ingestion to a crawl.
I was thinking I could change the action to "update" and set doc_as_upsert to true, so that existing documents can be overwritten. Then I would need to somehow make the older indexes writable again. Should I reindex that older data into the hot tier for now and, once Logstash has overwritten all of the documents, move it back to warm? Does that sound like a reasonable thing to do? Are there other ways to deal with this issue?
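For reference, the change I have in mind would look roughly like this (same placeholders as above):

```
output {
  elasticsearch {
    hosts         => ["https://es01:9200"]
    index         => "filebeat-%{+YYYY.MM.dd}"
    action        => "update"
    doc_as_upsert => true                          # insert the document if it doesn't exist yet
    document_id   => "%{[@metadata][fingerprint]}"
  }
}
```

And if the warm indexes were made read-only via an index.blocks.write block (e.g. by an ILM readonly action), I'm assuming they could be unblocked with something like this (placeholder index name):

```
PUT /filebeat-2024.04.01/_settings
{
  "index.blocks.write": false
}
```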
Just saw this. The performance impact could be related to the volume of logs being written; I had a similar issue in the past.
One quick workaround would be to change the Logstash log level for the Elasticsearch output so that it only logs at ERROR or FATAL, for example. I'm not sure at which level the current "create" failures are logged.
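Logstash can change logger levels at runtime through its API, so something along these lines (assuming the default API port 9600) should quiet the output plugin down to ERROR:

```
curl -XPUT 'localhost:9600/_node/logging?pretty' \
  -H 'Content-Type: application/json' \
  -d '{ "logger.logstash.outputs.elasticsearch" : "ERROR" }'
```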
If these logs are spread across many files, you can set ignore_older to the number of days since the registry was reset.
If a file is older than ignore_older, Filebeat will add it to the registry with the offset set to the end of the file; once that has happened, you can simply revert or remove the ignore_older setting.
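For example, if the registry was wiped three days ago, something like this in filebeat.yml (the path is a placeholder):

```
filebeat.inputs:
  - type: log                   # or "filestream" on newer Filebeat versions
    paths:
      - /var/log/myapp/*.log    # placeholder path
    ignore_older: 72h           # skip files not modified in the last 3 days
```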
Similarly, you could just add a processor in Filebeat, or a filter in Logstash, that drops events older than the last message processed from the device prior to the registry reset.
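As a sketch of the Logstash variant, assuming you know the timestamp of the last event that was ingested before the reset (the date below is a placeholder):

```
filter {
  ruby {
    # cutoff = timestamp of the last event processed before the registry reset (placeholder value)
    init => "require 'time'; @cutoff = Time.parse('2024-05-01T00:00:00Z')"
    # cancel (drop) any event older than the cutoff
    code => "event.cancel if event.get('@timestamp').time < @cutoff"
  }
}
```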
Obviously I stand corrected, thank you. On reflection, I have deleted the above comment as a) it was not helpful and b) it was also factually wrong. Apologies.