I have a big CSV log file that receives logs from different pieces of equipment.
My Logstash pipeline reads the lines and sends them to a daily index.
I need to keep each piece of equipment's last report in a separate index.
Too many of my visualizations are slow because they use a top_hits aggregation to get this value, and I think I shouldn't need to do that.
My plan is:
import logs normally into logstash-*, as today
create a second index, 'last_report', where the document id is equipment_id
So every new line will actually be an update on the last_report index.
Is this possible with Logstash, or do I need to think about it in a different way?
That is quite a common method of making sure the latest state can be retrieved efficiently. It should be fine to do that with Logstash, at least as long as you do not have documents being updated very frequently.
But my problem is that one of them needs an auto-generated id (the default), while the other one's id must be equipment_id, so subsequent lines will just update old documents or insert new ones.
That is not very frequent, so it should not be a problem. I would recommend configuring two elasticsearch output plugins: one that writes the documents with an auto-generated id into the existing index, and one that writes them to the new index using equipment_id as the document id. This will result in an insert the first time and an update every time after that.
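Something like this output section would do it (a minimal sketch: the hosts value and the daily index pattern are assumptions based on a default setup, and the events are assumed to already carry an equipment_id field parsed from your CSV lines):

```
output {
  # Daily index, documents get auto-generated ids (as today)
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash-%{+YYYY.MM.dd}"
  }

  # Same event again, keyed by equipment_id: the first event for a
  # device creates the document, every later event overwrites it,
  # so the index always holds the last report per equipment
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "last_report"
    document_id => "%{equipment_id}"
  }
}
```

Because the second output uses the default index action with a fixed document_id, each new event simply replaces the previous document for that equipment, so no explicit update logic is needed.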