How do I ingest archived data (zip, tar.gz, etc)?

manbearpig · January 16, 2017, 9:01am

I am working on an archiving product and we have been looking to replace our indexing/search tool with Elasticsearch.

Our current process is roughly like this:

Processing Node

Incoming Data is processed
Data is archived as .zip or .tar.gz
Data is uploaded to a shared folder

Ingesting Node

Our tool scans the shared folder on an interval
It grabs each new archive and ingests it into our indexing tool
It deletes the archive from the shared folder
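
To make the ingesting-node steps concrete, here is a minimal Python sketch of that loop. It is only an illustration, not our actual tool: `iter_members`, `ingest_folder`, and the `index_batch` callback (a stand-in for whatever would send documents to Elasticsearch) are hypothetical names, and it assumes every archive member is UTF-8 text.

```python
import tarfile
import zipfile
from pathlib import Path
from typing import Callable, Dict, Iterator, List, Tuple

# Archive types named in the post; ".tgz" is an assumed alias for ".tar.gz".
ZIP_SUFFIXES = (".zip",)
TAR_SUFFIXES = (".tar.gz", ".tgz")


def iter_members(archive: Path) -> Iterator[Tuple[str, bytes]]:
    """Yield (member name, raw bytes) for each regular file in the archive."""
    if archive.name.endswith(ZIP_SUFFIXES):
        with zipfile.ZipFile(archive) as zf:
            for info in zf.infolist():
                if not info.is_dir():
                    yield info.filename, zf.read(info)
    elif archive.name.endswith(TAR_SUFFIXES):
        with tarfile.open(archive, "r:gz") as tf:
            for member in tf:
                if member.isfile():
                    yield member.name, tf.extractfile(member).read()
    else:
        raise ValueError(f"unsupported archive: {archive}")


def ingest_folder(
    shared_dir: Path,
    index_batch: Callable[[List[Dict[str, str]]], None],
) -> int:
    """Ingest every archive currently in shared_dir; return how many were handled.

    index_batch is a hypothetical callback that receives one list of
    documents per archive and is expected to raise on failure.
    """
    processed = 0
    for archive in sorted(shared_dir.iterdir()):
        if not archive.name.endswith(ZIP_SUFFIXES + TAR_SUFFIXES):
            continue  # not an archive, leave it alone
        docs = [
            {
                "archive": archive.name,
                "path": name,
                # Assumption: members are text; undecodable bytes are replaced.
                "content": data.decode("utf-8", errors="replace"),
            }
            for name, data in iter_members(archive)
        ]
        index_batch(docs)
        archive.unlink()  # delete only after indexing did not raise
        processed += 1
    return processed
```

Calling `ingest_folder` from a loop with `time.sleep(...)` between calls would reproduce the interval scan. The sketch does not guard against archives that are still being uploaded when the scan runs.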

I have looked into Beats and Logstash, and also the new "ingest node" feature in Elasticsearch 5.0. So far I have not come across a natural solution for this.

Is this doable out of the box (OOTB) without much work?

I am not familiar with Ruby, and it seems that most of the plugins/filters are written in it.
