How do I ingest archived data (zip, tar.gz, etc)?


#1

I am working on an archiving product and we have been looking to replace our indexing/search tool with Elasticsearch.

Our current process is roughly like this:

Processing Node

  1. Incoming Data is processed
  2. Data is archived as .zip or .tar.gz
  3. Data is uploaded to a shared folder

Ingesting Node

  1. Our tool scans the shared folder on an interval
  2. It grabs each new archive and ingests it into our indexing tool
  3. Delete the archive from the shared folder
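
For illustration, the ingesting-node steps above could be sketched in Python with only the standard library. This is a rough sketch under assumptions, not an Elasticsearch feature: `iter_archive_members`, `ingest_new_archives` and the `index_doc` callback are hypothetical names, and `index_doc` stands in for whatever actually sends one document to the index. It also assumes the archived files are text and that an archive is fully uploaded before the scan sees it.

```python
import tarfile
import zipfile
from pathlib import Path
from typing import Callable, Iterator, Tuple


def iter_archive_members(archive: Path) -> Iterator[Tuple[str, bytes]]:
    """Yield (member_name, raw_bytes) for each regular file in a .zip or .tar.gz."""
    if archive.name.endswith(".zip"):
        with zipfile.ZipFile(archive) as zf:
            for info in zf.infolist():
                if not info.is_dir():
                    yield info.filename, zf.read(info)
    elif archive.name.endswith(".tar.gz"):
        with tarfile.open(archive, "r:gz") as tf:
            for member in tf:
                if member.isfile():
                    yield member.name, tf.extractfile(member).read()
    else:
        raise ValueError(f"unsupported archive type: {archive.name}")


def ingest_new_archives(shared_dir: Path, index_doc: Callable[[dict], None]) -> int:
    """Run one polling pass over shared_dir; return the number of archives handled.

    index_doc is a hypothetical callback that sends one document to the index.
    An archive is deleted only after all of its members were passed to index_doc
    without an exception being raised.
    """
    processed = 0
    for archive in sorted(shared_dir.iterdir()):
        if not archive.is_file() or not archive.name.endswith((".zip", ".tar.gz")):
            continue
        for name, data in iter_archive_members(archive):
            index_doc({
                "archive": archive.name,
                "path": name,
                # Assumption: members are text; undecodable bytes are replaced.
                "content": data.decode("utf-8", errors="replace"),
            })
        archive.unlink()  # step 3: remove the archive from the shared folder
        processed += 1
    return processed
```

Calling `ingest_new_archives(Path("/mnt/shared"), my_index_function)` on a timer would reproduce the interval scan; the path and `my_index_function` are placeholders.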

I have looked into Beats and Logstash, and also the new "ingest node" feature in Elasticsearch 5.0. So far I have not come across a natural solution for this.

Is this something that is doable out of the box (OOTB) without much work?

I am not familiar with Ruby, and it seems that most of the plugins/filters are written in it.

