Logstash parsing for rich text documents


Can Logstash parse rich text documents? In my use case the input is not logs but documents, from which I want to build some KPIs in Kibana. I have two questions:

  1. Can Logstash do this with an existing plugin, or would I have to create a custom one?
  2. Instead of using Logstash, should I convert the documents to JSON first and then send them to ES directly?


Assuming "rich text" means RTF:

  1. You'd need a custom plugin.
  2. That might be easier, yes. It's not obvious that Logstash brings any significant benefits in this case.
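
As a rough illustration of option 2, here is a minimal sketch that turns already-extracted document text into JSON and serializes it into the newline-delimited body the Elasticsearch Bulk API accepts. It assumes the RTF-to-text extraction has been done elsewhere; the function names, the field names (`filename`, `content`, `word_count`) and the index name are hypothetical, and older Elasticsearch versions may also expect a `_type` in the action line.

```python
import json


def to_es_document(filename, text):
    """Build a JSON-serializable dict from already-extracted document text.

    The field names here are illustrative, not a required schema.
    """
    return {
        "filename": filename,
        "content": text,
        "word_count": len(text.split()),
    }


def to_bulk_body(index_name, docs):
    """Serialize docs as Bulk API NDJSON: one action line, then one source line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

The resulting string could then be sent to the `_bulk` endpoint with any HTTP client, using the `application/x-ndjson` content type.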

@magnusbaeck: I found a solution to this problem. With the mapper-attachments plugin, rich text, PDF, and Office documents can be indexed in Elasticsearch. To make it real time, the fs-crawler plugin can be used.
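
For anyone trying this, a minimal sketch of what the indexing side might look like. The plugin expects the file content as a base64-encoded string in a field mapped with type `attachment`; the field name `file` and the helper name below are my own assumptions, and the exact mapping options depend on the plugin version.

```python
import base64
import json

# Mapping that declares an attachment field; "file" is a hypothetical field name.
ATTACHMENT_MAPPING = {"properties": {"file": {"type": "attachment"}}}


def build_attachment_doc(raw_bytes, field="file"):
    """Wrap raw file bytes as a base64 string under the attachment field."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {field: encoded}


if __name__ == "__main__":
    # Tiny inline RTF sample instead of reading a real file from disk.
    sample = b"{\\rtf1 Hello from a rich text document}"
    print(json.dumps(ATTACHMENT_MAPPING))
    print(json.dumps(build_attachment_doc(sample)))
```

The first JSON object would go into the index mapping and the second would be the document body to index; the plugin then extracts the text on the Elasticsearch side.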