Mapper-attachment plugin for 6.x

Upfront: I know about "mapper-attachments" but the plugin does not seem to be compatible with ES6, only with older versions.

Detailed problem:

Hi, I'm new to Elasticsearch and Logstash, coming from Solr. I am looking for something to index emails from IMAP, including their attachments, that can index an existing IMAP account (not only new, unread mails as they arrive).

Unfortunately, I find the documentation incomplete. For example, the Logstash imap plugin page does not say what exactly "strip_attachments" does, and it makes no difference what value I set.

https://www.elastic.co/guide/en/logstash/current/plugins-inputs-imap.html#plugins-inputs-imap-strip_attachments

I am especially confused about this option, since according to the open issue, it seems that attachments aren't actually paid attention to anyway.

I followed the tutorial on https://qbox.io/blog/indexing-emails-to-elasticsearch-logstash-imap

I can get the general indexing to work, but without attachments and without being able to index the entire IMAP account.

I also looked at the tutorial at

but this refers to the old Elasticsearch 2.4.2 and is outdated; it doesn't work with ES6. When I try to install the "mapper-attachments" plugin it mentions on my current ES6, I get the message "ERROR: Unknown plugin mapper-attachments".

Looking at the GitHub repo of this plugin (https://github.com/elastic/elasticsearch-mapper-attachments), I see that it's only compatible with ES versions < 6.

I am actually looking for something more complete that works with ES6, similar to the Solr equivalent:
https://lucene.apache.org/solr/4_5_0/solr-dataimporthandler-extras/org/apache/solr/handler/dataimport/MailEntityProcessor.html
where you can:

  • define the timestamp from which point on you want to index (including older mails)
  • index attachments

Of course, this could be coded as a custom plugin or as a script that reads all mails and pushes them to Elasticsearch, but I would think something like this must already exist for Elasticsearch, given how long it has been on the market. Maybe I just haven't found it yet.
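For reference, the "script that reads all mails" route could look roughly like the sketch below. It uses only the Python standard library, is not an existing Elasticsearch tool, and makes several assumptions: the helper names (imap_date, message_to_doc, fetch_since, index_doc) are made up for illustration, the document layout (an "attachments" array with base64 "data") is hypothetical, and the index URL simply mirrors the "emails" index and "email" type from my config.

```python
import base64
import datetime
import email
import imaplib
import json
import urllib.request
from email import policy
from email.utils import parsedate_to_datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(day: datetime.date) -> str:
    """Format a date for IMAP SEARCH (e.g. 05-Jan-2018), independent of locale."""
    return "{:02d}-{}-{}".format(day.day, MONTHS[day.month - 1], day.year)


def _header(msg, name):
    """Return a header as a plain string, or None if it is missing."""
    value = msg[name]
    return str(value) if value is not None else None


def message_to_doc(raw_bytes: bytes) -> dict:
    """Convert one raw RFC 822 message into a JSON-serializable dict."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    attachments = []
    for part in msg.iter_attachments():
        payload = part.get_payload(decode=True) or b""
        attachments.append({
            "filename": part.get_filename(),
            "content_type": part.get_content_type(),
            # base64 so that binary content survives the trip through JSON
            "data": base64.b64encode(payload).decode("ascii"),
        })
    date = _header(msg, "Date")
    return {
        "subject": _header(msg, "Subject"),
        "from": _header(msg, "From"),
        "to": _header(msg, "To"),
        "date": parsedate_to_datetime(date).isoformat() if date else None,
        "body": body.get_content() if body is not None else "",
        "attachments": attachments,
    }


def fetch_since(host, user, password, since: datetime.date, folder="INBOX"):
    """Yield the raw bytes of every message in `folder` dated on or after `since`."""
    conn = imaplib.IMAP4_SSL(host, 993)
    try:
        conn.login(user, password)
        conn.select(folder, readonly=True)  # read-only: do not mark mails as seen
        _, data = conn.search(None, "SINCE", imap_date(since))
        for num in data[0].split():
            _, parts = conn.fetch(num, "(RFC822)")
            yield parts[0][1]
    finally:
        conn.logout()


def index_doc(doc, url="http://localhost:9200/emails/email"):
    """POST one document to Elasticsearch (URL is an assumption based on my config)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example use (needs a reachable IMAP server and Elasticsearch node):
# for raw in fetch_since("mail.mailserver.com", "myuser", "secret",
#                        datetime.date(2018, 1, 1)):
#     index_doc(message_to_doc(raw))
```

This covers both points from the list above (a start date via IMAP SEARCH SINCE, and attachments), but as written the attachments are only stored as base64 and are not yet searchable as text.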

I already found other related Q&A on StackOverflow:


but this uses a totally different tool for reading the IMAP account, not what I had in mind.

My configuration:

#email_log.conf
input {
    imap {
        host => "mail.mailserver.com"
        password => "secret"
        user => "myuser"
        port => 993
        check_interval => 30
        folder => "Inbox"
        strip_attachments => false
    }
}
output {
    stdout { codec => rubydebug }
    elasticsearch {
        index => "emails"
        document_type => "email"
        hosts => "localhost:9200"
    }
}

It is better to answer your own question than to try to remove the post, so that people with the same question will get an answer.

In that case, have a look at the ingest attachment plugin, or at the FSCrawler project.
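As a rough illustration of the ingest attachment route: the plugin is installed with "bin/elasticsearch-plugin install ingest-attachment" and adds an "attachment" processor that extracts text from a base64-encoded field. The Python sketch below builds a pipeline that runs that processor over every element of an array using "foreach". The function names, the pipeline name "email_attachments", and the document layout (an "attachments" array whose items have a base64 "data" field) are assumptions for illustration; the host comes from the config in the question.

```python
import json
import urllib.request

ES = "http://localhost:9200"  # host taken from the config in the question


def attachment_pipeline(array_field="attachments", data_field="data"):
    """Build an ingest pipeline body that parses each item of `array_field`.

    Assumes every item holds base64-encoded file content under `data_field`;
    the extracted text and metadata land under `<item>.attachment`.
    """
    return {
        "description": "Extract text from each email attachment",
        "processors": [
            {
                "foreach": {
                    "field": array_field,
                    "processor": {
                        "attachment": {
                            "field": "_ingest._value." + data_field,
                            "target_field": "_ingest._value.attachment",
                        }
                    },
                }
            }
        ],
    }


def put_pipeline(name, body, base_url=ES):
    """Create or replace the pipeline via PUT _ingest/pipeline/<name>."""
    req = urllib.request.Request(
        "{}/_ingest/pipeline/{}".format(base_url, name),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example use (needs a running node with the ingest-attachment plugin):
# put_pipeline("email_attachments", attachment_pipeline())
```

Documents are then indexed with "?pipeline=email_attachments" appended to the index request. Note that the Logstash imap input would still have to deliver the attachment bytes for this to help, so something else may be needed to read the mailbox.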
