Attachments on ELK

Hello,

I have some attachments that need to be ingested into Elasticsearch.
They are of varying types: PDF, TXT, CSV, XLS, DOC, and MSG (Outlook emails).

From what I could find, I can use the Ingest Attachment Processor plugin to extract the content of my files into Elasticsearch.
Several forum threads cover most of these file types. Outlook emails make up the majority of our documents. Would the ingest processor be able to handle them?
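
For the file types the processor does handle, the setup has two parts: a pipeline containing an `attachment` processor, and documents that carry the file as a Base64 string. A minimal Python sketch of the two request bodies follows. The pipeline name, the field name `data`, and the helper names are assumptions chosen for illustration. The sketch only builds the JSON and does not send it, and it says nothing about whether MSG files are parsed.

```python
import base64
import json


def build_pipeline(field="data"):
    """Body for PUT _ingest/pipeline/<name>. The field name is an assumption."""
    return {
        "description": "Extract text and metadata from attachments",
        "processors": [{"attachment": {"field": field}}],
    }


def build_document(file_name, raw_bytes, field="data"):
    """Body for PUT <index>/_doc/<id>?pipeline=<name>.

    The attachment processor expects the file content Base64-encoded.
    """
    return {
        "fileName": file_name,
        field: base64.b64encode(raw_bytes).decode("ascii"),
    }


if __name__ == "__main__":
    print(json.dumps(build_pipeline(), indent=2))
    print(json.dumps(build_document("notes.txt", b"hello elastic"), indent=2))
```

With the processor's default settings, the extracted text should end up under `attachment.content` in the indexed document.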

Also, just for my knowledge: the above approach reads the contents of a file and ingests them into Elasticsearch, so a search through the API would return only the extracted contents.
Is there a way to load the file into Elasticsearch as is, without extracting data from it, so that I can index on the fileName and retrieve the whole file? If yes, how would the end user view the file?
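
One possible way to keep a file unextracted is a `binary` field, which accepts a Base64 string and is not searchable itself, next to a `keyword` field for the name. The sketch below is an assumption-laden illustration, not a recommendation: the field names `fileName` and `fileData` and all helper names are hypothetical, and nothing is sent to a cluster. To view the file, the application would decode the Base64 value from the hit's `_source` and hand the bytes to the user as a download.

```python
import base64


def build_mapping():
    """Index mapping: searchable file name, opaque Base64 payload.

    Field names are assumptions made for this example.
    """
    return {
        "mappings": {
            "properties": {
                "fileName": {"type": "keyword"},
                "fileData": {"type": "binary"},
            }
        }
    }


def to_document(file_name, raw_bytes):
    """Document body holding the whole file, Base64-encoded."""
    return {
        "fileName": file_name,
        "fileData": base64.b64encode(raw_bytes).decode("ascii"),
    }


def to_file_bytes(source):
    """Recover the original bytes from a search hit's _source."""
    return base64.b64decode(source["fileData"])
```

A round trip through `to_document` and `to_file_bytes` returns the original bytes, which the application can then write to disk or stream to the browser.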

Thanks.

Hi @AChopra

Have you looked at Elastic Workplace Search?

Here is a quick video.

Hi @stephenb,

This looks great, but our use case is much more than just accessing and retrieving documents.

We have an Oracle 11g instance with around 4-5 TB of data, some of which is documents. It holds historical data that we want to access.

The initial plan was to leverage the ELK stack: use Logstash to get the data into Elasticsearch. We have a Java application that will then connect to Elasticsearch via the API and retrieve the results.
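
As a rough illustration of the retrieval side, the request body the application would send to the `_search` endpoint could look like the sketch below. It is written in Python for brevity, but the JSON is the same from a Java client. The field names `fileName` and `attachment.content` are assumptions that depend on how the documents were indexed.

```python
import json


def build_search_body(text, size=10):
    """Body for POST <index>/_search: full-text match on extracted content.

    Field names are assumptions; adjust them to the actual mapping.
    """
    return {
        "size": size,
        "query": {"match": {"attachment.content": text}},
        # Return only lightweight fields, not the full extracted text.
        "_source": ["fileName", "attachment.content_type"],
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("invoice", size=5), indent=2))
```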

With the above approach, we had some reservations about handling attachments.
Workplace Search looks great. We could move the documents to SharePoint and use Workplace Search, but then we would have to use Workplace Search for attachments and the Java application, working off results from ELK, for the other historical data.

I jumped the gun ... The video has the answer.
We are looking at taking a Platinum license for ELK anyway, and that will automatically include Workplace Search for free.

I can then manage this under one license.
All I would need to do is ensure that the attachments / documents are placed in one of the out-of-the-box content sources.

Thanks a lot.
The Workplace Search offering seems to be a pretty good one for non-technical end users.

I marked @stephenb's answer as the solution.

I'd just like to add the answer to the initial question:

No, not for every type of document. The ingest attachment plugin does not support all content types, and I'm unsure about Outlook email files. FSCrawler should support those files, though, and it has a work-in-progress branch that will connect local files to Workplace Search.