Using ingest-attachment plugin


(Bernhard Gwosdz) #1

I am new to ES and LS and am trying to set up a pipeline that ingests PDF documents from a directory into ES using the ingest-attachment plugin. Do I have to extract the files' content myself, or will ES do it automatically?


(evert) #2

You should pass the file base64-encoded. Please read these discussions, which will clarify this and give sample code as well:

  1. https://discuss.elastic.co/t/implementing-ingest-attachment-processor-plugin/

  2. https://discuss.elastic.co/t/elasticsearch-5-0-php-ingest-attachment/

Also see the docs, which will help you a lot.
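To make the base64 part concrete, here is a minimal sketch in Python of the two request bodies involved: the pipeline definition with an `attachment` processor, and a document whose field holds the base64-encoded file. The helper names `build_attachment_pipeline` and `build_document` and the field name `data` are my own choices for illustration, not something the plugin requires.

```python
import base64
from pathlib import Path


def build_attachment_pipeline(field: str = "data") -> dict:
    """Body for an ingest pipeline with a single attachment processor.

    `field` is the document field that will carry the base64 payload
    (the name "data" is an assumption; any field name works).
    """
    return {
        "description": "Extract attachment information",
        "processors": [{"attachment": {"field": field}}],
    }


def build_document(path: str, field: str = "data") -> dict:
    """Read a file and return a document body with its content base64-encoded."""
    raw = Path(path).read_bytes()
    # b64encode returns bytes; decode to str so the body is JSON-serializable.
    return {field: base64.b64encode(raw).decode("ascii")}
```

You would send the first body with `PUT _ingest/pipeline/<pipeline-name>` and then index the second body with `?pipeline=<pipeline-name>` on the index request. ES does the text extraction itself, and the extracted text should end up under `attachment.content` in the indexed document.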

Cheers!


(Bernhard Gwosdz) #3

Thank you very much for the links. Now I know how to design the processing.


(David Pilato) #4

You can also look at: https://github.com/dadoonet/fscrawler


#5

I tried to execute it and got:
[INFO] BUILD FAILURE (Unknown lifecycle phase "build")
Is there a bin directory that could help me?


(David Pilato) #6

What are you trying to execute?


#7

I am trying to "execute" your Stack Overflow post:


#8

The command to index a PDF and the command to search within a PDF after indexing.


#9

I tried fscrawler more "seriously" and I succeeded in creating my index.
Thank you.


(evert) #10

Did you manage to fix it, or do you still need help?


(Bernhard Gwosdz) #11

Thank you for your help.

