Implementing Ingest Attachment Processor Plugin

Hi @dadoonet,

Thanks for your time and attention.

You were right: even after a lot of reading I was missing some important points. Here are the steps I have walked through so far:

  1. Install ingest-attachment:

             ./bin/elasticsearch-plugin install ingest-attachment
    
  2. Create my pipeline:

         // PUT /_ingest/pipeline/attachment
         {
           "description" : "Extract attachment information",
           "processors" : [
             {
               "attachment" : {
                 "field" : "data"
               }
             }
           ]
         }
    
  3. Map my index without the content field, which I named data in the previous step (the pipeline):

         // Using the PHP client, I map my index
         $this->params = [
             'index' => $this->index,
             'type'  => $this->type,
             'body'  => [
                 $this->type => [
                     'properties' => [
                         'id' => [
                             'type' => 'integer'
                         ],
                         'name' => [
                             'type' => 'string'
                         ],
                         'description' => [
                             'type' => 'string'
                         ],
                         'type' => [
                             'type' => 'string'
                         ],
                         'author' => [
                             'type' => 'string'
                         ],
                         'editor' => [
                             'type' => 'string'
                         ]
                     ]
                 ]
             ]
         ];
    
  4. Index the metadata (file name, type, author, etc.) into my index, without the PDF file itself.

  5. Then index the file, as shown below:

         // PUT /index/type/my_indexed_id?pipeline=attachment
         {
           "data": "base64_encode('file.pdf')"
         }
    
  6. The file gets indexed, but I still cannot search its content. It seems the data is not being decoded when it reaches the Elasticsearch ingest node.
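
To rule out an encoding problem in step 5, here is a minimal sketch of how the request body could be built from the command line. It assumes the data field must hold the Base64 encoding of the file's bytes (not of the file name), that Elasticsearch listens on localhost:9200, and that index, type and my_indexed_id are the placeholders from step 5. The attachment_body function is a hypothetical helper, not part of any Elasticsearch tooling.

```shell
#!/bin/sh
# Hypothetical helper: print a JSON body whose "data" field holds the
# Base64-encoded bytes of the file given as the first argument.
attachment_body() {
    # Encode the file CONTENTS; tr strips the line wraps that some
    # base64 implementations insert, so the JSON string stays on one line.
    encoded=$(base64 < "$1" | tr -d '\n')
    printf '{"data": "%s"}\n' "$encoded"
}

# Possible usage (assumed host and placeholder index/type/id from step 5):
# attachment_body file.pdf | curl -XPUT \
#   'http://localhost:9200/index/type/my_indexed_id?pipeline=attachment' \
#   -H 'Content-Type: application/json' --data-binary @-
```

In PHP the equivalent would be base64_encode(file_get_contents('file.pdf')), since base64_encode('file.pdf') encodes only the file name string. I am not sure yet whether this is what goes wrong in my case.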

Could you give us some tips on this issue?

I think we are getting there!

Cheers!