Hi @dadoonet,
Thanks for your time and attention.
You were right: even after a lot of reading I was missing some important points. Here are the steps I went through to get it done:
-
Install ingest-attachment:
./bin/elasticsearch-plugin install ingest-attachment
-
Create my pipeline:
// POST to /_ingest/pipeline/attachment
{
  "description": "Extract attachment information",
  "processors": [
    { "attachment": { "field": "data" } }
  ]
}
-
Map my index without the content field, which I called data in the previous step (the pipeline):
// Using the PHP client, I map my index
$this->params = [
    'index' => $this->index,
    'type' => $this->type,
    'body' => [
        $this->type => [
            'properties' => [
                'id' => ['type' => 'integer'],
                'name' => ['type' => 'string'],
                'description' => ['type' => 'string'],
                'type' => ['type' => 'string'],
                'author' => ['type' => 'string'],
                'editor' => ['type' => 'string']
            ]
        ]
    ]
];
-
Index some text into my index, without the PDF file: file name, type, author, etc.
-
Then index the file, as below:
// PUT /index/type/my_indexed_id?pipeline=attachment
{
  "data": "base64_encode('file.pdf')"
}
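For reference, here is a minimal sketch of how the "data" value for the request above might be built. It is written in Python rather than PHP only to keep it short and runnable on its own, and it assumes the intent is to send the file's bytes, base64-encoded, rather than its name. The function name build_attachment_body and the temporary sample file are hypothetical and exist only for illustration.

```python
import base64
import json
import tempfile
from pathlib import Path


def build_attachment_body(path):
    """Return the JSON request body with the file's bytes (not its
    name) base64-encoded into the "data" field."""
    raw = Path(path).read_bytes()
    encoded = base64.b64encode(raw).decode("ascii")
    return json.dumps({"data": encoded})


# Stand-in for file.pdf so the sketch runs without a real PDF (assumption).
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    tmp.write(b"%PDF-1.4 sample bytes")
    sample_path = tmp.name

# This string would be sent as the body of the PUT request.
body = build_attachment_body(sample_path)
print(body)
```

In PHP the equivalent would be base64_encode(file_get_contents('file.pdf')). Calling base64_encode('file.pdf') on its own encodes only the file name, not the document.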
-
The file gets indexed, but I still cannot search its content. It seems it is not being decoded when it reaches the Elasticsearch ingest node.
Could you give us some tips on this issue?
I think we are getting there!
Cheers!