I have upgraded from ES 2.3 to 5. The docs explain how to ingest data into ES, but I could not figure out how to use processors, specifically Ingest Attachment, from the PHP client.
I created my pipeline successfully, as shown below:
    $params = [
        'id' => 'attachment',
        'body' => [
            'description' => 'Extract attachment information',
            'processors' => [
                [
                    'attachment' => [
                        'field' => 'content',
                        'indexed_chars' => -1
                    ]
                ]
            ]
        ]
    ];
    return $client->ingest()->putPipeline($params);
I tried to index my PDF file with:
    $params = [
        'index' => 'index',
        'type' => 'type',
        'id' => 'document_id',
        'body' => [
            'content' => base64_encode(file_get_contents($fullfile))
        ]
    ];
    return $client->index($params);
or with:
    return $client->ingest()->putPipeline($params);
Neither attempt worked.
Using plain JSON (with Postman), the request below works without problems:
    PUT /index/type/my_indexed_id?pipeline=attachment
    {
        "content" : "BuDQowMDAwNDgyMzA0I.....MY_WHOLE_ENCODED_PDF_FILE"
    }
So, how do we tell $client that it must use a specific pipeline, the way the JSON request above does with ?pipeline=attachment?
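What I would expect is something like the sketch below. It assumes the PHP client accepts a top-level 'pipeline' key next to 'index', 'type' and 'id' and forwards it as the ?pipeline= query-string parameter; I have not confirmed this from the docs. buildAttachmentParams is a hypothetical helper name used only for illustration.

```php
<?php
// Hypothetical helper: builds the index request for one file.
// Assumption: the client forwards a top-level 'pipeline' key as the
// ?pipeline= query-string parameter, like 'index', 'type' and 'id'.
function buildAttachmentParams(string $id, string $rawBytes): array
{
    return [
        'index'    => 'index',
        'type'     => 'type',
        'id'       => $id,
        'pipeline' => 'attachment', // the id passed to putPipeline() earlier
        'body'     => [
            // The attachment processor reads base64 data from 'content'.
            'content' => base64_encode($rawBytes),
        ],
    ];
}

// Intended usage with the $client and $fullfile from the snippets above:
// $params = buildAttachmentParams('document_id', file_get_contents($fullfile));
// return $client->index($params);
```

If the key is accepted, this should produce the same request as the Postman call, with the pipeline name in the URL and only the encoded file in the body.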
Thanks!