You can use the ingest attachment plugin.
There is an example here: https://www.elastic.co/guide/en/elasticsearch/plugins/current/using-ingest-attachment.html
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "data"
      }
    }
  ]
}
PUT my_index/_doc/my_id?pipeline=attachment
{
  "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
}
GET my_index/_doc/my_id
The data field is the Base64 representation of your binary file.
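If you need to produce that Base64 string yourself, here is a minimal Python sketch using only the standard library. The function name `build_attachment_body` and the sample bytes are made up for illustration; the field name `data` matches the pipeline definition above. It only builds the request body, and sending it to Elasticsearch is left to whichever HTTP client you use.

```python
import base64
import json


def build_attachment_body(raw: bytes, field: str = "data") -> str:
    """Return a JSON request body with the file content Base64-encoded.

    Hypothetical helper: `field` must match the field configured in the
    attachment processor ("data" in the pipeline above).
    """
    # Base64-encode the raw bytes and turn the result into a plain string
    encoded = base64.b64encode(raw).decode("ascii")
    return json.dumps({field: encoded})


if __name__ == "__main__":
    # Stand-in content; for a real file use open(path, "rb").read()
    sample = b"Lorem ipsum dolor sit amet"
    print(build_attachment_body(sample))
```

The printed JSON can be used as the body of the `PUT my_index/_doc/my_id?pipeline=attachment` request.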
Alternatively, you can use FSCrawler. There's a tutorial to help you get started.