Looks like this is still an issue today, more than a year later.
Can you provide a working sample of how to use ingest-attachment with cbor?
I'm struggling with processing large files.
hey,
You need to find a library for your programming language that creates CBOR, so that the whole document, including all fields, is sent as CBOR, not only the attachment itself.
Does that make sense?
--Alex
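To illustrate what "the whole document as CBOR" means, here is a minimal sketch that hand-rolls just enough of the CBOR wire format (RFC 8949) to encode a flat document with text and binary fields. It then compares the result with the JSON form, where the attachment has to be base64-encoded. The names `cbor_head` and `encode_flat_doc` and the field names are made up for this example; a real client would use a maintained CBOR library rather than this encoder.

```python
import base64
import json
import struct


def cbor_head(major: int, length: int) -> bytes:
    """Build a CBOR item header: major type plus length argument."""
    if length < 24:
        return bytes([(major << 5) | length])
    if length < 2**8:
        return bytes([(major << 5) | 24]) + struct.pack(">B", length)
    if length < 2**16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", length)
    if length < 2**32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", length)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", length)


def encode_flat_doc(doc: dict) -> bytes:
    """Encode a flat dict (str keys, str or bytes values) as one CBOR map."""
    out = bytearray(cbor_head(5, len(doc)))          # major type 5: map
    for key, value in doc.items():
        raw_key = key.encode("utf-8")
        out += cbor_head(3, len(raw_key)) + raw_key  # major type 3: text string
        if isinstance(value, bytes):
            out += cbor_head(2, len(value)) + value  # major type 2: byte string
        elif isinstance(value, str):
            raw = value.encode("utf-8")
            out += cbor_head(3, len(raw)) + raw
        else:
            raise TypeError(f"unsupported value type: {type(value).__name__}")
    return bytes(out)


# 1 MiB of dummy bytes standing in for a real file (assumption for the demo).
payload = bytes(range(256)) * 4096

# The whole document, every field included, becomes a single CBOR map.
as_cbor = encode_flat_doc({"title": "report", "data": payload})

# The JSON equivalent must carry the binary field as a base64 string.
as_json = json.dumps(
    {"title": "report", "data": base64.b64encode(payload).decode("ascii")}
).encode("utf-8")

print(len(payload), len(as_cbor), len(as_json))
```

The CBOR body is only a few dozen bytes larger than the file itself, while the base64 step in the JSON body adds roughly a third on top of the file size.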
Thank you.
For anyone else struggling with this, here's a very basic working Python sample, based on this pipeline configuration:
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "data"
      }
    }
  ]
}
Sample:
import cbor2
import requests

filename = 'some-file'
headers = {'content-type': 'application/cbor'}

# Read the file as raw bytes; CBOR carries them as-is, so no base64 step.
with open(filename, 'rb') as f:
    doc = {
        'data': f.read()
    }

# Send the whole document, not just the attachment, as CBOR.
requests.put(
    'http://localhost:9200/my_index/my_type/my_id?pipeline=attachment&pretty',
    data=cbor2.dumps(doc),
    headers=headers
)
Be aware, however, that not all clients support this. The Python client, for example, doesn't allow custom headers on the index method (https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/client/init.py#L330).
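For large files, reading the whole file and then serializing it keeps two full copies in client memory. A possible way around that, sketched below under some assumptions, is to write the small CBOR header by hand and then stream the file in chunks. The helper names `cbor_head` and `iter_cbor_doc` are hypothetical, the document is assumed to have a single binary field, and I have not verified this against every Elasticsearch version.

```python
import os
import struct


def cbor_head(major: int, length: int) -> bytes:
    """Build a CBOR item header: major type plus length argument."""
    if length < 24:
        return bytes([(major << 5) | length])
    for info, fmt in ((24, ">B"), (25, ">H"), (26, ">I"), (27, ">Q")):
        if length < 1 << (8 * struct.calcsize(fmt)):
            return bytes([(major << 5) | info]) + struct.pack(fmt, length)
    raise ValueError("length too large for CBOR")


def iter_cbor_doc(field: str, path: str, chunk_size: int = 1 << 20):
    """Yield the CBOR encoding of {field: <file bytes>} piece by piece."""
    key = field.encode("utf-8")
    size = os.path.getsize(path)
    # One-entry map, the text-string key, then the byte-string header.
    yield cbor_head(5, 1) + cbor_head(3, len(key)) + key + cbor_head(2, size)
    # The byte-string content is simply the file, read in fixed-size chunks.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

With requests, the generator can be passed directly as the body, e.g. `requests.put(url, data=iter_cbor_doc('data', filename), headers=headers)`, which sends it with chunked transfer encoding. This only reduces memory use on the client: the server still has to accept the full request, and by default rejects bodies larger than `http.max_content_length` (100mb).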