Hi Team,
I am using the Logstash http filter to fetch a PDF from a URL and extract its content.
The http filter downloads the PDF and writes its content to target_field, but the content does not look right and it is not base64 encoded. How can I handle this encoding in Logstash, so that I can use an ingest pipeline in Elasticsearch to index PDFs?
Is there any way to create a base64-encoded string in Logstash? That way I could convert the extracted content from UTF-8 to base64 and then use the attachment processor.
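To make the question concrete, here is a minimal Ruby sketch of the encoding step I have in mind. The helper name `encode_for_attachment` and the sample string are made up for illustration. It assumes the downloaded body still holds the original bytes of the file, which I have not confirmed for the http filter.

```ruby
require "base64"

# Hypothetical helper (not part of Logstash): turn raw file bytes into a
# single-line base64 string of the kind the attachment processor expects.
def encode_for_attachment(raw)
  # Work on a copy tagged as binary so no character-set conversion happens.
  bytes = raw.dup.force_encoding(Encoding::BINARY)
  # strict_encode64 produces base64 without embedded line breaks.
  Base64.strict_encode64(bytes)
end

# Stand-in for a downloaded body; a real PDF would be arbitrary bytes.
sample = "%PDF-1.4 sample bytes"
encoded = encode_for_attachment(sample)
puts encoded

# Round trip: decoding gives back the same bytes.
puts Base64.strict_decode64(encoded) == sample.b
```

Inside a Logstash ruby filter, the same call could presumably be applied to the value read with `event.get` and written back with `event.set`. If the body has already been converted to UTF-8 and bytes were altered on the way, encoding it afterwards would not restore the original file.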
@Badger,
The input source is a database with a url field. I am using the Logstash http filter to fetch the data from those URLs, which point to PDF, Word, and text files.
The files are large, up to 6 MB.
Is it possible to encode the whole content fetched by the http filter as base64 and then use the attachment processor on that encoded data?