How to decode a base64 attachment

Hi,
I would like to use Elasticsearch for indexing text files (PDF, DOC, XML)
as well as for storing them. I followed the example given on the web page
for saving the file as base64 and used the plugin to store/index it in
Elasticsearch. It works :slight_smile: .
When I try to get the contents of a file back in its original format (e.g.
PDF), I get a base64 string that is not properly structured (one long
line). Is there any way to get the original file back (converted to its
original MIME type) from Elasticsearch?

Thanks,
DK

--

That is exactly what we do in the www.scrutmydocs.org project.
It's on GitHub.

That said, you are probably using the mapper attachment plugin, aren't you?

I think there is an issue there (I tried to fix it but did not manage to get my unit tests to pass): the content type is not set automatically, so you have to manage it on your side.
See: https://github.com/scrutmydocs/scrutmydocs/blob/master/src/main/java/org/scrutmydocs/webapp/service/document/DocumentService.java
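
As a rough illustration of "managing it on your side", here is a minimal sketch that picks a content type from the file name before indexing. It is not the code from DocumentService.java: the class `ContentTypes`, the method `forFileName` and the small lookup table are hypothetical names made up for this example, and the table only covers the three formats mentioned in the question.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not part of ScrutMyDocs): choose a MIME type at index time. */
final class ContentTypes {

    // Assumption: only the formats named in the question; extend as needed.
    private static final Map<String, String> BY_EXTENSION = new HashMap<>();
    static {
        BY_EXTENSION.put("pdf", "application/pdf");
        BY_EXTENSION.put("doc", "application/msword");
        BY_EXTENSION.put("xml", "application/xml");
    }

    /** Returns the MIME type for a file name, or a generic binary type if unknown. */
    static String forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "application/octet-stream"; // no extension at all
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(extension, "application/octet-stream");
    }

    public static void main(String[] args) {
        System.out.println(forFileName("Report.PDF"));
        System.out.println(forFileName("notes"));
    }
}
```

The value returned here would be stored next to the document, so that it is still available when the file is sent back later.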

When sending the attachment back to the user, you have to set the content type again.
See https://github.com/scrutmydocs/scrutmydocs/blob/master/src/main/java/org/scrutmydocs/webapp/servlet/DownloadServlet.java
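
A minimal sketch of that download step, assuming the base64 string, the file name and the content type have already been read from the stored document. It is not the DownloadServlet code: `AttachmentDownload`, `Download` and `decode` are hypothetical names, and only `java.util.Base64` from the JDK (Java 8+) is used. Note that base64 on one long line is still valid base64; the MIME decoder used here also accepts wrapped input.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper (not the ScrutMyDocs servlet): turn stored base64 back into a file. */
final class AttachmentDownload {

    /** Decoded bytes plus the two header values a download response needs. */
    static final class Download {
        final byte[] bytes;
        final String contentType;
        final String contentDisposition;

        Download(byte[] bytes, String contentType, String contentDisposition) {
            this.bytes = bytes;
            this.contentType = contentType;
            this.contentDisposition = contentDisposition;
        }
    }

    static Download decode(String base64Content, String fileName, String storedContentType) {
        // The MIME decoder ignores line breaks, so single-line and wrapped base64 both work.
        byte[] bytes = Base64.getMimeDecoder().decode(base64Content);

        // Fall back to a generic binary type when no content type was stored.
        String contentType = (storedContentType == null || storedContentType.isEmpty())
                ? "application/octet-stream"
                : storedContentType;

        // Keep the header value simple: replace anything outside a conservative character set.
        String safeName = fileName.replaceAll("[^A-Za-z0-9._-]", "_");
        String disposition = "attachment; filename=\"" + safeName + "\"";

        return new Download(bytes, contentType, disposition);
    }

    public static void main(String[] args) {
        byte[] original = "%PDF-1.4 demo".getBytes(StandardCharsets.US_ASCII);
        String stored = Base64.getEncoder().encodeToString(original);
        Download d = decode(stored, "report.pdf", "application/pdf");
        System.out.println(d.contentType + " | " + d.contentDisposition + " | " + d.bytes.length + " bytes");
    }
}
```

In a servlet, these values would typically go to `response.setContentType(...)` and `response.setHeader("Content-Disposition", ...)`, with the bytes written to the response output stream.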

HTH

David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

(In reply to dukr's message of 7 Dec 2012, 18:40.)

--

www.scrutmydocs.org looks exactly like my project. :slight_smile:

Yes, I used the mapper attachment plugin for my first test.

Could you please give additional info about the parameters in
https://github.com/scrutmydocs/scrutmydocs/blob/master/src/main/java/org/scrutmydocs/webapp/service/document/DocumentService.java

("_content_type") - is it the MIME type?
("_name") - is it the file name?
("content") - is it the base64-encoded document?

Thanks

dk

(In reply to David Pilato's message of Friday, 7 December 2012, 19:57:41 UTC+1.)

--

Yes to all three questions.
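
To make the three fields concrete, here is a minimal sketch that builds them for one file and reads the original bytes back. The field names `_content_type`, `_name` and `content` are the ones discussed above; everything else (`AttachmentField`, `build`, `originalBytes`) is hypothetical, and the name of the attachment field that would hold this map depends on your own mapping.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: the three values stored for an attachment, and the way back. */
final class AttachmentField {

    /** Builds the value of the attachment field for one file. */
    static Map<String, String> build(String fileName, String contentType, byte[] fileBytes) {
        Map<String, String> file = new LinkedHashMap<>();
        file.put("_content_type", contentType); // MIME type, set by the caller
        file.put("_name", fileName);            // file name
        file.put("content", Base64.getEncoder().encodeToString(fileBytes)); // base64 document
        return file;
    }

    /** Decodes the stored base64 string back into the original file bytes. */
    static byte[] originalBytes(Map<String, String> file) {
        return Base64.getDecoder().decode(file.get("content"));
    }

    public static void main(String[] args) {
        byte[] pdf = "%PDF-1.4 demo".getBytes(StandardCharsets.US_ASCII);
        Map<String, String> file = build("report.pdf", "application/pdf", pdf);
        System.out.println(file.keySet());
        System.out.println(originalBytes(file).length + " bytes restored");
    }
}
```

Writing the restored bytes to disk or to an HTTP response, together with the stored `_content_type`, gives back the file in its original format.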

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

(In reply to dukr's message of 10 Dec 2012, 13:31.)