{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "No handler for type [attachment] declared on field [my_attachment]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "No handler for type [attachment] declared on field [my_attachment]"
  },
  "status": 400
}
I have already installed the "ingest-attachment" plugin.
Hi David. Earlier I created a topic about how to index a PDF file and then search it.
To do this, do I have to use Apache Tika? If so, how can I integrate Tika with Elasticsearch? And do I also have to create a pipeline? Please give an example.
I need a simple example.
Thanks, David.
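As a rough illustration of the pipeline being asked about: the ingest-attachment plugin adds an "attachment" processor that is used inside an ingest pipeline, and Tika is bundled with the plugin, so there is no separate integration step. The sketch below only builds the request body that could be sent from Postman. The pipeline name "attachment" and the source field name "data" are assumptions chosen for the example, not values taken from this thread.

```python
import json

# Assumed pipeline name; any name can be used.
PIPELINE_NAME = "attachment"


def build_pipeline_body(source_field="data"):
    """Return the JSON body for PUT _ingest/pipeline/<name>.

    source_field is the (assumed) document field that will hold
    the Base64-encoded file.
    """
    return {
        "description": "Extract attachment information",
        "processors": [
            {"attachment": {"field": source_field}}
        ],
    }


pipeline_body = build_pipeline_body()

# Method and path to enter in Postman, followed by the raw JSON body.
print("PUT _ingest/pipeline/" + PIPELINE_NAME)
print(json.dumps(pipeline_body, indent=2))
```

The printed JSON can be pasted into Postman as a raw body with Content-Type application/json.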
First, let me explain what I want to do.
My aim is to search and list physically stored PDF or TXT files.
For example, many of my PDF files contain a certain word, let's say "car".
Whenever I search these PDF files for the word "car", I want to get a list of all the PDF files whose content contains "car".
What should I learn or look up, step by step, to achieve this?
So far I have learned how to index any JSON document.
I have learned that the ingest plugin has to be installed for this.
And by installing this plugin we add a new field type, "attachment". Is that right?
Also, to index a PDF or TXT file, we have to encode the file as Base64.
I am testing all the steps with Postman, which as you know is a Chrome extension.
I mention this specifically because I would like you to give examples that can be run in Postman.
I am also using the web page "https://www.giftofspeed.com/base64-encoder/" to encode the PDF or TXT files.
I understand things up to this point. Please guide me on what to do after this.
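The Base64 step described above can also be done locally instead of through a web page. This is a minimal sketch that assumes a pipeline named "attachment" already exists and reads from a field called "data"; the index name "my_index", the type "my_type" and the document id "1" are made up for the example, and the exact document path depends on the Elasticsearch version.

```python
import base64
import json


def build_index_body(file_bytes, source_field="data"):
    """Base64-encode raw file bytes and wrap them in a JSON document."""
    encoded = base64.b64encode(file_bytes).decode("ascii")
    return {source_field: encoded}


# For a real file: file_bytes = open("some.pdf", "rb").read()
# Sample bytes are used here so the sketch runs without a file.
sample_bytes = b"a plain text file that mentions a car"
index_body = build_index_body(sample_bytes)

# Assumed index, type and id; "?pipeline=" selects the ingest pipeline.
print("PUT my_index/my_type/1?pipeline=attachment")
print(json.dumps(index_body))
```

The printed body is what would go into Postman for the index request. The pipeline then decodes the field and stores the extracted text alongside it.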
I have the same questions:
- Do PDF or TXT files have to be indexed before they can be searched, or is that unnecessary?
- Do we have to define a pipeline? What is the purpose of a pipeline in this scenario?
- If we have to define a pipeline, do we first have to define a processor? If yes, why and how?
Hi David,
The documentation contains the following statement:
The ingest attachment plugin lets Elasticsearch extract file attachments in common formats (such as PPT, XLS, and PDF) by using the Apache text extraction library Tika.
I do not understand the part shown in bold. What does "extract file attachments" mean?
If you have PDF documents that contain "foo bar", then this text is extracted by the plugin using Tika. That extracted text is what Elasticsearch then indexes.
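To make that concrete: with the attachment processor's default settings, the extracted text ends up in a field named attachment.content, and that field can be searched like any other text field. The sketch below builds such a search request for the word "car". The index name "my_index" is an assumption for illustration only.

```python
import json


def build_search_body(word, content_field="attachment.content"):
    """Return a match query against the text extracted by the plugin."""
    return {"query": {"match": {content_field: word}}}


search_body = build_search_body("car")

# Assumed index name; POST is used because some HTTP clients
# do not send a body with GET requests.
print("POST my_index/_search")
print(json.dumps(search_body, indent=2))
```

Every document returned by this query corresponds to a file whose extracted content contains the searched word.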
As you know, I am using Postman, which is a Chrome extension. How can I extract file attachments with Tika using Postman?
Is there an applicable example for Postman that you can point me to,
or any web address?