No handler for type [attachment] declared on field [my_attachment]


(Süleyman Yalman) #1

Hi all.

When I run this in Postman:

{
  "person" : {
    "properties" : {
      "my_attachment" : { "type" : "attachment" }
    }
  }
}

I got this response:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "No handler for type [attachment] declared on field [my_attachment]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "No handler for type [attachment] declared on field [my_attachment]"
  },
  "status": 400
}

I have already installed the "ingest-attachment" plugin.

I would appreciate your help.


(David Pilato) #2

The ingest attachment plugin does not work like this. This is how the removed mapper-attachments plugin worked.

Read https://www.elastic.co/guide/en/elasticsearch/plugins/current/ingest-attachment.html


(Süleyman Yalman) #3

Hi David. Earlier I created a topic about how to index a PDF file and then search it.
For this goal, do I have to use Apache Tika? If yes, how can I integrate Tika with Elasticsearch? And do I also have to create a pipeline? Please give an example.
I need a simple example.

Important note: I do not know Maven.

Thanks for your help.


(David Pilato) #4

What don't you understand in the documentation I linked to?


(Süleyman Yalman) #5

How can we get a Base64-encoded attachment?

POST /trying-out-mapper-attachments/person/1
{
  "cv": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
}

I mean, how do you know that the Base64 encoding of the content of the "cv" field is:

"cv": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="

Which tool or program gives us this result?


(David Pilato) #6

It depends on your programming language.

How do you call Elasticsearch?

You can also look at the FSCrawler project, which allows you to upload a binary file directly to its REST API.
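
For anyone calling Elasticsearch from a script rather than from Postman, Base64 encoding is usually available in the language's standard library. Below is a minimal Python sketch (Python is an assumption here, since the thread does not say which language is in use). The helper names `encode_bytes` and `encode_file` are made up for illustration. It reproduces the "cv" value quoted above from the small RTF document that the value decodes to.

```python
import base64
from pathlib import Path


def encode_bytes(raw: bytes) -> str:
    """Return the Base64 text form of raw bytes, ready to place in a JSON string."""
    return base64.b64encode(raw).decode("ascii")


def encode_file(path: str) -> str:
    """Read a file (PDF, TXT, ...) in binary mode and Base64-encode its contents."""
    return encode_bytes(Path(path).read_bytes())


# The small RTF document that the "cv" value from the thread decodes to.
rtf = b"{\\rtf1\\ansi\r\nLorem ipsum dolor sit amet\r\n\\par }"
encoded = encode_bytes(rtf)
print(encoded)
```

For a real file, something like `encode_file("my_document.pdf")` (a hypothetical path) would give the string to paste into the request body.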


(Süleyman Yalman) #7

Thanks David.
Let me first tell you what I want to do.
My aim is to search and list physically stored PDF or TXT files.
For example, many of my PDF files contain a certain word, let's say "car".
Whenever I search these PDF files for the word "car", I want all the PDF files whose content includes "car" to be listed.

What should I learn or look into, step by step, to achieve this?

So far I have learnt how to index any JSON.
I have learnt that the ingest plugin has to be installed for this.
And by installing this plugin we add a new type, which is "attachment". Is that right?
Also, to index any PDF or TXT file, we have to encode the file to Base64.

I am testing all the steps with Postman, which as you know is a Chrome extension.
I mention this specifically because I would like you to give examples that are applicable to Postman.
I am also using the web page "https://www.giftofspeed.com/base64-encoder/" to encode PDF or TXT files.

I understand things up to this point. But please guide me on what to do after this.

I have some questions:

  • To search PDF or TXT files, do these files have to be indexed first, or is that unnecessary?
  • Do we have to define a pipeline? What is the purpose of a pipeline in this scenario?
  • If we have to define a pipeline, do we have to define a processor first? If yes, why and how?

I would appreciate your help.
Thanks


(David Pilato) #8

No. This is wrong. The ingest attachment plugin does not add a new type.
It adds an ingest processor, as described in the documentation.

> Also, to index any PDF or TXT file, we have to encode the file to Base64.

Yes. This is true for the ingest attachment plugin.

> To search PDF or TXT files, do these files have to be indexed first, or is that unnecessary?

Yes, this is mandatory. No index = no search.

> Do we have to define a pipeline? What is the purpose of a pipeline in this scenario?

The ingest attachment processor will read the Base64 content and extract the text from it. This text will then be indexed by Elasticsearch.

> If we have to define a pipeline, do we have to define a processor first?

Yes. A pipeline is just a set of processors. You need to add the processor to your pipeline.

> How?

Again, read the documentation I linked to.
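
As a rough sketch of what "add the processor to your pipeline" can look like, the Python below builds the body of a pipeline definition with a single attachment processor, following the shape shown in the ingest attachment documentation. The pipeline id `attachment` and the source field `data` match an example request that appears in this thread; the function name `attachment_pipeline` is hypothetical. The script only prints the request line and body, which can then be pasted into Postman.

```python
import json


def attachment_pipeline(source_field: str = "data") -> dict:
    """Build a pipeline body whose only processor is the attachment processor.

    source_field is the document field expected to hold the Base64 content.
    """
    return {
        "description": "Extract attachment information",
        "processors": [
            {"attachment": {"field": source_field}},
        ],
    }


# Pipeline id used as an example only; any id can be chosen.
pipeline_id = "attachment"

print(f"PUT _ingest/pipeline/{pipeline_id}")
print(json.dumps(attachment_pipeline(), indent=2))
```

The pipeline is defined once. Documents indexed afterwards with `?pipeline=attachment` pass through the processor before they are stored.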


(Süleyman Yalman) #9

Hi David,
In the documentation there is the following statement:

The ingest attachment plugin lets Elasticsearch extract file attachments in common formats (such as PPT, XLS, and PDF) by using the Apache text extraction library Tika.

I do not understand part of this. What does "extract file attachments" mean?


(David Pilato) #10

If you have a PDF document which contains "foo bar", then this text is extracted by the plugin using Tika, and this is then what is indexed by Elasticsearch.
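
To connect this to the goal of finding every file that contains the word "car": once the extracted text is indexed, an ordinary match query can search it. The sketch below assumes the attachment processor's default target field, so that the extracted text ends up in `attachment.content`. The index name `my_index` and the helper `content_search` are illustrative only. Like the other requests in this thread, the printed body can be pasted into Postman.

```python
import json


def content_search(word: str, content_field: str = "attachment.content") -> dict:
    """Build a search body that matches documents whose extracted text contains word."""
    return {"query": {"match": {content_field: word}}}


# Index name is an assumption; use the index the files were indexed into.
print("GET my_index/_search")
print(json.dumps(content_search("car"), indent=2))
```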


(Süleyman Yalman) #11

I am using Postman, which as you know is a Chrome extension. How can I extract file attachments with Tika using Postman?
Is there an applicable example for Postman that you can point me to,
or any web address?


(David Pilato) #12

I'm not using Postman.

But I guess you should be able to easily call something like:

POST index/_doc?pipeline=thepipelineyoudefined
{
  "file": "BASE64 CONTENT HERE"
}
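
For those not using Postman, the same call could be prepared from a script. This is only a sketch in Python using the standard library: the base URL `http://localhost:9200` is an assumption about where Elasticsearch is listening, `build_index_request` is a made-up helper name, and the index, pipeline, and field names are the placeholders from the request above. The request is built but not sent.

```python
import base64
import json
import urllib.request


def build_index_request(base_url: str, index: str, pipeline: str,
                        field: str, raw: bytes) -> urllib.request.Request:
    """Prepare a POST that indexes Base64-encoded bytes through an ingest pipeline."""
    body = {field: base64.b64encode(raw).decode("ascii")}
    url = f"{base_url}/{index}/_doc?pipeline={pipeline}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# b"foo bar" stands in for the bytes of a real file.
request = build_index_request(
    "http://localhost:9200", "index", "thepipelineyoudefined", "file", b"foo bar"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually send it (requires a running Elasticsearch with the pipeline defined):
# urllib.request.urlopen(request)
```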

(Süleyman Yalman) #13

What does the code fragment below do? I got it from "https://www.elastic.co/guide/en/elasticsearch/plugins/current/using-ingest-attachment.html#using-ingest-attachment".

POST index/_doc?pipeline=thepipelineyoudefined
{
  "file": "BASE64 CONTENT HERE"
}

Is it the same as the one below?

PUT my_index/my_type/my_id?pipeline=attachment
{
  "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
}


(David Pilato) #14

Yes. Similar.

