Mapper attachment plugin fails to index document


#1

Hello,

I am using the mapper attachments plugin to index PDF documents. I followed both the official documentation (https://github.com/elastic/elasticsearch-mapper-attachments) and this example tutorial (http://www.elasticsearch.cn/tutorials/2011/07/18/attachment-type-in-action.html), but although the mapping is created, indexing fails.

My mapping looks like this:

curl -XGET "localhost:9200/help/_mappings?pretty"
{
  "help" : {
    "mappings" : {
      "attachment" : {
        "properties" : {
          "file" : {
            "type" : "attachment",
            "path" : "full",
            "fields" : {
              "file" : {
                "type" : "string",
                "store" : true,
                "term_vector" : "with_positions_offsets"
              },
              "author" : {
                "type" : "string"
              },
              "title" : {
                "type" : "string",
                "store" : true
              },
              "name" : {
                "type" : "string"
              },
              "date" : {
                "type" : "date",
                "format" : "dateOptionalTime"
              },
              "keywords" : {
                "type" : "string"
              },
              "content_type" : {
                "type" : "string",
                "store" : true
              },
              "content_length" : {
                "type" : "integer"
              },
              "language" : {
                "type" : "string"
              }
            }
          }
        }
      }
    }
  }
}

The data is converted to base64 and stored in a file named data64. I post it with this command, which fails:
curl -X POST "localhost:9200/msm_help/attachment/" -d @data64
The error is:
{"error":"MapperParsingException[failed to parse]; nested: NoClassDefFoundError[org/apache/tika/metadata/Metadata]; ","status":400}
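
For context, a request body like the one in data64 could be built as in the sketch below. It assumes the body is a JSON object whose attachment field (named `file`, as in the mapping above) holds the base64-encoded bytes. The helper name `build_attachment_doc` and the sample bytes are hypothetical and only for illustration.

```python
import base64
import json


def build_attachment_doc(raw_bytes, field_name="file"):
    """Return a JSON string with the base64-encoded bytes under field_name.

    Assumption: the attachment field expects base64 text as its value.
    """
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return json.dumps({field_name: encoded})


if __name__ == "__main__":
    # Hypothetical sample bytes; in practice these would be read from the PDF.
    body = build_attachment_doc(b"%PDF-1.4 sample bytes")
    print(body)
```

The printed string could be saved to a file and posted with the curl command above.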

Without altering the default mapping, the document is indexed but does not yield any results. When I follow the instructions, it fails to index. I am not sure where I am going wrong.

Please help!


(David Pilato) #2

Could you share:

  • how you installed the plugin (exact commands please)
  • your BASE64 encoded content

Thanks


#3

Hi David, it seems to work now. I had to keep the inner field name as just "file" and then it worked. I am not sure though.

"file" : {
  "type" : "attachment",
  "path" : "full",
  "fields" : {
    "file" : {
      "type" : "string",
      "store" : true,
      "term_vector" : "with_positions_offsets"
    },

This had to be "file" only, and I don't know why it would not accept other names.


(David Pilato) #4

It has to be the same name as the attachment field name.
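
As a rough illustration of that rule, the sketch below scans the `properties` section of a mapping and reports any attachment field that has no sub-field of the same name under `fields`. The helper name `find_mismatched_attachment_fields` and the sub-field name `content` in the second example are hypothetical. The check only reflects the rule as stated in this thread, not the plugin's actual validation.

```python
def find_mismatched_attachment_fields(properties):
    """Return names of attachment fields lacking a same-named sub-field."""
    mismatched = []
    for name, spec in properties.items():
        if spec.get("type") != "attachment":
            continue  # only attachment fields are subject to the rule
        if name not in spec.get("fields", {}):
            mismatched.append(name)
    return mismatched


if __name__ == "__main__":
    # Mirrors the working mapping from the thread: inner field is "file".
    ok = {"file": {"type": "attachment",
                   "fields": {"file": {"type": "string"}}}}
    # Hypothetical mapping with a differently named inner field.
    bad = {"file": {"type": "attachment",
                    "fields": {"content": {"type": "string"}}}}
    print(find_mismatched_attachment_fields(ok))
    print(find_mismatched_attachment_fields(bad))
```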

