Mapper Attachment in ES

Hi All,

I am new to ES and Mapper-attachment. I am trying to index a pdf into ES.

I have tried some code samples that are available on the web, but each of them gives me the following error:

{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"No handler for type [attachment] declared on field [content]"}],"type":"mapper_parsing_exception","reason":"Failed to parse mapping [document]: No handler for type [attachment] declared on field [content]","caused_by":{"type":"mapper_parsing_exception","reason":"No handler for type [attachment] declared on field [content]"}},"status":400}

Please let me know if I have missed anything for this.

Thanks in advance..

First guesses:

  • You did not install the plugin
  • You did not restart the nodes
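
Both guesses can be checked from outside the node by asking the cluster which plugins it has actually loaded. Below is a minimal sketch in Python. It assumes the node answers on localhost:9200 without authentication and that `GET /_cat/plugins` returns one whitespace-separated row per node and plugin. The helper names `rows_with_plugin` and `fetch_cat_plugins` are made up for illustration.

```python
from urllib.request import urlopen

PLUGIN = "mapper-attachments"  # plugin name as it appears in the plugin list


def rows_with_plugin(cat_plugins_text, plugin=PLUGIN):
    """Return the rows of GET /_cat/plugins output that mention the plugin.

    A row matches when the plugin name appears as a whole
    whitespace-separated token, so node names containing spaces are fine.
    """
    return [
        row.strip()
        for row in cat_plugins_text.splitlines()
        if plugin in row.split()
    ]


def fetch_cat_plugins(base_url="http://localhost:9200"):
    # Assumption: the node is reachable at base_url without authentication.
    with urlopen(base_url + "/_cat/plugins") as resp:
        return resp.read().decode("utf-8")
```

Calling `rows_with_plugin(fetch_cat_plugins())` on the server should return one row per node that loaded the plugin. An empty list would suggest the plugin was not installed on that node, or that the node was not restarted after the install.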

Hi David,

I have installed the plugin and restarted Elasticsearch.
Any other guesses, please?

Can you share your logs then?

This below error is in the logs:

[2016-06-21 06:49:07,368][DEBUG][action.admin.indices.create] [edlabs_test] [test_attachments] failed to create
MapperParsingException[Failed to parse mapping [document]: No handler for type [attachment] declared on field [content]]; nested: MapperParsingException[No handler for type [attachment] declared on field [content]];
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:381)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:388)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: MapperParsingException[No handler for type [attachment] declared on field [content]]
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:308)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:223)
at org.elasticsearch.index.mapper.object.RootObjectMapper$TypeParser.parse(RootObjectMapper.java:139)
at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:140)
at org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:121)
at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:391)
at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:265)
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:378)

Is this what you needed?

I meant full logs from the restart.

Please format your code using the </> icon.

I regret that I am unable to do that: I have to request the logs from the server team, and they refused to give me the whole file.

Any other workaround for this?

Yes. Install the plugin and restart the nodes.

[2016-06-21 08:53:00,596][INFO ][node                     ] [edlabs_test] started
[2016-06-21 08:53:00,675][INFO ][gateway                  ] [edlabs_test] recovered [5] indices into cluster_state

[2016-06-21 08:57:54,367][INFO ][rest.suppressed          ] /test_attachments Params: {index=test_attachments}
 ElasticsearchParseException[failed to parse source for create index]; nested: JsonParseException[Unexpected end-of-input: expected close marker for OBJECT (from [Source: [B@4ede4b9d; line: 1, column: 0]) at [Source: [B@4ede4b9d; line: 21, column: 594]];
 at org.elasticsearch.action.admin.indices.create.CreateIndexRequest.source(CreateIndexRequest.java:370)
 at org.elasticsearch.rest.action.admin.indices.create.RestCreateIndexAction.handleRequest(RestCreateIndexAction.java:47)
 at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:54)
 at org.elasticsearch.rest.RestController.executeHandler(RestController.java:207)
 at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:166)
 at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:128)
 at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:86)
 at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:348)
 at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:63)
 at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
 at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at org.elasticsearch.http.netty.pipelining.HttpPipeliningHandler.messageReceived(HttpPipeliningHandler.java:60)
 at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88)
 at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
 at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
 at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at org.jboss.netty.handler.codec.http.HttpContentDecoder.messageReceived(HttpContentDecoder.java:108)
 at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
 at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
 at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
 at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
 at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
 at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
 at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
 at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:75)
 at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
 at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
 at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
 at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
 at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
 at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
 at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
 at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
 at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
 at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
 at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: com.fasterxml.jackson.core.JsonParseException: Unexpected end-of-input: expected close marker for OBJECT (from [Source: [B@4ede4b9d; line: 1, column: 0])
 at [Source: [B@4ede4b9d; line: 21, column: 594]
 at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1581)
 at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:533)
 at com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:470)
 at com.fasterxml.jackson.core.base.ParserBase._handleEOF(ParserBase.java:501)
 at com.fasterxml.jackson.core.base.ParserBase._eofAsNextChar(ParserBase.java:509)
 at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._skipWSOrEnd(UTF8StreamJsonParser.java:2854)
 at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:692)
 at org.elasticsearch.common.xcontent.json.JsonXContentParser.nextToken(JsonXContentParser.java:53)
 at org.elasticsearch.common.xcontent.support.AbstractXContentParser.readMap(AbstractXContentParser.java:269)
 at org.elasticsearch.common.xcontent.support.AbstractXContentParser.readMap(AbstractXContentParser.java:245)
 at org.elasticsearch.common.xcontent.support.AbstractXContentParser.map(AbstractXContentParser.java:208)
 at org.elasticsearch.action.admin.indices.create.CreateIndexRequest.source(CreateIndexRequest.java:368)
 ... 45 more

Due to the word limit, I have posted the logs in two parts.
I am still getting the same error.

2 things:

  • it's not all the logs. I'm missing the beginning.
  • the error here is not the same. You have an error in your index creation

So please give all the logs. You can use gist.github.com to share them. And paste the full request you made which generates this error (if needed).
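
The JsonParseException in that log ("Unexpected end-of-input: expected close marker for OBJECT") typically means the body that reached Elasticsearch ended before the outermost `{` was closed, for example because a quote or brace was lost when pasting the command. One way to catch that before sending is to parse the body locally. Here is a small Python sketch, where `body_problem` is a hypothetical helper name:

```python
import json


def body_problem(body):
    """Return None if body parses as JSON, else a short description.

    This only checks JSON syntax locally. It says nothing about whether
    Elasticsearch will accept the mapping itself.
    """
    try:
        json.loads(body)
    except json.JSONDecodeError as err:
        return "invalid JSON at line %d, column %d: %s" % (
            err.lineno,
            err.colno,
            err.msg,
        )
    return None
```

If this returns a message for the exact text passed to `-d`, the request is broken before the attachment mapping is even looked at.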

ubuntu:~$ curl -XPOST 'http://localhost:9200/test1/1' -d @LiveData.json
Warning: Couldn't read data from file "LiveData.json", this makes an empty
Warning: POST.
{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"failed to parse, document is empty"}],"type":"mapper_parsing_exception","reason":"failed to parse, document is empty"},"status":400}
[2016-06-22 06:39:51,414][DEBUG][action.index ] [edlabs_test] [test1][1], node[-LFvAqO6QxGBNn_PTCntYA], [P], v[2], s[STARTED], a[id=LP_SMOnjSgmQDBC0ffb6Gw]: Failed to execute [index {[test1][1][AVV21hx1xsC1M08YNuBH], source[_na_]}]
MapperParsingException[failed to parse, document is empty]
at org.elasticsearch.index.mapper.DocumentParser.innerParseDocument(DocumentParser.java:156)
at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:79)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:304)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:517)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:508)
at org.elasticsearch.action.support.replication.TransportReplicationAction.prepareIndexOperationOnPrimary(TransportReplicationAction.java:1053)
at org.elasticsearch.action.support.replication.TransportReplicationAction.executeIndexRequestOnPrimary(TransportReplicationAction.java:1061)
at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:170)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.performOnPrimary(TransportReplicationAction.java:579)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase$1.doRun(TransportReplicationAction.java:452)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2016-06-22 06:39:51,415][INFO ][rest.suppressed ] /test1/1 Params: {index=test1, type=1}
MapperParsingException[failed to parse, document is empty]
at org.elasticsearch.index.mapper.DocumentParser.innerParseDocument(DocumentParser.java:156)
at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:79)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:304)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:517)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:508)
at org.elasticsearch.action.support.replication.TransportReplicationAction.prepareIndexOperationOnPrimary(TransportReplicationAction.java:1053)
at org.elasticsearch.action.support.replication.TransportReplicationAction.executeIndexRequestOnPrimary(TransportReplicationAction.java:1061)
at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:170)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.performOnPrimary(TransportReplicationAction.java:579)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase$1.doRun(TransportReplicationAction.java:452)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

LiveData.json: [{"ID":"1","Frequency":"20"}]

The command above is meant to load a JSON file into an ES index.
I hope these are the logs you wanted, as this is all I got.

I will post the logs and details of the PDF indexing attempt too.

Thanks!!

Next I executed these:
curl -X DELETE http://localhost:9200/test_attachments
{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"test_attachments","index":"test_attachments"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"test_attachments","index":"test_attachments"},"status":404}
ubuntu@ip-172-31-37-194:~$ curl -X POST http://localhost:9200/test_attachments -d '{
  "mappings" : {
    "document" : {
      "properties" : {
        "content" : {
          "type" : "attachment",
          "fields" : {
            "content" : { "store" : "yes" },
            "author" : { "store" : "yes" },
            "title" : { "store" : "yes", "analyzer" : "english"},
            "date" : { "store" : "yes" },
            "keywords" : { "store" : "yes", "analyzer" : "keyword" },
            "_name" : { "store" : "yes" },
            "_content_type" : { "store" : "yes" }
          }
        }
      }
    }
  }
}'
{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]"}],"type":"mapper_parsing_exception","reason":"Failed to parse mapping [document]: Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]","caused_by":{"type":"mapper_parsing_exception","reason":"Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]"}},"status":400}

[2016-06-22 06:51:36,780][INFO ][rest.suppressed ] /test_attachments Params: {index=test_attachments}
[test_attachments] IndexNotFoundException[no such index]
[2016-06-22 06:51:58,835][DEBUG][action.admin.indices.create] [edlabs_test] [test_attachments] failed to create
MapperParsingException[Failed to parse mapping [document]: Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]]; nested: MapperParsingException[Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]];
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:381)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:388)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: MapperParsingException[Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]]
at org.elasticsearch.index.mapper.DocumentMapperParser.checkNoRemainingFields(DocumentMapperParser.java:192)
at org.elasticsearch.index.mapper.DocumentMapperParser.checkNoRemainingFields(DocumentMapperParser.java:186)
at org.elasticsearch.index.mapper.attachment.AttachmentMapper$TypeParser.parse(AttachmentMapper.java:373)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:310)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:223)
at org.elasticsearch.index.mapper.object.RootObjectMapper$TypeParser.parse(RootObjectMapper.java:139)
at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:140)
at org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:121)
at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:391)
at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:265)
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:378)
... 6 more
[2016-06-22 06:51:58,836][INFO ][rest.suppressed ] /test_attachments Params: {index=test_attachments}
MapperParsingException[Failed to parse mapping [document]: Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]]; nested: MapperParsingException[Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]];
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:381)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:388)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: MapperParsingException[Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]]
at org.elasticsearch.index.mapper.DocumentMapperParser.checkNoRemainingFields(DocumentMapperParser.java:192)
at org.elasticsearch.index.mapper.DocumentMapperParser.checkNoRemainingFields(DocumentMapperParser.java:186)
at org.elasticsearch.index.mapper.attachment.AttachmentMapper$TypeParser.parse(AttachmentMapper.java:373)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:310)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:223)
at org.elasticsearch.index.mapper.object.RootObjectMapper$TypeParser.parse(RootObjectMapper.java:139)
at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:140)
at org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:121)
at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:391)
at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:265)
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:378)
... 6 more

Please use better formatting. Select the full lines before hitting the </> button.

Warning: Couldn't read data from file "LiveData.json", this makes an empty

That one is obvious: you have a problem with your curl command. You need to check that.
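
curl prints that warning when it cannot open the file named after `@`, typically because the file is not in the directory the command is run from or is not readable by that user. Here is a sketch of a pre-flight check in Python, where `payload_problem` is a hypothetical helper. It also flags a top-level JSON array: as far as I know the index API expects each document to be a JSON object, and the LiveData.json content posted earlier is an array.

```python
import json
import os


def payload_problem(path):
    """Return None if path looks like a usable document body, else a reason."""
    if not os.path.isfile(path):
        return "file not found: " + path
    if not os.access(path, os.R_OK):
        return "file not readable: " + path
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    if not text.strip():
        return "file is empty"
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as err:
        return "invalid JSON: " + err.msg
    if not isinstance(doc, dict):
        # Assumption: a single-document index request must be a JSON object.
        return "top-level value is %s, expected a JSON object" % type(doc).__name__
    return None
```

Run it from the same directory and as the same user as the curl command, since a relative path such as `LiveData.json` is resolved against the current working directory.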

Thanks, I will look into it.

Any pointers for the test_attachments index?
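
Two things can be read off the test_attachments response posted earlier. The stack trace now passes through `AttachmentMapper$TypeParser`, so the `attachment` type is being recognised, and the complaint is only about the `_name` and `_content_type` entries under `fields`. As far as I can tell, the plugin calls those sub-fields `name` and `content_type` in the mapping, so renaming them may be enough. Treat that as an assumption to verify against the plugin documentation for your version. Here is a sketch in Python that rebuilds the same request body with the two keys renamed (`rename_attachment_fields` is a made-up helper):

```python
import json

# Assumption: mapping sub-field names accepted by the attachment type.
RENAMES = {"_name": "name", "_content_type": "content_type"}


def rename_attachment_fields(fields):
    """Return a copy of an attachment 'fields' mapping with keys renamed."""
    return {RENAMES.get(key, key): value for key, value in fields.items()}


# The 'fields' block exactly as posted in the failing request.
original_fields = {
    "content": {"store": "yes"},
    "author": {"store": "yes"},
    "title": {"store": "yes", "analyzer": "english"},
    "date": {"store": "yes"},
    "keywords": {"store": "yes", "analyzer": "keyword"},
    "_name": {"store": "yes"},
    "_content_type": {"store": "yes"},
}

body = {
    "mappings": {
        "document": {
            "properties": {
                "content": {
                    "type": "attachment",
                    "fields": rename_attachment_fields(original_fields),
                }
            }
        }
    }
}

# Body to send when creating the test_attachments index.
print(json.dumps(body, indent=2))
```

The printed JSON can be saved to a file and passed to curl with `-d @file`, which also avoids quoting problems on the command line.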

Hi David,

Just out of curiosity, we are working on:
Ubuntu version: 14.04
Elasticsearch: 2.1.2
Mapper-Attachment: 3.1.2
Java version: 1.8.0_91

Is there any particular version compatibility requirement for Mapper-Attachment?

There is. But this version is fine.

Hi Tamanna,

Just to let you know about another alternative: you could use Apache Tika to read the PDF, extract its text, and index it yourself. That would be around 10 lines of code. As I understand it, this ES plugin actually uses Tika internally.

Thanks,
Cody