I am new to ES and the mapper-attachments plugin. I am trying to index a PDF into ES.
I have tried some code samples available on the web, but each time I get the following error:
{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"No handler for type [attachment] declared on field [content]"}],"type":"mapper_parsing_exception","reason":"Failed to parse mapping [document]: No handler for type [attachment] declared on field [content]","caused_by":{"type":"mapper_parsing_exception","reason":"No handler for type [attachment] declared on field [content]"}},"status":400}
Please let me know if I have missed anything.
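For reference, "No handler for type [attachment]" usually suggests that the node has not loaded the mapper-attachments plugin (not installed, or the node was not restarted after installing it). Once the plugin is loaded, the request bodies look roughly like the sketch below. The type name `document` and the field name `content` are taken from the error message above; the helper names are only illustrative, and the file content is assumed to be sent as a base64 string.

```python
import base64
import json


def attachment_mapping(doc_type="document", field="content"):
    """Body for index creation that declares `field` as an attachment."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    field: {"type": "attachment"}
                }
            }
        }
    }


def attachment_document(file_bytes, field="content"):
    """Body for indexing one file; the bytes are sent base64-encoded."""
    return {field: base64.b64encode(file_bytes).decode("ascii")}


# Serialized form, ready to be passed to curl with -d or to any HTTP client.
mapping_json = json.dumps(attachment_mapping())
```

The mapping body would be sent when creating the index (here `test_attachments`), and the document body when indexing each PDF.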
[2016-06-21 06:49:07,368][DEBUG][action.admin.indices.create] [edlabs_test] [test_attachments] failed to create
MapperParsingException[Failed to parse mapping [document]: No handler for type [attachment] declared on field [content]]; nested: MapperParsingException[No handler for type [attachment] declared on field [content]];
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:381)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:388)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: MapperParsingException[No handler for type [attachment] declared on field [content]]
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:308)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:223)
at org.elasticsearch.index.mapper.object.RootObjectMapper$TypeParser.parse(RootObjectMapper.java:139)
at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:140)
at org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:121)
at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:391)
at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:265)
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:378)
[2016-06-21 08:53:00,596][INFO ][node ] [edlabs_test] started
[2016-06-21 08:53:00,675][INFO ][gateway ] [edlabs_test] recovered [5] indices into cluster_state
[2016-06-21 08:57:54,367][INFO ][rest.suppressed ] /test_attachments Params: {index=test_attachments}
ElasticsearchParseException[failed to parse source for create index]; nested: JsonParseException[Unexpected end-of-input: expected close marker for OBJECT (from [Source: [B@4ede4b9d; line: 1, column: 0]) at [Source: [B@4ede4b9d; line: 21, column: 594]];
at org.elasticsearch.action.admin.indices.create.CreateIndexRequest.source(CreateIndexRequest.java:370)
at org.elasticsearch.rest.action.admin.indices.create.RestCreateIndexAction.handleRequest(RestCreateIndexAction.java:47)
at org.elasticsearch.rest.BaseRestHandler.handleRequest(BaseRestHandler.java:54)
at org.elasticsearch.rest.RestController.executeHandler(RestController.java:207)
at org.elasticsearch.rest.RestController.dispatchRequest(RestController.java:166)
at org.elasticsearch.http.HttpServer.internalDispatchRequest(HttpServer.java:128)
at org.elasticsearch.http.HttpServer$Dispatcher.dispatchRequest(HttpServer.java:86)
at org.elasticsearch.http.netty.NettyHttpServerTransport.dispatchRequest(NettyHttpServerTransport.java:348)
at org.elasticsearch.http.netty.HttpRequestHandler.messageReceived(HttpRequestHandler.java:63)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.http.netty.pipelining.HttpPipeliningHandler.messageReceived(HttpPipeliningHandler.java:60)
at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:145)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.handler.codec.http.HttpContentDecoder.messageReceived(HttpContentDecoder.java:108)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:75)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.fasterxml.jackson.core.JsonParseException: Unexpected end-of-input: expected close marker for OBJECT (from [Source: [B@4ede4b9d; line: 1, column: 0])
at [Source: [B@4ede4b9d; line: 21, column: 594]
at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1581)
at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:533)
at com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:470)
at com.fasterxml.jackson.core.base.ParserBase._handleEOF(ParserBase.java:501)
at com.fasterxml.jackson.core.base.ParserBase._eofAsNextChar(ParserBase.java:509)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._skipWSOrEnd(UTF8StreamJsonParser.java:2854)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:692)
at org.elasticsearch.common.xcontent.json.JsonXContentParser.nextToken(JsonXContentParser.java:53)
at org.elasticsearch.common.xcontent.support.AbstractXContentParser.readMap(AbstractXContentParser.java:269)
at org.elasticsearch.common.xcontent.support.AbstractXContentParser.readMap(AbstractXContentParser.java:245)
at org.elasticsearch.common.xcontent.support.AbstractXContentParser.map(AbstractXContentParser.java:208)
at org.elasticsearch.action.admin.indices.create.CreateIndexRequest.source(CreateIndexRequest.java:368)
... 45 more
Due to the word limit, I have posted the logs in two parts.
I am still getting the same error.
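The second trace above ("Unexpected end-of-input: expected close marker for OBJECT") indicates that the create-index request body was incomplete JSON, for example a missing closing brace or a body cut short by shell quoting. A minimal sketch for checking a body locally before sending it; `check_json_body` is a hypothetical helper name, not part of any ES client.

```python
import json


def check_json_body(body):
    """Return None if `body` is valid JSON, else a short error location."""
    try:
        json.loads(body)
    except json.JSONDecodeError as exc:
        return "line %d, column %d: %s" % (exc.lineno, exc.colno, exc.msg)
    return None


# A body that is cut off before its closing braces is reported, not sent.
problem = check_json_body('{"mappings": {"document": {')
if problem is not None:
    print("request body is not valid JSON:", problem)
```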
ubuntu:~$ curl -XPOST 'http://localhost:9200/test1/1' -d @LiveData.json
Warning: Couldn't read data from file "LiveData.json", this makes an empty
Warning: POST.
{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"failed to parse, document is empty"}],"type":"mapper_parsing_exception","reason":"failed to parse, document is empty"},"status":400}
[2016-06-22 06:39:51,414][DEBUG][action.index ] [edlabs_test] [test1][1], node[-LFvAqO6QxGBNn_PTCntYA], [P], v[2], s[STARTED], a[id=LP_SMOnjSgmQDBC0ffb6Gw]: Failed to execute [index {[test1][1][AVV21hx1xsC1M08YNuBH], source[_na_]}]
MapperParsingException[failed to parse, document is empty]
at org.elasticsearch.index.mapper.DocumentParser.innerParseDocument(DocumentParser.java:156)
at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:79)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:304)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:517)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:508)
at org.elasticsearch.action.support.replication.TransportReplicationAction.prepareIndexOperationOnPrimary(TransportReplicationAction.java:1053)
at org.elasticsearch.action.support.replication.TransportReplicationAction.executeIndexRequestOnPrimary(TransportReplicationAction.java:1061)
at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:170)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.performOnPrimary(TransportReplicationAction.java:579)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase$1.doRun(TransportReplicationAction.java:452)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
[2016-06-22 06:39:51,415][INFO ][rest.suppressed ] /test1/1 Params: {index=test1, type=1}
MapperParsingException[failed to parse, document is empty]
at org.elasticsearch.index.mapper.DocumentParser.innerParseDocument(DocumentParser.java:156)
at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:79)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:304)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:517)
at org.elasticsearch.index.shard.IndexShard.prepareCreate(IndexShard.java:508)
at org.elasticsearch.action.support.replication.TransportReplicationAction.prepareIndexOperationOnPrimary(TransportReplicationAction.java:1053)
at org.elasticsearch.action.support.replication.TransportReplicationAction.executeIndexRequestOnPrimary(TransportReplicationAction.java:1061)
at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:170)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.performOnPrimary(TransportReplicationAction.java:579)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase$1.doRun(TransportReplicationAction.java:452)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
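The curl warning above says that `LiveData.json` could not be read, so an empty POST was sent, which is why ES answers "document is empty". This usually means the file is not in the directory the command was run from, or is not readable. A small sketch that fails early instead of sending an empty body; `read_request_body` is a hypothetical helper name.

```python
import os


def read_request_body(path):
    """Read a request body from disk, refusing missing or empty files."""
    if not os.path.isfile(path):
        raise FileNotFoundError(
            "%s not found (current directory: %s)" % (path, os.getcwd())
        )
    with open(path, "rb") as handle:
        data = handle.read()
    if not data.strip():
        raise ValueError("%s is empty, nothing to index" % path)
    return data
```

Calling `read_request_body("LiveData.json")` before the POST would surface the missing file as an error on the client side rather than as a mapper_parsing_exception from ES.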
Next, I executed these:
curl -X DELETE http://localhost:9200/test_attachments
{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"test_attachments","index":"test_attachments"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"test_attachments","index":"test_attachments"},"status":404}
ubuntu@ip-172-31-37-194:~$ curl -X POST http://localhost:9200/test_attachments -d '{
[2016-06-22 06:51:36,780][INFO ][rest.suppressed ] /test_attachments Params: {index=test_attachments}
[test_attachments] IndexNotFoundException[no such index]
[2016-06-22 06:51:58,835][DEBUG][action.admin.indices.create] [edlabs_test] [test_attachments] failed to create
MapperParsingException[Failed to parse mapping [document]: Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]]; nested: MapperParsingException[Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]];
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:381)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:388)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: MapperParsingException[Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]]
at org.elasticsearch.index.mapper.DocumentMapperParser.checkNoRemainingFields(DocumentMapperParser.java:192)
at org.elasticsearch.index.mapper.DocumentMapperParser.checkNoRemainingFields(DocumentMapperParser.java:186)
at org.elasticsearch.index.mapper.attachment.AttachmentMapper$TypeParser.parse(AttachmentMapper.java:373)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:310)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:223)
at org.elasticsearch.index.mapper.object.RootObjectMapper$TypeParser.parse(RootObjectMapper.java:139)
at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:140)
at org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:121)
at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:391)
at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:265)
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:378)
... 6 more
[2016-06-22 06:51:58,836][INFO ][rest.suppressed ] /test_attachments Params: {index=test_attachments}
MapperParsingException[Failed to parse mapping [document]: Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]]; nested: MapperParsingException[Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]];
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:381)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:388)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: MapperParsingException[Mapping definition for [fields] has unsupported parameters: [_name : {}] [_content_type : {}]]
at org.elasticsearch.index.mapper.DocumentMapperParser.checkNoRemainingFields(DocumentMapperParser.java:192)
at org.elasticsearch.index.mapper.DocumentMapperParser.checkNoRemainingFields(DocumentMapperParser.java:186)
at org.elasticsearch.index.mapper.attachment.AttachmentMapper$TypeParser.parse(AttachmentMapper.java:373)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseProperties(ObjectMapper.java:310)
at org.elasticsearch.index.mapper.object.ObjectMapper$TypeParser.parseObjectOrDocumentTypeProperties(ObjectMapper.java:223)
at org.elasticsearch.index.mapper.object.RootObjectMapper$TypeParser.parse(RootObjectMapper.java:139)
at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:140)
at org.elasticsearch.index.mapper.DocumentMapperParser.parseCompressed(DocumentMapperParser.java:121)
at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:391)
at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:265)
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:378)
... 6 more
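This last trace goes through `AttachmentMapper`, so the plugin now appears to be loaded; the complaint is about the `_name` and `_content_type` entries under `fields` in the mapping. As far as I recall from the plugin documentation, the sub-fields accepted in the mapping are written without the leading underscore, while the underscore forms are per-document metadata. Treat the list below as an assumption to verify against the documentation for your plugin version; `unsupported_fields` is a hypothetical helper.

```python
# Sub-field names believed to be accepted under "fields" in an attachment
# mapping (assumption; check the mapper-attachments docs for your version).
MAPPING_FIELDS = {
    "content", "title", "name", "date", "author",
    "keywords", "content_type", "content_length", "language",
}


def unsupported_fields(fields):
    """Return the sorted keys of `fields` that are not in MAPPING_FIELDS."""
    return sorted(name for name in fields if name not in MAPPING_FIELDS)


# The keys named in the error message are flagged; "content" is not.
flagged = unsupported_fields({"content": {}, "_name": {}, "_content_type": {}})
print(flagged)
```

If that assumption holds, renaming `_name` to `name` and `_content_type` to `content_type` in the mapping should get past this error.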
Just to let you know, there is another alternative here. You could use Apache Tika to read the PDF, extract its text, and index that text yourself. That would be around 10 lines of code. As I understand it, this ES plugin actually uses Tika internally.
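A rough sketch of that approach, in Python for brevity. The extraction step is passed in as a function because it depends on how you call Tika (Java API, Tika server, or a wrapper library); `extract_text` and the field names `file` and `content` are placeholders, not anything prescribed by ES or Tika.

```python
import json


def build_text_document(path, extract_text):
    """Build a plain ES document from text extracted out of the file at `path`.

    `extract_text` is any callable taking a path and returning a string,
    for example a thin wrapper around a Tika call.
    """
    return {"file": path, "content": extract_text(path)}


def fake_extract(path):
    """Stand-in extractor so the sketch runs without Tika installed."""
    return "text extracted from " + path


# The resulting JSON is an ordinary document; no attachment mapping is needed.
body = json.dumps(build_text_document("report.pdf", fake_extract))
```

With this design the index only needs a normal string field for `content`, so the mapper-attachments plugin is not involved at all.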