Trouble with tutorial: Attachment Type in Action


(Meltemi) #1

Trying to successfully index a PDF with elasticsearch and failing.

Followed the instructions in the tutorial "Attachment Type in Action" (http://www.elasticsearch.org/tutorials/2011/07/18/attachment-type-in-action.html):

Installed the Attachment Type plugin and got response: "Installed
mapper-attachments".

All's peachy until the shell script reaches

curl -X POST "localhost:9200/test/attachment/" -d @json.file

at which point the shell hangs.

Also tried using the gist https://gist.github.com/1075067 and it hangs on
the exact same command.

"json.file" was successfully created (both times). it's Base64 gibberish.
So I'm not sure whether it's valid?!? Assuming it is, I tried to POST
directly from the command line:

$ curl -X POST "localhost:9200/test/attachment/" -d json.file
{"error":"ElasticSearchParseException[Failed to derive xcontent from (offset=0, length=9): [106, 115, 111, 110, 46, 102, 105, 108, 101]]","status":400}

log shows:

[2012-06-07 12:32:16,742][DEBUG][action.index ] [Bailey, Paul] [test][0], node[AHLHFKBWSsuPnTIRVhNcuw], [P], s[STARTED]: Failed to execute [index {[test][attachment][DauMB-vtTIaYGyKD4P8Y_w], source[json.file]}]
org.elasticsearch.ElasticSearchParseException: Failed to derive xcontent from (offset=0, length=9): [106, 115, 111, 110, 46, 102, 105, 108, 101]
    at org.elasticsearch.common.xcontent.XContentFactory.xContent(XContentFactory.java:147)
    at org.elasticsearch.common.xcontent.XContentHelper.createParser(XContentHelper.java:50)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:451)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:437)
    at org.elasticsearch.index.shard.service.InternalIndexShard.prepareCreate(InternalIndexShard.java:290)
    at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:210)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:532)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:430)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)

which is beyond my pay-grade. Can anyone help me figure out how/where I
failed?

preesh!


(James Cook-3) #2

I'm not sure how to correct the problem you are experiencing, but:

$ curl -X POST "localhost:9200/test/attachment/" -d json.file
{"error":"ElasticSearchParseException[Failed to derive xcontent from (offset=0, length=9): [106, 115, 111, 110, 46, 102, 105, 108, 101]]","status":400}

This is just telling you that ES received the literal string 'json.file'
as the request body and does not know how to parse it. The bytes listed in
the error are the ASCII codes for that file name; the contents of
'json.file' were not transmitted.
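The byte list in the error can be decoded to confirm this. A minimal sketch in POSIX shell (the variable names are illustrative, not from the thread):

```shell
# Decimal byte values copied from the error message.
bytes="106 115 111 110 46 102 105 108 101"

decoded=""
for b in $bytes; do
  # Convert each decimal value to a 3-digit octal escape, then let
  # printf turn that escape into the corresponding character.
  octal=$(printf '%03o' "$b")
  decoded="${decoded}$(printf "\\${octal}")"
done

echo "$decoded"   # prints: json.file
```

With -d @json.file curl reads the request body from the file; without the @ it sends the argument text itself, which is what happened here.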

-- jim


(Meltemi) #3

Well, the json.file was encoded (as per the tutorial) with:

coded=`cat fn6742.pdf | perl -MMIME::Base64 -ne 'print encode_base64($_)'`
json="{\"file\":\"${coded}\"}"
echo "$json" > json.file

and there were no errors until the next line (where it hangs):

curl -X POST "${host}/test/attachment/" -d @json.file

The tutorial wants the file to be converted into Base64. Is this always the
case? If so, is there another way to do so?

I'd really like to get PDFs (even just one) indexed with elasticsearch so I
can start playing with it.

Thanks for your help!

