How to create an index for an attachment of a doc in CouchDB with ES?


(Goog Jobs) #1

The method [http://www.elasticsearch.org/tutorials/2011/07/18/attachment-type-in-action.html] is OK, but not simple.
Is there a simpler way? Thanks!


(Clinton Gormley) #2

On Wed, 2011-11-30 at 05:52 -0800, cheng wrote:

THE method [http://www.elasticsearch.org/tutorials/2011/07/18/attachment-type-in-action.html] is ok,but not simple.
is there a simple way. Thanks!

Cheng, if you want a useful response, you're really going to have to ask
specific questions which explain

  • what you've tried,
  • the problem you've had,
  • and what you want to know.

Currently, my only useful answer to your questions would be: hire
somebody to do this for you.

Have a look at http://www.elasticsearch.org/help for more information
about how to ask questions which are likely to get responses.

clint


(Goog Jobs) #3

Thanks. I have tried the "Attachment Type in Action" tutorial, but that
method can't index the attachments in CouchDB automatically. I want a way
to do that.


(David Pilato) #4

I would say that this remains a TODO in https://github.com/elasticsearch/elasticsearch/blob/master/plugins/river/couchdb/src/main/java/org/elasticsearch/river/couchdb/CouchdbRiver.java
Line 271

I don't think it's difficult to implement... I will probably work on it if nobody codes it in the next few months...

David :wink:
@dadoonet


(Goog Jobs) #5

I'm a beginner, but I would like to build a search engine to index
documents like .pdf, .doc, etc. Right now I don't know how to do it, so I
may have to put it on hold.
I use tapirwiki (a couchapp) to upload the docs. The method can't index
the attachments in CouchDB automatically. I look forward to your code.

best regards @funse [weibo.com]


(Goog Jobs) #6

Hi! Have you finished the code? Could you share it with me? Thanks very
much!


(David Pilato) #7

Hi there,

I just finished something to handle CouchDB attachments using elasticsearch-mapper-attachments.

Before going further, could you fork my code [1], compile it, launch the main test class CouchdbRiverBinaryAttachementTest, send some docs with one or more attachments, and see whether you can search for them?

I tried with a very simple PDF file and it seems to work fine.

I have started writing some documentation about it [2] (see the end of that page).

You can also download the plugin [3] and install it in place of the previous CouchDB plugin.

BTW, you need to have the elasticsearch-mapper-attachments plugin [4] installed first.

Please let me know whether it works for you.

Cheers,

David.

[1] https://github.com/dadoonet/elasticsearch-river-couchdb/tree/attachments

[2] https://github.com/dadoonet/elasticsearch.github.com/blob/b77ebec4e44c5d794d68cfd3c79fd2b3db2b120c/guide/reference/river/couchdb.textile

[3] https://github.com/downloads/dadoonet/elasticsearch-river-couchdb/elasticsearch-river-couchdb-1.1.0-SNAPSHOT.zip

[4] https://github.com/elasticsearch/elasticsearch-mapper-attachments


(David Pilato) #8

Did anyone test it?

BTW, I updated the README file:
https://github.com/dadoonet/elasticsearch-river-couchdb/blob/attachments/README.md
CouchDB river users, please let me know if there is any regression or
whether I can submit the pull request.

Thanks,
David.


(dungtc) #9

Hello,
I have tested it, but it does not work well.

I attached 4 files to 2 CouchDB documents, like this:
{
"_id": "Doc1",
"_rev": "5-4d607b7d88985097462ae9b2f67bc5ac",
"message": "Elastic Search",
"_attachments": {
"exam.docx": {
"content_type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"revpos": 4,
"digest": "md5-ecdBcsbc6w7mC1EOgd5SIg==",
"length": 10007,
"stub": true
},
"2230681.pdf": {
"content_type": "application/pdf",
"revpos": 2,
"digest": "md5-BUhqhHiVqKybxrfGsQTixQ==",
"length": 956146,
"stub": true
}
}
}

{
"_id": "Doc2",
"_rev": "7-dd58025abc2002566b6f458ad3d83d4d",
"message": "test attachments",
"_attachments": {
"TestAttachments.txt": {
"content_type": "text/plain",
"revpos": 6,
"digest": "md5-aLTD+adMRHPw2+WMIN/42Q==",
"length": 89,
"stub": true
},
"DynamicPublishingUseCases.doc": {
"content_type": "application/msword",
"revpos": 2,
"digest": "md5-FRdhydLr57C+q3ff6xLEmA==",
"length": 22528,
"stub": true
}
}
}

Here is my test with Elasticsearch:
curl -X PUT "localhost:9200/test_idx_couchdb_attachments"
{"ok":true,"acknowledged":true}

curl -XPUT 'http://localhost:9200/_river/test_river_couchdb_attachments/_meta' -d '{"type" : "couchdb", "couchdb" : {"host" : "localhost","port" : 5984,"db" : "my_test_couchdb_attachments","filter" : null,"ignore_attachments":false}},"index" : {"index" : "test_idx_couchdb_attachments", "type" : "test_mapping_couchdb_attachments" } }'
{"ok":true,"_index":"_river","_type":"test_river_couchdb_attachments","_id":"_meta","_version":1}
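
One thing worth checking in the river registration above: counting braces, the `}}` after `"ignore_attachments":false` appears to close the top-level object before the `"index"` section begins, so that section may not have been read as intended. This is an observation about the posted text, not something confirmed in the thread. A minimal Python sketch that builds the same body programmatically so the nesting cannot drift; the setting names are copied from the command above, and `build_river_meta` and `is_single_json_object` are hypothetical helpers, not part of any plugin:

```python
import json


def build_river_meta(db, index, doc_type, host="localhost", port=5984,
                     ignore_attachments=False):
    """Build a CouchDB river _meta body (hypothetical helper).

    The setting names are copied from the curl command in the post;
    nothing here is checked against the river plugin itself.
    """
    return {
        "type": "couchdb",
        "couchdb": {
            "host": host,
            "port": port,
            "db": db,
            "filter": None,
            "ignore_attachments": ignore_attachments,
        },
        # "index" is a sibling of "couchdb", not nested inside it.
        "index": {"index": index, "type": doc_type},
    }


def is_single_json_object(text):
    """Return True if text parses as exactly one JSON value."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


meta = build_river_meta(
    db="my_test_couchdb_attachments",
    index="test_idx_couchdb_attachments",
    doc_type="test_mapping_couchdb_attachments",
)
body = json.dumps(meta)

# The body as typed in the post: the object closes right after `false}}`.
as_posted = (
    '{"type" : "couchdb", "couchdb" : {"host" : "localhost","port" : 5984,'
    '"db" : "my_test_couchdb_attachments","filter" : null,'
    '"ignore_attachments":false}},"index" : {"index" : '
    '"test_idx_couchdb_attachments", "type" : '
    '"test_mapping_couchdb_attachments" } }'
)

print(is_single_json_object(body))
print(is_single_json_object(as_posted))
```

The `body` string is what would follow `-d` in the `curl -XPUT .../_meta` call.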

At first, I typed:

curl -X PUT http://127.0.0.1:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_mapping -d '{
"my_test_couchdb_attachments": {
"properties": {
"_attachments": {
"properties": {
"2230681.pdf": {
"type": "attachment", "index" : "analyzed"
},
"DynamicPublishingUseCases.doc": {
"type": "attachment", "index" : "analyzed"
},
"TestAttachments.txt": {
"type": "attachment", "index" : "analyzed"
},
"exam.docx": {
"type": "attachment", "index" : "analyzed"
}
}
},
"message" : {
"type": "string", "index" : "analyzed"
}
}
}
}'
and I received this result:
{"error":"MergeMappingException[Merge failed with failures {[Can't merge a non object mapping [TestAttachments.txt] with an object mapping [TestAttachments.txt], Can't merge a non object mapping [2230681.pdf] with an object mapping [2230681.pdf], Can't merge a non object mapping [exam.docx] with an object mapping [exam.docx], Can't merge a non object mapping [DynamicPublishingUseCases.doc] with an object mapping [DynamicPublishingUseCases.doc]]}]
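
The merge error suggests that these fields already had an object mapping when the `attachment` mapping was submitted, possibly because documents had been indexed and dynamically mapped first; that reading is an assumption, not something the thread confirms. Separately, the mapping body can be generated from the attachment names rather than typed by hand. A minimal Python sketch, where `build_attachment_mapping` is a hypothetical helper and the field layout simply mirrors the request above:

```python
import json


def build_attachment_mapping(doc_type, attachment_names):
    """Build a _mapping body that marks each named attachment field
    with the `attachment` type, mirroring the request in the post."""
    per_file = {
        name: {"type": "attachment", "index": "analyzed"}
        for name in attachment_names
    }
    return {
        doc_type: {
            "properties": {
                "_attachments": {"properties": per_file},
                "message": {"type": "string", "index": "analyzed"},
            }
        }
    }


names = [
    "2230681.pdf",
    "DynamicPublishingUseCases.doc",
    "TestAttachments.txt",
    "exam.docx",
]
mapping = build_attachment_mapping("my_test_couchdb_attachments", names)
print(json.dumps(mapping, indent=2))
```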

So I changed it, and this succeeded:
curl -X PUT http://127.0.0.1:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_mapping -d '{
"my_test_couchdb_attachments": {
"properties": {
"_attachments": {
"properties": {
""2230681.pdf"": {
"type": "attachment", "index" : "analyzed"
},
""DynamicPublishingUseCases.doc"": {
"type": "attachment", "index" : "analyzed"
},
""TestAttachments.txt"": {
"type": "attachment", "index" : "analyzed"
},
""exam.docx"": {
"type": "attachment", "index" : "analyzed"
}
}
},
"message" : {
"type": "string", "index" : "analyzed"
}
}
}
}'
{"ok":true,"acknowledged":true}

curl -XPOST 'http://localhost:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_search?pretty=true' -d '{"query" : {"wildcard" : { "_all" : "*" } } }'
This query works well and returns two documents.

These queries do not work; they return errors or none of the expected results:

curl -XGET 'http://localhost:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_search' -d '{"query" : {"text" : { "_attachments."2230681.pdf".content" : "Temperature" } } }'
{"took":0,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}

curl -XGET 'http://localhost:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_search' -d '{"query" : {"text" : { "_attachments.[2230681.pdf] : "Temperature" } } }'
{"error":"SearchPhaseExecutionException[Failed to execute phase [query], total failure; shardFailures {[CHYAFYCERMGHlBvKHiEagA][my_test_couchdb_attachments][3]: SearchParseException[[my_test_couchdb_attachments][3]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query" : {"text" : { "_attachments.[2230681.pdf] : "Temperature" } } }]]]; nested: QueryParsingException[[my_test_couchdb_attachments] Failed to parse]; nested: JsonParseException[Unexpected character ('T' (code 84)): was expecting a colon to separate field name and value\n at [Source: [B@582a85; line: 1, column: 56]]; }]
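
The parse error above points at the request body itself: the field name is missing its closing quote, so the JSON parser meets the `T` of `Temperature` where it expects a colon. Building the body with a JSON serializer sidesteps hand-quoting problems of this kind. A minimal Python sketch; `build_text_query` is a hypothetical helper, and the field paths are illustrative only, since the thread does not establish which field the extracted attachment text ends up in:

```python
import json


def build_text_query(field, term):
    """Build a `text` query body like the ones in the post.

    json.dumps takes care of quoting and escaping, so a field name
    containing dots or even double quotes still yields valid JSON.
    """
    return json.dumps({"query": {"text": {field: term}}})


# Field path is illustrative only; it is not confirmed by the thread.
body = build_text_query("_attachments.2230681.pdf", "Temperature")
print(body)

# Embedded double quotes are escaped rather than breaking the body.
quoted = build_text_query('_attachments."2230681.pdf".content', "Temperature")
print(quoted)
```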

curl -XPOST 'http://localhost:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_search?pretty=true' -d '{"query" : {"text" : { "_attachments."DynamicPublishingUseCases.doc"" : "Rendering" } } }'
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}

curl -XPOST 'http://localhost:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_search?pretty=true' -d '{"query" : {"text_phrase" : { "_attachments."TestAttachments.txt"" : "Couchdb" } } }'
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}

I used Elasticsearch 0.18.4 with the CouchDB river 1.1.0 and the mapper-attachments plugin 1.1.0.
Please explain how to index CouchDB attachments and make them searchable.
Maybe I should submit the PDF files with their content encoded in Base64? None of the files in my test are encoded.
Thanks a lot



(Goog Jobs) #10

Sorry, for the past few days I was busy taking exams. I will do it now.


(Goog Jobs) #11

It doesn't work well. The attachment content returned by the search is
a string of meaningless characters, like
"kAGEkADEkACokAUEkAAAzAEIqAE9KAwBRSgMAQ0oYAHN".


(David Pilato) #12

It's Base64 encoded, but you can still perform searches.
Search for a term that is in your doc, and you should find it.

David :wink:
@dadoonet


(David Pilato) #13

When you get your hit, you have to decode it (Base64).

David :wink:
@dadoonet
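
Decoding the returned value can be sketched as follows. This is a minimal Python example using the standard `base64` module; the layout of the hit (`_source`, then `_attachments`, then the file name, then `data`) is an assumption made for illustration and is not specified in the thread, and `decode_attachment` is a hypothetical helper:

```python
import base64


def decode_attachment(hit, name, field="data"):
    """Return the raw bytes of one Base64-encoded attachment in a hit.

    The hit layout (_source -> _attachments -> name -> field) is an
    assumption for this sketch; adjust the path to the real response.
    """
    encoded = hit["_source"]["_attachments"][name][field]
    return base64.b64decode(encoded)


# Fabricated hit for demonstration only, not real search output.
sample_hit = {
    "_source": {
        "_attachments": {
            "TestAttachments.txt": {
                "content_type": "text/plain",
                "data": base64.b64encode(b"Couchdb attachment test").decode("ascii"),
            }
        }
    }
}

raw = decode_attachment(sample_hit, "TestAttachments.txt")
print(raw.decode("utf-8"))
```

For binary formats such as .pdf or .doc, the decoded bytes are the original file, so they are better written to disk than printed.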


(Goog Jobs) #14

I used the REST API to search for a word that the attachment contains,
but got no hits.

curl "localhost:9200/tapirwiki/tapirwiki_search?pretty=true" -d '{
"query" : {
"query_string" : {
"query" : "svm"
}
}
}'
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}


(Goog Jobs) #15

I followed the quick steps in http://www.elasticsearch.org/tutorials/2010/08/01/couchb-integration.html
Could that have some effect?


(Goog Jobs) #16

Indeed, I got no hits.


(Goog Jobs) #17

I'm very sorry, I hadn't used ES for a long time and I made a mistake
somewhere. Good job, it works well, I'm sure. Thanks very much!!!


(David Pilato) #18

Sorry, I did not understand.
Is it working for you or not?

Did you try putting attachments in CouchDB as described in the README here:
https://github.com/dadoonet/elasticsearch-river-couchdb/tree/attachments ?

I didn't test the river with inline attachments.

David.
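
For readers unfamiliar with the term: in CouchDB, an inline attachment is one whose Base64-encoded bytes are sent inside the document's `_attachments` field, under `data`, instead of being uploaded separately. A minimal Python sketch of building such a document; the helper name `with_inline_attachment` and the sample values are hypothetical, and whether the river handles documents created this way is, per the post above, untested:

```python
import base64
import json


def with_inline_attachment(doc, name, content, content_type):
    """Return a copy of doc carrying `content` (bytes) as an inline
    CouchDB attachment: Base64 data under _attachments[name]["data"]."""
    result = dict(doc)
    attachments = dict(result.get("_attachments", {}))
    attachments[name] = {
        "content_type": content_type,
        "data": base64.b64encode(content).decode("ascii"),
    }
    result["_attachments"] = attachments
    return result


# Sample values are made up for illustration.
doc = with_inline_attachment(
    {"_id": "Doc2", "message": "test attachments"},
    "TestAttachments.txt",
    b"Couchdb attachment test",
    "text/plain",
)
print(json.dumps(doc, indent=2))
```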


(Goog Jobs) #19

It works for me! I'm a Chinese student and my English is so-so, sorry!


(dungtc) #20

Hi,
But it doesn't work for me.
I get the same results as in the mail I already sent (below).
Please explain it to me.
As you can see, I do not get any hits.
Did I make a mistake?
Thanks

----- Forwarded mail -----
From: Chi Dung Tran dungtctin4@yahoo.com
To: "elasticsearch@googlegroups.com" elasticsearch@googlegroups.com
Sent: Thursday, 29 December 2011, 16:42
Subject: Re: how to create index for a attachment of a doc in couchDB with ES?

Hello,
I have tested it, but it does not work well.

I attached 4 files to 2 CouchDB documents, like this:
{
"_id": "Doc1",
"_rev": "5-4d607b7d88985097462ae9b2f67bc5ac",
"message": "Elastic Search",
"_attachments": {
"exam.docx": {
"content_type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"revpos": 4,
"digest": "md5-ecdBcsbc6w7mC1EOgd5SIg==",
"length": 10007,
"stub": true
},
"2230681.pdf": {
"content_type": "application/pdf",
"revpos": 2,
"digest": "md5-BUhqhHiVqKybxrfGsQTixQ==",
"length": 956146,
"stub": true
}
}
}
{
"_id": "Doc2",
"_rev": "7-dd58025abc2002566b6f458ad3d83d4d",
"message": "test attachments",
"_attachments": {
"TestAttachments.txt": {
"content_type": "text/plain",
"revpos": 6,
"digest": "md5-aLTD+adMRHPw2+WMIN/42Q==",
"length": 89,
"stub": true
},
"DynamicPublishingUseCases.doc": {
"content_type": "application/msword",
"revpos": 2,
"digest": "md5-FRdhydLr57C+q3ff6xLEmA==",
"length": 22528,
"stub": true
}
}
}
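
A note on the documents above: every attachment carries "stub": true, which in CouchDB means the document holds only the attachment metadata, not the file body. CouchDB returns the body inline, as base64 in a "data" field, only when the document is requested with attachments=true. The sketch below, in Python, lists which attachments of a document are stubs. The helper name stub_attachments is made up for illustration, and the sketch does not say whether the river itself needs inline data.

```python
def stub_attachments(doc):
    """Return the names of attachments that carry metadata only (no inline data)."""
    attachments = doc.get("_attachments", {})
    return sorted(
        name
        for name, meta in attachments.items()
        if meta.get("stub") and "data" not in meta
    )


# Doc2 from the post above, reduced to the fields the check looks at
doc2 = {
    "_id": "Doc2",
    "message": "test attachments",
    "_attachments": {
        "TestAttachments.txt": {"content_type": "text/plain", "length": 89, "stub": True},
        "DynamicPublishingUseCases.doc": {"content_type": "application/msword", "length": 22528, "stub": True},
    },
}

print(stub_attachments(doc2))
```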

Here is my test with elastic search:
curl -X PUT "localhost:9200/test_idx_couchdb_attachments"
{"ok":true,"acknowledged":true}

curl -XPUT 'http://localhost:9200/_river/test_river_couchdb_attachments/_meta' -d '{"type" : "couchdb", "couchdb" : {"host" : "localhost","port" : 5984,"db" : "my_test_couchdb_attachments","filter" : null,"ignore_attachments":false}},"index" : {"index" : "test_idx_couchdb_attachments", "type" : "test_mapping_couchdb_attachments" } }'
{"ok":true,"_index":"_river","_type":"test_river_couchdb_attachments","_id":"_meta","_version":1}
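
A note on the river _meta body above: the "couchdb" object is followed by two closing braces, so a strict JSON parser treats the "index" block as trailing data after the end of the document. Elasticsearch answered "ok" here, so how 0.18.4 handled that trailing text is not visible in this thread. A minimal Python sketch to check the body before sending it; the "intended" shape is an assumption (the "index" block placed beside "couchdb"), not something confirmed in the thread.

```python
import json

# Body copied verbatim from the curl command above
posted = (
    '{"type" : "couchdb", "couchdb" : {"host" : "localhost","port" : 5984,'
    '"db" : "my_test_couchdb_attachments","filter" : null,'
    '"ignore_attachments":false}},'
    '"index" : {"index" : "test_idx_couchdb_attachments", '
    '"type" : "test_mapping_couchdb_attachments" } }'
)


def check(body):
    """Report whether body is one complete JSON document."""
    try:
        json.loads(body)
        return "valid"
    except ValueError as exc:
        return "invalid: %s" % exc


# Assumed intended shape: building it from a dict keeps the braces balanced
intended = json.dumps({
    "type": "couchdb",
    "couchdb": {
        "host": "localhost",
        "port": 5984,
        "db": "my_test_couchdb_attachments",
        "filter": None,
        "ignore_attachments": False,
    },
    "index": {
        "index": "test_idx_couchdb_attachments",
        "type": "test_mapping_couchdb_attachments",
    },
})

print(check(posted))
print(check(intended))
```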
At first, I typed:
curl -X PUT http://127.0.0.1:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_mapping -d '{
"my_test_couchdb_attachments": {
"properties": {
"_attachments": {
"properties": {
"2230681.pdf": {
"type": "attachment", "index" : "analyzed"
},
"DynamicPublishingUseCases.doc": {
"type": "attachment", "index" : "analyzed"
},
"TestAttachments.txt": {
"type": "attachment", "index" : "analyzed"
},
"exam.docx": {
"type": "attachment", "index" : "analyzed"
}
}
},
"message" : {
"type": "string", "index" : "analyzed"
}
}
}
}'
and I received this result:
{"error":"MergeMappingException[Merge failed with failures {[Can't merge a non object mapping [TestAttachments.txt] with an object mapping [TestAttachments.txt], Can't merge a non object mapping [2230681.pdf] with an object mapping [2230681.pdf], Can't merge a non object mapping [exam.docx] with an object mapping [exam.docx], Can't merge a non object mapping [DynamicPublishingUseCases.doc] with an object mapping [DynamicPublishingUseCases.doc]]}]
So I changed it and it succeeded:
curl -X PUT http://127.0.0.1:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_mapping -d '{
"my_test_couchdb_attachments": {
"properties": {
"_attachments": {
"properties": {
""2230681.pdf"": {
"type": "attachment", "index" : "analyzed"
},
""DynamicPublishingUseCases.doc"": {
"type": "attachment", "index" : "analyzed"
},
""TestAttachments.txt"": {
"type": "attachment", "index" : "analyzed"
},
""exam.docx"": {
"type": "attachment", "index" : "analyzed"
}
}
},
"message" : {
"type": "string", "index" : "analyzed"
}
}
}
}'
{"ok":true,"acknowledged":true}

curl -XPOST 'http://localhost:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_search?pretty=true' -d '{"query" : {"wildcard" : { "_all" : "*" } } }'
This query works well and returns two documents.

These queries do not work: they return errors or no results.

curl -XGET 'http://localhost:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_search' -d '{"query" : {"text" : { "_attachments."2230681.pdf".content" : "Temperature" } } }'
{"took":0,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}

curl -XGET 'http://localhost:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_search' -d '{"query" : {"text" : { "_attachments.[2230681.pdf] : "Temperature" } } }'
{"error":"SearchPhaseExecutionException[Failed to execute phase [query], total failure; shardFailures {[CHYAFYCERMGHlBvKHiEagA][my_test_couchdb_attachments][3]: SearchParseException[[my_test_couchdb_attachments][3]: from[-1],size[-1]: Parse Failure [Failed to parse source [{"query" : {"text" : { "_attachments.[2230681.pdf] : "Temperature" } } }]]]; nested: QueryParsingException[[my_test_couchdb_attachments] Failed to parse]; nested: JsonParseException[Unexpected character ('T' (code 84)): was expecting a colon to separate field name and value\n at [Source: [B@582a85; line: 1, column: 56]]; }]

curl -XPOST 'http://localhost:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_search?pretty=true' -d '{"query" : {"text" : { "_attachments."DynamicPublishingUseCases.doc"" : "Rendering" } } }'
{ "took" : 0, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 0, "max_score" : null, "hits" : [ ] }}

curl -XPOST 'http://localhost:9200/my_test_couchdb_attachments/my_test_couchdb_attachments/_search?pretty=true' -d '{"query" : {"text_phrase" : { "_attachments."TestAttachments.txt"" : "Couchdb" } } }'
{ "took" : 0, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 0, "max_score" : null, "hits" : [ ] }}

I used Elasticsearch 0.18.4 with the CouchDB river 1.1.0 and the mapper-attachments plugin 1.1.0.

Please explain how to index CouchDB attachments and make them searchable.
Maybe I should submit the PDF files with their content encoded in base64? None of the files in my test are encoded.

Thanks a lot
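
On the base64 question: the attachment field type from elasticsearch-mapper-attachments takes the file content as a base64-encoded string, so a document indexed by hand has to carry the encoded bytes. A minimal Python sketch of that encoding step follows. The field name "file" and the helper build_attachment_doc are hypothetical, and the sketch covers only the encoding, not how the river fetches attachments from CouchDB.

```python
import base64
import json


def build_attachment_doc(message, field, payload):
    """Return a JSON document whose `field` holds `payload` as base64 text."""
    encoded = base64.b64encode(payload).decode("ascii")
    return json.dumps({"message": message, field: encoded})


# Stand-in bytes; for a real file, read it in binary mode:
#   with open(path, "rb") as fh: payload = fh.read()
body = build_attachment_doc("test attachments", "file", b"Couchdb attachment test")
print(body)
```

The resulting string can be used as the request body when indexing a document into a type whose mapping declares that field with "type": "attachment".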



From: David Pilato david@pilato.fr
To: elasticsearch elasticsearch@googlegroups.com
Sent: Wednesday, 28 December 2011 14:12
Subject: Re: how to create index for a attachment of a doc in couchDB with ES?

Did anyone test it?

BTW, I updated the README file:
https://github.com/dadoonet/elasticsearch-river-couchdb/blob/attachments/README.md

Please let me know (CouchDB river users) if there is any regression or
if I can submit the pull request.

Thanks,
David.

On 22 Dec, 00:11, "David Pilato" da...@pilato.fr wrote:

> Hi there,
>
> I just finished something to deal with couchDb attachments using elasticsearch-mapper-attachments.
>
> Before going further, is it possible for you to fork my code [1], compile it and launch the main test class CouchdbRiverBinaryAttachementTest and send some docs with one or more attachments and see if you can search for it ?
>
> I tried with a very simple PDF file and it seems to work fine.
>
> I start to write a little documentation about it [2] (see at the end).
>
> You can also download the plugin [3] and install it instead of the previous couchDb plugin.
>
> BTW, you should have installed before the elasticsearch-mapper-attachments plugin [4].
>
> Please let me know if it’s working or not for you.
>
> Cheers,
>
> David.
>
> [1] https://github.com/dadoonet/elasticsearch-river-couchdb/tree/attachments
> [2] https://github.com/dadoonet/elasticsearch.github.com/blob/b77ebec4e44...
> [3] https://github.com/downloads/dadoonet/elasticsearch-river-couchdb/ela...
> [4] https://github.com/elasticsearch/elasticsearch-mapper-attachments