Attachments questions


(rtulloh) #1

I am trying to index a PDF attachment. I am using 0.17.0 (built from source). My index and mapping are created like this:

curl -XPOST 10.181.18.66:9200/test -d '{
  "settings" : {
    "number_of_shards" : 1
  },
  "mappings" : {
    "docitem" : {
      "properties" : {
        "my_attachment" : { "type" : "attachment" }
      }
    }
  }
}'

I have installed the mapper-attachments and analysis-icu plugins.

[2011-06-02 17:43:45,571][INFO ][plugins ] [Day, Wilbur] loaded [mapper-attachments, analysis-icu], sites []

I then wrote a small Java program to feed in the document of interest. I tried sending both the raw bytes and the base64-encoded bytes when calling the API. Here is the relevant code:

public void index(Client client, String filename) throws ElasticSearchException, IOException {

IndexResponse response = client.prepareIndex("test", "docitem", "1")
    .setSource(jsonBuilder()
                .startObject()
                    .field("_content_type", "application/pdf")
                    .field("_name", filename)
                    .field("attachment", getBytes(filename)) // getBase64(filename)
                .endObject()
              )
    .execute()
    .actionGet();

}

When I search for the document, I don't see any of the attachment metadata being created, and searches for words in the content return no hits. Am I missing something?

"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [ {
"_index" : "test",
"_type" : "docitem",
"_id" : "1",
"_score" : 1.0, "_source" : {"_content_type":"application/pdf","_name":"test.pdf","attachment":"JVBERi0xLjQKJaqrrK0KNCAwIG9iago8PAovUHJvZHVjZXIgKEFwYWN


(rtulloh) #2

Whoops, I had to change 'attachment' to 'my_attachment' in the Java code so that the field name matches the one in the mapping, and now things are searchable as I would expect. Sorry for the bother.

public void index(Client client, String filename) throws ElasticSearchException, IOException {

IndexResponse response = client.prepareIndex("test", "docitem", "1")
    .setSource(jsonBuilder()
                .startObject()
                    .field("_content_type", "application/pdf")
                    .field("_name", filename)
                    .field("my_attachment", getBase64(filename))
                .endObject()
              )
    .execute()
    .actionGet();

}
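
For anyone following along: the getBase64 helper is not shown in the code above. A minimal sketch of what it might look like, assuming Java 8 or later (for java.util.Base64) and a file small enough to read into memory. The class name AttachmentEncoding and the encode method are hypothetical names used only for illustration; they are not part of Elasticsearch or of my original program.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

// Hypothetical helper class; not part of Elasticsearch or the original program.
class AttachmentEncoding {

    // Encode raw bytes as a standard (RFC 4648) base64 string.
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Read the whole file into memory and return its base64 form.
    // Assumption: the file is small enough to fit in memory.
    static String getBase64(String filename) throws IOException {
        return encode(Files.readAllBytes(Paths.get(filename)));
    }
}
```

As a sanity check, a PDF that begins with the header "%PDF-1.4" followed by a newline encodes to a string starting with "JVBERi0xLjQK", which matches the start of the "attachment" value in the _source shown in the first post.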

