Searching attachment content with ingest attachment plugin ES 5.2

Hi All,

I am facing an issue while searching attachment data. I have installed the ingest attachment plugin and created a foreach processor, because the attachments are stored as an array. The mapping is:

"attachment": {
   "properties": {
      "attachment_data": {
         "type": "text",
         "store": true,
         "term_vector": "with_positions_offsets"
      },
      "attachment_id": {
         "type": "text",
         "store": true,
         "term_vector": "with_positions_offsets"
      }
   }
}

One of the records has the following data:

"attachment": {
           "content_type": "xyz",
           "language": "it",
           "content": "xyz",
           "attachment_data": "base64 encoded",
           "attachment_id": "xxxxx"
}

But when I search for "xyz", no records are returned. The search is:

"query": {
"bool" : {
"must": [{ "query_string" : {
"fields": ["_all"],
"query": "xyz"
}}],

I have tried with:

"query": {
"bool" : {
"must": [{ "query_string" : {
"fields": ["attachment.attachment_data"],
"query": "xyz"
}}],

or even:
"query": {
"bool" : {
"must": [{ "query_string" : {
"fields": ["attachment.content"],
"query": "xyz"
}}],

But every time I get 0 results.

Any help is appreciated.

Best,
Divya

Please format your code using the </> icon as explained in this guide. It will make your post more readable.

Or use markdown style like:

```
CODE
```

If you provide a full recreation script it can be easier to help.

```
"attachment": {
       "content_type": "xyz",
       "language": "it",
       "content": "xyz",
       "attachment_data": "base64 encoded",
       "attachment_id": "xxxxx"
}
```

But when I search for "xyz", no records are returned. The search is:

```
"query": {
"bool" : {
"must": [{ "query_string" : {
"fields": ["_all"],
"query": "xyz"
}}],
```

I have tried with:

```
"query": {
"bool" : {
"must": [{ "query_string" : {
"fields": ["attachment.attachment_data"],
"query": "xyz"
}}],
```

or even:

```
"query": {
"bool" : {
"must": [{ "query_string" : {
"fields": ["attachment.content"],
"query": "xyz"
}}],
```

I appreciate your help.

Did you read my answer?

The mapping is:

               "attachment": {
                  "properties": {
                     "attachment_data": {
                        "type": "text",
                        "store": true,
                        "term_vector": "with_positions_offsets",
                        "fielddata": true
                     },
                     "attachment_id": {
                        "type": "text",
                        "store": true,
                        "term_vector": "with_positions_offsets"
                     },

The ingest attachment plugin was used, and the processor and pipeline were created as:

"processors": [
         {
            "foreach": {
               "field": "attachment",
               "processor": {
                  "attachment": {
                     "target_field": "_ingest._value.attachment",
                     "field": "_ingest._value.attachment_data"
                  }
               }
            }
 

I am not able to search the content. It is visible in the results when /_search is used, but a match query on the attachment.attachment.content field finds nothing. My search is:

  "query": {
    "match": {
      "attachment.attachment.content":"word"
    }
  }

Apologies for the inconvenient formatting.

Best,
Divya
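As a quick sanity check on the field name used in the match query, the sketch below derives the dotted path where the extracted text should land, given the `foreach` field and the `target_field` from the pipeline shown in this post. It assumes that `_ingest._value` refers to the current array element, so a target of `_ingest._value.attachment` writes into each element. The helper name `searchable_path` is hypothetical and not part of any Elasticsearch client.

```python
# Prefix that the foreach processor uses to refer to the current array element.
INGEST_VALUE_PREFIX = "_ingest._value."


def searchable_path(foreach_field, target_field, leaf="content"):
    """Return the dotted field path where the extracted text should end up.

    Hypothetical helper for illustration only.
    """
    if not target_field.startswith(INGEST_VALUE_PREFIX):
        # The target is not relative to the array element, so treat it as a
        # top-level path.
        return f"{target_field}.{leaf}"
    relative = target_field[len(INGEST_VALUE_PREFIX):]
    return f"{foreach_field}.{relative}.{leaf}"


# Values copied from the pipeline in this post.
print(searchable_path("attachment", "_ingest._value.attachment"))
# attachment.attachment.content
```

With the pipeline as posted, this agrees with the `attachment.attachment.content` field used in the match query, so the field name by itself does not explain the empty result.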

If you provide a full recreation script it can be easier to help.

How can I replay what you are doing without a script?

As explained in About the Elasticsearch category, provide something like:

DELETE index
PUT index/type/1
{
  "foo": "bar"
}
GET index/type/_search
{
  "query": {
    "match": {
      "foo": "bar"
    }
  }
}

Please try with the minimal settings/mappings/content...
If this forum rejects your post because of the number of characters, you can post your full script on gist.github.com and paste the link here.
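For this particular problem, a recreation script in that style might look roughly like the output of the sketch below. It only uses the Python standard library to print console-style requests and does not talk to a cluster. The index, type, and pipeline names, the document ID, and the sample attachment value are all assumptions chosen for illustration, not the poster's actual setup.

```python
import json

# Hypothetical names, chosen only for illustration.
INDEX, DOC_TYPE, PIPELINE = "my_index", "my_type", "attachment"

# Pipeline that runs the attachment processor on every array element.
pipeline = {
    "description": "Extract attachment information from arrays",
    "processors": [{
        "foreach": {
            "field": "attachments",
            "processor": {
                "attachment": {
                    "target_field": "_ingest._value.attachment",
                    "field": "_ingest._value.data",
                }
            },
        }
    }],
}

# One document with a single base64-encoded text attachment (sample value).
document = {
    "attachments": [
        {"filename": "ipsum.txt", "data": "dGhpcyBpcwpqdXN0IHNvbWUgdGV4dAo="}
    ]
}

# Search for a word expected inside the extracted content.
query = {"query": {"match": {"attachments.attachment.content": "just"}}}


def console_request(method, path, body=None):
    """Format one request the way it would be pasted into the console."""
    lines = [f"{method} {path}"]
    if body is not None:
        lines.append(json.dumps(body, indent=2))
    return "\n".join(lines)


def build_script():
    """Assemble the delete, pipeline, index, and search requests in order."""
    return "\n".join([
        console_request("DELETE", INDEX),
        console_request("PUT", f"_ingest/pipeline/{PIPELINE}", pipeline),
        console_request(
            "PUT", f"{INDEX}/{DOC_TYPE}/1?pipeline={PIPELINE}&refresh=true", document
        ),
        console_request("GET", f"{INDEX}/_search", query),
    ])


if __name__ == "__main__":
    print(build_script())
```

The printed requests can be pasted into the console as-is. Adding the real mapping as an extra `PUT` before indexing would bring the script closer to the reported setup.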

Thanks, David, for consistently responding to my problem.
My issue turns out to be related to:

https://www.elastic.co/guide/en/elasticsearch/plugins/master/ingest-attachment-with-arrays.html

I am unable to search for any term inside the decoded content of the attachment.

```
"attachments" : [
  {
    "filename" : "ipsum.txt",
    "data" : "dGhpcyBpcwpqdXN0IHNvbWUgdGV4dAo=",
    "attachment" : {
      "content_type" : "text/plain; charset=ISO-8859-1",
      "language" : "en",
      "content" : "this is\njust some text",
      "content_length" : 24
    }
  }
]
```

I cannot use a match query to find the word "just".

However, the filename in the same document is searchable with "attachments.filename".

Please let me know if this helps in understanding the use-case.

Best,
Divya
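To confirm that the word "just" really is part of the attachment in the example above, the sketch below decodes the base64 `data` value and splits it into lowercase terms. The tokenizer is a rough stand-in (lowercase, split on non-alphanumeric characters) and is an assumption for illustration, not the actual analyzer Elasticsearch applies to the field.

```python
import base64
import re


def decode_attachment(data_b64):
    """Decode the base64 payload of a plain-text attachment."""
    return base64.b64decode(data_b64).decode("utf-8")


def simple_tokens(text):
    """Lowercase the text and split it on non-alphanumeric characters.

    Simplified approximation of text analysis, for illustration only.
    """
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


# The "data" value from the example in this post.
text = decode_attachment("dGhpcyBpcwpqdXN0IHNvbWUgdGV4dAo=")
tokens = simple_tokens(text)

print(tokens)            # ['this', 'is', 'just', 'some', 'text']
print("just" in tokens)  # True
```

Since the term is present in the decoded text, the next things to compare are the field path used in the query and the mapping of the `attachments` field.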

Why did you open a new discussion? Can you remove it?

Can you please provide a full script I can use to reproduce your problem locally?
