Defining Elasticsearch mapping for ingest-attachment inner field

I'm building an application on top of an Elasticsearch index. Several "inner" fields can contain binary data (mainly PDF), and I'm looking for the best way to define my pipeline and mapping, given that:

  • all fields and contents can be provided in several languages (French and English) and in several fields
  • I have to be able to query contents for a given language and/or for a given field.

This is how I have defined my mapping so far:

{
    "WfNewsEvent": {
        "properties": {
            "title": {
                "type": "object",
                "properties": {
                    "en": {
                        "type": "string"
                    },
                    "fr": {
                        "type": "string",
                        "analyzer": "french",
                        "search_analyzer": "french_search"
                    }
                }
            },
            ...
            "extfile": {
                "type": "object",
                "properties": {
                    "title": {
                        "type": "object",
                        "properties": {
                            "en": {
                                "type": "string"
                            },
                            "fr": {
                                "type": "string",
                                "analyzer": "french",
                                "search_analyzer": "french_search"
                            }
                        }
                    },
                    "description": {
                        "type": "object",
                        "properties": {
                            "en": {
                                "type": "string"
                            },
                            "fr": {
                                "type": "string",
                                "analyzer": "french",
                                "search_analyzer": "french_search"
                            }
                        }
                    },
                    "data": {
                        "type": "object",
                        "properties": {
                            "en": {
                                "type": "attachment"
                            },
                            "fr": {
                                "type": "attachment",
                                "analyzer": "french",
                                "search_analyzer": "french_search"
                            }
                        }
                    }
                }
            },
            "gallery": {
                "type": "object",
                "properties": {
                    "title": {
                        "type": "object",
                        "properties": {
                            "en": {
                                "type": "string"
                            },
                            "fr": {
                                "type": "string",
                                "analyzer": "french",
                                "search_analyzer": "french_search"
                            }
                        }
                    },
                    "description": {
                        "type": "object",
                        "properties": {
                            "en": {
                                "type": "string"
                            },
                            "fr": {
                                "type": "string",
                                "analyzer": "french",
                                "search_analyzer": "french_search"
                            }
                        }
                    },
                    "data": {
                        "type": "object",
                        "properties": {
                            "en": {
                                "type": "attachment"
                            },
                            "fr": {
                                "type": "attachment",
                                "analyzer": "french",
                                "search_analyzer": "french_search"
                            }
                        }
                    }
                }
            }
        }
    }
}

And here is my 'attachment' pipeline definition:

{
  "description" : "Extract attachment information",
  "processors" : [
    {
      "attachment" : {
        "field" : "extfile.data.en",
        "ignore_missing": true
      }
    },
    {
      "attachment" : {
        "field" : "extfile.data.fr",
        "ignore_missing": true
      }
    },
    {
      "attachment" : {
        "field" : "gallery.data.en",
        "ignore_missing": true
      }
    },
    {
      "attachment" : {
        "field" : "gallery.data.fr",
        "ignore_missing": true
      }
    }
  ]
}

Currently, when I try to index a document, ES raises an exception saying that "data" is not an integer. Any help would be greatly appreciated!

Best regards,
Thierry

You should not use the mapper-attachments plugin anymore: it has been deprecated and will be removed in 6.0.

Use only ingest-attachment instead.

So basically don't use:

"type": "attachment"
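
As a rough sketch of what the mapping could look like without the "attachment" type: the base64 source can be declared as "binary", and the text extracted by the ingest processor mapped as an ordinary analyzed field. This assumes Elasticsearch 5.x (where "text" replaces "string") and assumes the pipeline writes its output to "extfile.attachment.en" and "extfile.attachment.fr"; those target names are hypothetical and not from the original post. The "french_search" analyzer is the custom one referenced in the question and must be defined in the index settings.

```json
{
  "extfile": {
    "properties": {
      "data": {
        "properties": {
          "en": { "type": "binary" },
          "fr": { "type": "binary" }
        }
      },
      "attachment": {
        "properties": {
          "en": {
            "properties": {
              "content": { "type": "text" }
            }
          },
          "fr": {
            "properties": {
              "content": {
                "type": "text",
                "analyzer": "french",
                "search_analyzer": "french_search"
              }
            }
          }
        }
      }
    }
  }
}
```

With a layout like this, a French-only search would target "extfile.attachment.fr.content", while the same pattern can be repeated under "gallery".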

Hi David,
I finally got the ingest-attachment plugin working.
However, I wanted the attachment source to be an inner object property (like "extfile.data.fr"), and the only solution I have found so far is to make it a first-level property.
Is there a way to define an ingest-attachment pipeline for inner properties?

Yes. This should work on inner fields.
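
For illustration, a minimal sketch of a pipeline reading from inner fields with dot notation. The "target_field" option defaults to "attachment", so without it every processor would write to the same top-level field and overwrite the previous result; the per-language target names below are assumptions chosen for this example, not something from the original post.

```json
{
  "description": "Extract attachment information from inner fields, per language",
  "processors": [
    {
      "attachment": {
        "field": "extfile.data.en",
        "target_field": "extfile.attachment.en",
        "ignore_missing": true
      }
    },
    {
      "attachment": {
        "field": "extfile.data.fr",
        "target_field": "extfile.attachment.fr",
        "ignore_missing": true
      }
    }
  ]
}
```

The same two processors can be repeated for "gallery.data.en" and "gallery.data.fr". A definition like this can be tried against a sample document with the pipeline simulate API before any real indexing.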

Do you have a complete, non-working example?
