I have a Filebeat data pipeline that ingests metrics into an Elasticsearch index. A separate Python application fetches some of this data through the SQL API to display it in a web application. The problem arises when I need to perform bulk edits: I use the Bulk API for this, and one of its requirements is that each action provides the _id metadata of the document, but I cannot extract that field in the SQL query.
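For context, the reads and writes look roughly like this; the index name, columns, and values are only placeholders:

# SQL query used by the Python application (simplified); _id is not available here
POST _sql?format=txt
{
  "query": "SELECT metric_name, metric_value FROM \"mdb-test*\""
}

# Bulk update that needs the document _id in the action metadata
POST _bulk
{ "update": { "_index": "mdb-test-000001", "_id": "<document id>" } }
{ "doc": { "metric_value": 200 } }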
I tried the following solution:
PUT _ingest/pipeline/cm_metadata_fields
{
  "processors": [
    {
      "script": {
        "description": "Copies the _id metadata into a doc_id field in the _source of each document",
        "lang": "painless",
        "source": """
          ctx['doc_id'] = ctx['_id'];
        """
      }
    }
  ]
}
PUT _index_template/cmdb
{
  "index_patterns": ["mdb-test*"],
  "template": {
    "settings": {
      "index.default_pipeline": "cm_metadata_fields"
    }
  }
}
However, when the data is ingested, the _id is not copied; doc_id is saved as null (a minimal reproduction is sketched below). Is there a similar option available?
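For reference, this is roughly how I can reproduce the behavior outside of Filebeat; the index name and document are made up, and, like Filebeat, the request does not set an explicit _id:

# Index a test document without an explicit _id, as Filebeat does
POST mdb-test-000001/_doc
{
  "metric_name": "cpu_usage",
  "metric_value": 42
}

# Fetching it back shows "doc_id": null in the stored _source
GET mdb-test-000001/_search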