Ingest: transforming multiple values in an array

Hi @nemhods,

Welcome to the Elastic community. Yes, I think you will need a script processor here, since this requirement is fairly custom.

Here is something that worked for me.

Create a pipeline

PUT _ingest/pipeline/test-p1
{
  "description": "Convert email addresses to both full and username format",
  "processors": [
    {
      "script": {
        "source": """
          // Skip documents that have no related.user array.
          if (ctx.related?.user == null) {
            return;
          }
          // Build the result in a local list instead of a temporary field.
          def users = new ArrayList();
          for (def email : ctx.related.user) {
            // Add the part before '@' (the username), then the full address.
            users.add(email.splitOnToken('@')[0]);
            users.add(email);
          }
          ctx.related.user = users;
        """,
        "lang": "painless"
      }
    }
  ]
}
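
Simulate the pipeline (optional)

If you want to try the script before indexing anything, the simulate pipeline API should do it. This is only a minimal sketch: it assumes the pipeline was saved as test-p1 as above, and the addresses are made-up sample values.

POST _ingest/pipeline/test-p1/_simulate
{
  "docs": [
    {
      "_source": {
        "related": {
          "user": [
            "test@domain.com",
            "test1@domain1.com"
          ]
        }
      }
    }
  ]
}

The response should contain the transformed document under docs[0].doc._source, so you can check that related.user holds each username followed by its full address without writing to an index.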

Index sample data

POST test-index1/_doc?pipeline=test-p1
{
  "related":{
    "user":[
      "test@domain.com",
      "test1@domain1.com",
      "test2@domain2.com"
    ]
  }
}

Search the index to check the result

GET test-index1/_search

Response

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test-index1",
        "_id": "KuIriosBa7YTodDiw2wW",
        "_score": 1,
        "_source": {
          "related": {
            "user": [
              "test",
              "test@domain.com",
              "test1",
              "test1@domain1.com",
              "test2",
              "test2@domain2.com"
            ]
          }
        }
      }
    ]
  }
}