Renaming field during insert based on type

Could I have some guidance on the correct way to append a suffix indicating type to a field name during the insertion process?

So, for example, inserting {"a": "2015-09-21"} would create a document like {"a_date": 2015-09-21} (stored as a date type)

and

inserting {"a": "abc"} would create a document like {"a_string": "abc"}

Neither document would have a key for "a".

Dynamic Templates seem to get part of the way there as you can differentiate based on type and add new keys, but it looks like I can't get rid of the original key.

If I need an ingest plugin to accomplish this, could someone provide a piece of code that performs a basic name transformation based on type?

I would indeed use an ingest pipeline for a case like this. For your example, three processors are enough: a date processor that parses a into a_date, with ignore_failure: true so the step is skipped when the value is not a date; a set processor with a condition that copies a to a_string only if a_date does not exist; and finally a remove processor that drops the original field a from the document.

Here's an example pipeline:

PUT _ingest/pipeline/maybe-dates
{
  "processors": [
    {
      "date": {
        "field": "a",
        "target_field": "a_date",
        "formats": ["yyyy-MM-dd"],
        "ignore_failure": true
      }
    },
    {
      "set": {
        "if": "ctx.a_date == null",
        "field": "a_string",
        "value": "{{a}}"
      }
    },
    {
      "remove": {
        "field": "a"
      }
    }
  ]
}

And here's a call to the simulate pipeline API with a couple of example docs demonstrating that it works.

POST _ingest/pipeline/maybe-dates/_simulate
{
  "docs": [
    {
      "_source": {
        "a": "asdf"
      }
    },
    {
      "_source": {
        "a": "2019-02-15"
      }
    }
  ]
}

This should return something like:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "a_string" : "asdf"
        },
        "_ingest" : {
          "timestamp" : "2019-02-15T23:20:21.918Z"
        }
      }
    },
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "a_date" : "2019-02-15T00:00:00.000Z"
        },
        "_ingest" : {
          "timestamp" : "2019-02-15T23:20:21.918Z"
        }
      }
    }
  ]
}
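
To run the pipeline on real documents rather than simulated ones, you can name it in the pipeline query parameter of the index request. A minimal sketch, assuming a hypothetical index called my-index and an arbitrary document ID:

PUT my-index/_doc/1?pipeline=maybe-dates
{
  "a": "2015-09-21"
}

The stored document should then contain a_date and no a. If you would rather not pass the parameter on every request, setting index.default_pipeline on the index is another option.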

If you need other types, you can do something similar with the convert processor instead of date.
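
As a rough sketch of that idea, the pipeline below adds a convert step for whole numbers. The pipeline name maybe-typed and the a_long suffix are made up for illustration, and the conditions assume you want date to win over long, and long to win over string:

PUT _ingest/pipeline/maybe-typed
{
  "processors": [
    {
      "date": {
        "field": "a",
        "target_field": "a_date",
        "formats": ["yyyy-MM-dd"],
        "ignore_failure": true
      }
    },
    {
      "convert": {
        "if": "ctx.a_date == null",
        "field": "a",
        "target_field": "a_long",
        "type": "long",
        "ignore_failure": true
      }
    },
    {
      "set": {
        "if": "ctx.a_date == null && ctx.a_long == null",
        "field": "a_string",
        "value": "{{a}}"
      }
    },
    {
      "remove": {
        "field": "a"
      }
    }
  ]
}

With this in place, a value such as "42" should end up in a_long, while "asdf" still falls through to a_string. You can check the behaviour with the simulate API before relying on it.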
