Use conditional if in a Set processor

Hi there!

The title is fairly self-explanatory. How do I use a set processor in an ingest pipeline only if another field exists?

I mean, I want to copy the value of field_A into another field, field_B, only if field_A exists.
I guess it can be done, since the Set processor has an 'if' parameter, but I don't know the correct syntax for it.

Also, field_A is a nested field.

Ideas?

Hey,

Try using the if field in the processor:

{
  "set": {
    "if": "ctx.foo == 'someValue'",
    "field": "found",
    "value": true
  }
}

See https://www.elastic.co/guide/en/elasticsearch/reference/7.1/ingest-processors.html

Hi!

Thank you so much. It apparently works, although I cannot access nested fields with the same syntax I use in the conditional if of the script processor.

I mean, what if I had to do something like:

`"if": "ctx[outer_obj].nested_field == 'someValue'"`

Try ctx.outer_obj.nested_field or ctx['outer_obj'].nested_field (single quotes avoid having to escape double quotes inside the JSON string).

I think I tried both with no success.

Anyway, I'll try them again ASAP and let you know. The info might help someone else!

Please provide a full-blown example using the simulate endpoint and we can try to help.

OK, I tested it and both of your solutions worked like a charm. Maybe I was working with the wrong input (a field containing a dot in its name without being a nested object).

Anyway, I'll share my test (where I included several combinations) so it may help somebody with the syntax:

POST _ingest/pipeline/_simulate
{
  "docs": [ 
    { "_source" : { "outer": {"inner": "X"} } }, 
    { "_source" : { "outer": {"inner": "Y"} } }, 
    { "_source" : { "outer": {"inner": "Z"} } }, 
    { "_source" : { "outer": {"inner": "T"} } } 
    ],
  "pipeline": {
    "processors": [
      {
        "set": {
          "if": "ctx.containsKey('outer') && ctx['outer'].containsKey('inner') && ctx.outer.inner == 'Y'",
          "field": "added_field",
          "value": "added_value"
        }
      }
    ]
  }
}

RESPONSE:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "outer" : {
            "inner" : "X"
          }
        },
        "_ingest" : {
          "timestamp" : "2019-06-12T08:24:23.360Z"
        }
      }
    },
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "outer" : {
            "inner" : "Y"
          },
          "added_field" : "added_value"
        },
        "_ingest" : {
          "timestamp" : "2019-06-12T08:24:23.360Z"
        }
      }
    },
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "outer" : {
            "inner" : "Z"
          }
        },
        "_ingest" : {
          "timestamp" : "2019-06-12T08:24:23.360Z"
        }
      }
    },
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "outer" : {
            "inner" : "T"
          }
        },
        "_ingest" : {
          "timestamp" : "2019-06-12T08:24:23.360Z"
        }
      }
    }
  ]
}

Thank you so much for your precious help!

Just to cover all possibilities, what if I wanted to set a non-nested field with dots in its name in the "field" part? Let's suppose that, if a field inner nested in an object outer is present, I want to add an additional top-level field literally named fake.nested.field.

If I try this:

POST _ingest/pipeline/_simulate
{
  "docs": [ 
    { "_source" : { "outer": {"inner": "X"} } }, 
    { "_source" : { "outer": {"inner": "Y"} } }, 
    { "_source" : { "outer": {"inner": "Z"} } }, 
    { "_source" : { "outer": {"inner": "T"} } } 
    ],
  "pipeline": {
    "processors": [
      {
        "set": {
          "if": "ctx.containsKey('outer') && ctx['outer'].containsKey('inner')",
          "field": "fake.nested.field",
          "value": "fake_nested_value"
        }
      }
    ]
  }
}

I'd end up with documents like the following:

{
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "outer" : {
            "inner" : "X"
          },
          "fake" : {
            "nested" : {
              "field" : "fake_nested_value"
            }
          }
        },
        "_ingest" : {
          "timestamp" : "2019-06-12T09:35:05.424Z"
        }
      }
    }

As you can see, the fake.nested.field is not so fake.

How can I add "fake.nested.field": "fake_nested_value"?

Thanks!

One last nit. No need to hassle with containsKey, just use

"if": "ctx?.outer?.inner == 'Y'"
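
Going back to the original question (copy a nested field only when it exists), a minimal sketch with the null-safe operator might look like the following. The names outer.inner and copy_of_inner are placeholders, and the {{outer.inner}} template renders the value as a string, so this assumes the source field holds a plain string:

```
POST _ingest/pipeline/_simulate
{
  "docs": [
    { "_source": { "outer": { "inner": "X" } } },
    { "_source": { "something_else": "no outer object here" } }
  ],
  "pipeline": {
    "processors": [
      {
        "set": {
          "if": "ctx.outer?.inner != null",
          "field": "copy_of_inner",
          "value": "{{outer.inner}}"
        }
      }
    ]
  }
}
```

With this condition, only the first document should get copy_of_inner; the second has no outer object, so the null-safe access evaluates to null and the processor is skipped.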

I would generally not recommend using dots in field names, because of the ambiguity when specifying inner fields. I think scripting like this could work, though:

POST _ingest/pipeline/_simulate
{
  "docs": [ 
    { "_source" : { "outer": {"inner": "X"} } }, 
    { "_source" : { "outer": {"inner": "Y"} } }, 
    { "_source" : { "outer": {"inner": "Z"} } }, 
    { "_source" : { "outer": {"inner": "T"} } } 
    ],
  "pipeline": {
    "processors": [
      {
        "script": {
           "source": "ctx['i.am.sooooo.nested'] = true"
        }
      }
    ]
  }
}
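
Combined with the condition from the question, a sketch of the same idea could be the following. It relies on the script processor accepting the same "if" parameter as the set processor; the field name and value are the ones from the question:

```
POST _ingest/pipeline/_simulate
{
  "docs": [
    { "_source": { "outer": { "inner": "X" } } },
    { "_source": { "outer": { "other": "Y" } } }
  ],
  "pipeline": {
    "processors": [
      {
        "script": {
          "if": "ctx.outer?.inner != null",
          "source": "ctx['fake.nested.field'] = 'fake_nested_value'"
        }
      }
    ]
  }
}
```

Because the script writes to the ctx map directly, the key should be stored as a single top-level field named fake.nested.field rather than being expanded into nested objects.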

Thanks for the first tip!

Regarding fields with dots in the name, I always try not to use them, but sometimes they're added by external applications or by somebody who does something wrong, so I'd like to learn how to handle them.

I knew I could handle them via script but what if I wanted to remove that field rather than add it?
In that case I'd need a remove processor so I'd fall back on the previous syntax.

Or is there another way to remove a fake nested field (maybe via script as well)?

Thanks!

You can also remove fields in a script. ctx.remove("field.with.dots") should do the trick.
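
A simulate request to try this out might look like the sketch below. The sample document is made up for illustration, and it assumes the simulate endpoint keeps the dotted key as a literal top-level field in _source. Single quotes are used inside the script so the JSON string needs no escaping:

```
POST _ingest/pipeline/_simulate
{
  "docs": [
    { "_source": { "field.with.dots": "unwanted", "outer": { "inner": "X" } } }
  ],
  "pipeline": {
    "processors": [
      {
        "script": {
          "source": "ctx.remove('field.with.dots')"
        }
      }
    ]
  }
}
```

Since ctx behaves like a map here, calling remove with a key that is not present should simply do nothing, so no extra "if" guard is strictly needed.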