Access and match a subfield (nested field) in a Painless script with the reindex API

I need to reindex the field "source.ip" to change its mapping type from text to ip.
Some of the "source.ip" values are not valid IP addresses (for example "bob_IP"), so those values have to be moved to a separate field first.

I've tried:

POST _reindex
{
    "source": {
        "index": "p-aggtest-2020.02.29"
    },
    "dest": {
        "index": "p-aggtest-2020.02.29-delme"
    },
    "script": {
        "inline": "
             if ( !( ctx._source.source.ip.keyword =~ \/\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\/) ) { 
                   ctx._source.source.ip_name = ctx._source.remove(\"source.ip\")
             }"
    }
}

But the error is:

"error" : {
    "root_cause" : [
      {
        "type" : "script_exception",
        "reason" : "runtime error",
        "script_stack" : [
          "if ( !( ctx._source.source.ip.keyword =~ /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/) ) { ",
          "                          ^---- HERE"
        ],
        "script" : "if ( !( ctx._source.source.ip.keyword =~ /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/) ) { ctx._source.source.ip_name = ctx._source.remove(\"source.ip\") }",
        "lang" : "painless"
      }
    ]

I've tried different combinations of ctx._source.*, params._source* and doc['source'].ip, and read dozens of docs and examples, without success.

I would appreciate any ideas.

Thanks for the attention!

Best regards,

Serg

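It seems I got it. As far as I can tell, the value is stored in _source under the literal key "source.ip", so ctx._source.source is null and the script fails as soon as it tries to read .ip from it. The key has to be accessed with bracket notation instead. Also, .keyword is a sub-field defined in the mapping only; it does not exist in _source.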

POST _reindex
{
    "source": {
        "index": "p-aggtest-2020.02.29"
    },
    "dest": {
        "index": "p-aggtest-2020.02.29-delme"
    },
    "script": {
        "inline": "if ( !( ctx._source['source.ip'] != null && ctx._source['source.ip'] =~ \/\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\/) ) { ctx._source['source.ip_name'] = ctx._source.remove(\"source.ip\") }"
    }
}

But the dest index now contains fewer docs than the source index.

I added else { ctx.op = 'noop' } to the script and checked the dest index. Everything looks OK: the new field source.ip_name is created and the incorrect source.ip field is removed. But I can't understand how I lost the rest of the documents during reindexing.
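One thing worth checking in the script above: as far as I know, =~ in Painless is the find operator (==~ is the full match operator), so an unanchored pattern also accepts values that merely contain something IP-like, such as "bob_10.1.2.3". Below is a minimal bash sketch of the difference. The function names is_ipv4_like and contains_ipv4_like are hypothetical and not part of the reindex script, and the sample values are made up. The pattern only checks the shape of the value, so 999.999.999.999 would still pass.

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; not part of the reindex script.

# Anchored: succeeds only if the WHOLE value looks like a dotted quad.
is_ipv4_like() {
  local re='^[0-9]{1,3}(\.[0-9]{1,3}){3}$'
  [[ $1 =~ $re ]]
}

# Unanchored: succeeds if the value merely CONTAINS a dotted quad,
# which is comparable to a find-style regex check.
contains_ipv4_like() {
  local re='[0-9]{1,3}(\.[0-9]{1,3}){3}'
  [[ $1 =~ $re ]]
}

# Made-up sample values.
for value in "10.1.2.3" "bob_IP" "bob_10.1.2.3"; do
  if is_ipv4_like "$value"; then
    echo "$value: keep in source.ip"
  else
    echo "$value: move to source.ip_name"
  fi
done
```

In the Painless script, the equivalent would be to anchor the pattern with ^ and $ or to use ==~, assuming the values really have to be dotted quads from start to end.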

The cause was a mapping conflict in another field. I only saw that error when I ran the reindex from a bash script.
For those who get stuck on a similar problem, the full bash script is:

#!/bin/bash -x
# Index names are read from stdin, one per line.
while IFS= read -r index
do
  curl --noproxy '*' -k -u user:password -H 'Content-Type: application/json' -XPOST 'localhost:9200/_reindex?pretty' -d'{
    "source": {
      "index": "'$index'"
    },
    "dest": {
      "index": "'$index'-00001"
    },
    "script": {
       "lang": "painless",
       "inline": " if ( !( ctx._source[\"source.ip\"] != null && ctx._source[\"source.ip\"] =~ \/\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\/) ) { ctx._source[\"source.ip_name\"] = ctx._source.remove(\"source.ip\") } if ( !( ctx._source[\"destination.ip\"] != null && ctx._source[\"destination.ip\"] =~ \/\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\/) ) { ctx._source[\"destination.ip_name\"]  = ctx._source.remove(\"destination.ip\") } "
    }
  }'
done

Don't use the curl -k option unless you really need it, since it disables TLS certificate verification.
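Since the mapping conflict only showed up in the _reindex response, it may help to check that response for failures instead of reading it by eye. A minimal sketch, assuming the response contains a "failures" array that is empty on success. The function name reindex_ok is hypothetical, and the two sample responses are made up and heavily shortened; they are not real output from my cluster.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if the response text contains
# an empty "failures" array; anything else is treated as a problem.
reindex_ok() {
  grep -Eq '"failures"[[:space:]]*:[[:space:]]*\[[[:space:]]*]' <<< "$1"
}

# Made-up, shortened sample responses for illustration only.
ok_response='{ "total" : 3, "created" : 3, "failures" : [ ] }'
bad_response='{ "total" : 3, "created" : 1, "failures" : [ { "cause" : { "type" : "mapper_parsing_exception" } } ] }'

for response in "$ok_response" "$bad_response"; do
  if reindex_ok "$response"; then
    echo "no failures reported"
  else
    echo "failures reported, inspect the response" >&2
  fi
done
```

In the loop above, the curl output could be captured with response=$(curl ...) and passed to the helper, so that the script stops or logs the index name when a failure is reported.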

Good luck!
