Painless scripting - Using match_phrase along with must_not.exists

Hi,
I am currently using Painless scripting to parse a message field, extract the required information, and create a new field containing only that information. The issue I'm facing is that every time the request is run, the field is updated again in every matching document.
I would like to first check whether the field exists, and update the document only if it doesn't.

POST /index_name-DATE/doc/_update_by_query
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "loginID"
        }
      }
    },
    "match_phrase": {
      "m": "\"|1000|||1|1|0|\""
    }
  },
  "script": {
    "lang": "painless",
    "source": "ctx._source.loginID = /.*\\| LOGIN_ID=(\\w{0,50})\\|.*/.matcher(ctx._source.m).replaceAll('$1')"
  }
}

I am unable to use both conditions (must_not and match_phrase) together; the request fails with the following error:

{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "[match_phrase] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
        "line": 1,
        "col": 53
      }
    ],
    "type": "parsing_exception",
    "reason": "[match_phrase] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
    "line": 1,
    "col": 53
  },
  "status": 400
}

Is there a way to achieve this?

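A "query" clause accepts exactly one query, so you can't put bool and match_phrase side by side in it; that is why the parser complains about an unexpected field name. To combine several conditions, use a compound query such as a bool query.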

In this case, you could wrap your match_phrase in a filter clause of the bool query you already have. The following should work:

POST /index_name-DATE/doc/_update_by_query
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "loginID"
        }
      },
      "filter": {
        "match_phrase": {
          "m": "\"|1000|||1|1|0|\""
        }
      }
    }
  },
  "script": {
    "lang": "painless",
    "source": """ctx._source.loginID = /.*\| LOGIN_ID=(\w{0,50})\|.*/.matcher(ctx._source.m).replaceAll('$1')"""
  }
}
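
If you send the request from a script rather than from the Kibana console, note that the triple-quoted """...""" string is console syntax; in plain JSON the regex backslashes have to be doubled instead. Here is a minimal Python sketch of building the same body, assuming you only need the JSON (the helper name build_update_body is made up for illustration, and sending the request is left out):

```python
import json


def build_update_body(missing_field, phrase_field, phrase, script_source):
    """Build an _update_by_query body that holds a single bool query.

    must_not/exists skips documents that already have `missing_field`;
    filter/match_phrase restricts the update to matching messages.
    """
    return {
        "query": {
            "bool": {
                "must_not": {"exists": {"field": missing_field}},
                "filter": {"match_phrase": {phrase_field: phrase}},
            }
        },
        "script": {"lang": "painless", "source": script_source},
    }


# Values copied from the request above. The raw string keeps the regex
# backslashes single; json.dumps doubles them for plain JSON.
body = build_update_body(
    missing_field="loginID",
    phrase_field="m",
    phrase='"|1000|||1|1|0|"',
    script_source=r"ctx._source.loginID = /.*\| LOGIN_ID=(\w{0,50})\|.*/.matcher(ctx._source.m).replaceAll('$1')",
)

print(json.dumps(body, indent=2))
```

The point of the helper is only that "query" ends up with one top-level key (bool), with both conditions nested inside it.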

Hey,

You saved my day, @abdon! Thanks for the help :tiger:
