Skip the document using a script processor

My use case: before a document is indexed, I would like to skip documents that meet certain conditions, using a script processor defined in a pipeline. When I try to create an index using the pipeline with the script processor, a null-pointer exception is thrown. Any pointers would be of great help.

I created a pipeline with a script processor:

    PUT _ingest/pipeline/testindex
    {
      "description": "pipeline for filters",
      "processors": [
        {
          "script": {
            "on_failure": [
              {
                "fail": {
                  "message": "failed to parse"
                }
              }
            ],
            "source": "ctx.actor_id = ( (java.lang.Double.parseDouble(ctx.actor_id+'') < java.lang.Double.parseDouble(3+''))) ? ctx.actor_id:Exception()",
            "lang": "painless"
          }
        }
      ]
    }

I get the following error:

    {
      "error": {
        "root_cause": [
          {
            "type": "script_exception",
            "reason": "compile error",
            "processor_type": "script",
            "script_stack": [
              "...  ? ctx.actor_id:Exception()",
              "                             ^---- HERE"
            ],
            "script": "ctx.actor_id = ( (java.lang.Double.parseDouble(ctx.actor_id+'') < java.lang.Double.parseDouble(3+''))) ? ctx.actor_id:Exception()",
            "lang": "painless"
          }
        ],
        "type": "script_exception",
        "reason": "compile error",
        "processor_type": "script",
        "script_stack": [
          "...  ? ctx.actor_id:Exception()",
          "                             ^---- HERE"
        ],
        "script": "ctx.actor_id = ( (java.lang.Double.parseDouble(ctx.actor_id+'') < java.lang.Double.parseDouble(3+''))) ? ctx.actor_id:Exception()",
        "lang": "painless",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "invalid sequence of tokens near ['('].",
          "caused_by": {
            "type": "no_viable_alt_exception",
            "reason": null
          }
        }
      },
      "status": 400
    }

I am using Elasticsearch 7.4.2.
I tried the drop processor, but I am facing an issue with date parsing.

Hello @bindu

You definitely need to use the drop processor.
I can help but I would like to understand what is the condition to drop the event.

If I understood correctly, you want to keep events when actor_id is < 3.
Is the field actor_id a string or a number?
Do you have a sample document?
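
For reference, here is a minimal sketch of a drop processor for the actor_id case. It assumes the goal is to keep only documents with actor_id < 3, and that actor_id may arrive either as a number or as a numeric string. Neither assumption is confirmed at this point in the thread, so treat the condition as illustrative:

```json
# Assumption: documents with actor_id >= 3 should be dropped, and
# actor_id is either a number or a numeric string.
PUT _ingest/pipeline/testindex
{
  "description": "pipeline for filters",
  "processors": [
    {
      "drop": {
        "if": "ctx.actor_id != null && Double.parseDouble(ctx.actor_id.toString()) >= 3"
      }
    }
  ]
}
```

The null check keeps documents without an actor_id; whether those should be dropped instead depends on the use case.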

Thank you @Luca_Belluccini for the quick response.

I used the drop processor and was able to create the pipeline. I can prevent a document from being indexed based on conditions on properties of type String or number.
I could not do the same using a Date type property.

Below are the documents:

    {
      "_index": "testindex",
      "_type": "_doc",
      "_id": "tFfyynEB8HYU6M9N_PZR",
      "_version": 1,
      "_score": 1,
      "_source": {
        "last_update": "2020-03-20T18:16:03.000Z",
        "last_name": "GUINESS",
        "actor_id": 1,
        "addr": "addr",
        "first_name": "PENELOPE"
      }
    },
    {
      "_index": "testindex",
      "_type": "_doc",
      "_id": "tVfyynEB8HYU6M9N_PZR",
      "_version": 1,
      "_score": 1,
      "_source": {
        "last_update": "2020-03-28T18:16:08.000Z",
        "last_name": "WAHLBERG",
        "actor_id": 2,
        "addr": "hyd",
        "first_name": "NICK"
      }
    },
    {
      "_index": "testindex",
      "_type": "_doc",
      "_id": "tlfyynEB8HYU6M9N_PZR",
      "_version": 1,
      "_score": 1,
      "_source": {
        "last_update": "2020-03-08T18:16:13.000Z",
        "last_name": "CHASE",
        "actor_id": 3,
        "addr": "chn",
        "first_name": "ED"
      }
    }

I need to use last_update ("type": "date") to drop the documents that match the condition last_update <= 03/20/2020.

For that, I am using the following condition:

    PUT _ingest/pipeline/testindex
    {
      "description": "pipeline for filters",
      "processors": [
        {
          "drop": {
            "if": "(new java.text.SimpleDateFormat('MM/dd/yyyy').parse('03/20/2020')<=new java.text.SimpleDateFormat('yyyy-MM-dd').parse(ctx.last_update))"
          }
        }
      ]
    }

The pipeline is created successfully. After applying this pipeline to a reindex, I am getting the following error:

    {
      "index": "testindex",
      "type": "_doc",
      "id": "91eqyXEB8HYU6M9N0PEW",
      "cause": {
        "type": "exception",
        "reason": "java.lang.IllegalArgumentException: GeneralScriptException[Failed to compile inline script [(new java.text.SimpleDateFormat('MM/dd/yyyy').parse('03/20/2020')<=new java.text.SimpleDateFormat('yyyy-MM-dd').parse(ctx.last_update))] using lang [painless]]; nested: CircuitBreakingException[[script] Too many dynamic script compilations within, max: [75/5m]; please use indexed, or scripts with parameters instead; this limit can be changed by the [script.max_compilations_rate] setting];",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "GeneralScriptException[Failed to compile inline script [(new java.text.SimpleDateFormat('MM/dd/yyyy').parse('03/20/2020')<=new java.text.SimpleDateFormat('yyyy-MM-dd').parse(ctx.last_update))] using lang [painless]]; nested: CircuitBreakingException[[script] Too many dynamic script compilations within, max: [75/5m]; please use indexed, or scripts with parameters instead; this limit can be changed by the [script.max_compilations_rate] setting];",
          "caused_by": {
            "type": "general_script_exception",
            "reason": "Failed to compile inline script [(new java.text.SimpleDateFormat('MM/dd/yyyy').parse('03/20/2020')<=new java.text.SimpleDateFormat('yyyy-MM-dd').parse(ctx.last_update))] using lang [painless]",
            "caused_by": {
              "type": "circuit_breaking_exception",
              "reason": "[script] Too many dynamic script compilations within, max: [75/5m]; please use indexed, or scripts with parameters instead; this limit can be changed by the [script.max_compilations_rate] setting",
              "bytes_wanted": 0,
              "bytes_limit": 0,
              "durability": "TRANSIENT"
            }
          }
        },
        "header": {
          "processor_type": "conditional"
        }
      },
      "status": 500
    }
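
One way to approach the date condition is sketched below, as an untested suggestion rather than a confirmed fix. It avoids SimpleDateFormat and the `<=` operator on date objects: it parses the ISO-8601 string in last_update with ZonedDateTime.parse and compares with isBefore. The cutoff `2020-03-21T00:00:00Z` is an assumption: it reads "last_update <= 03/20/2020" as "any time up to the end of 20 March 2020, UTC". The request uses the simulate API with trimmed copies of the sample documents above, so the condition can be checked without reindexing:

```json
# Assumption: "<= 03/20/2020" means before 2020-03-21T00:00:00Z (UTC),
# and last_update always arrives as an ISO-8601 string with a zone/offset.
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "pipeline for filters",
    "processors": [
      {
        "drop": {
          "if": "ctx.last_update != null && ZonedDateTime.parse(ctx.last_update).isBefore(ZonedDateTime.parse('2020-03-21T00:00:00Z'))"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "actor_id": 1, "last_update": "2020-03-20T18:16:03.000Z" } },
    { "_source": { "actor_id": 2, "last_update": "2020-03-28T18:16:08.000Z" } },
    { "_source": { "actor_id": 3, "last_update": "2020-03-08T18:16:13.000Z" } }
  ]
}
```

With this cutoff, the documents for actor_id 1 and 3 satisfy the drop condition and the document for actor_id 2 does not. If the simulation behaves as expected, the same processors array can be stored with PUT _ingest/pipeline/testindex. As the error message notes, the compilation limit itself is governed by the script.max_compilations_rate setting, but raising it would not help if the underlying script fails to compile.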
