Scripted Field Error


(Walker) #1

I'm attempting to create a scripted field that grabs the value of a field, strips off the text that matches a regex, and then puts the modified value in a new field. I am using a modified example from the documentation because I don't know the language well enough to write it myself.

ctx._source['Full URL'] = /https:\/\/example\.com\/government\/elections\//.matcher(ctx._source['Full URL']).replaceAll('')

When I preview the script, it doesn't throw errors but I also don't get any preview results. When I create the scripted field, it appears to be successful and appears in the index pattern list. Unfortunately, after that the Discover section does not show any results over any time period and Kibana says 1 of 2 shards failed (I run a single node, single shard configuration).

If I attempt to create a visualization targeting the new field, I get the error below:

{
	"error": {
		"root_cause": [
			{
				"type": "script_exception",
				"reason": "runtime error",
				"script_stack": [
					"ctx._source['Full URL'] == /https:\\/\\/example\\.com\\/government\\/elections\\//i.matcher(ctx._source['Full URL']).replaceAll('')",
					" ^---- HERE"
				],
				"script": "ctx._source['Full URL'] == /https:\\/\\/example\\.com\\/government\\/elections\\//i.matcher(ctx._source['Full URL']).replaceAll('')",
				"lang": "painless"
			},
			{
				"type": "script_exception",
				"reason": "runtime error",
				"script_stack": [
					"ctx._source['Full URL'] == /https:\\/\\/example\\.com\\/government\\/elections\\//i.matcher(ctx._source['Full URL']).replaceAll('')",
					" ^---- HERE"
				],
				"script": "ctx._source['Full URL'] == /https:\\/\\/example\\.com\\/government\\/elections\\//i.matcher(ctx._source['Full URL']).replaceAll('')",
				"lang": "painless"
			}
		],
		"type": "search_phase_execution_exception",
		"reason": "all shards failed",
		"phase": "query",
		"grouped": true,
		"failed_shards": [
			{
				"shard": 0,
				"index": "cloudflare-2018.10.30",
				"node": "pPGsP09BSNeNANYO6A9KEQ",
				"reason": {
					"type": "script_exception",
					"reason": "runtime error",
					"script_stack": [
						"ctx._source['Full URL'] == /https:\\/\\/example\\.com\\/government\\/elections\\//i.matcher(ctx._source['Full URL']).replaceAll('')",
						" ^---- HERE"
					],
					"script": "ctx._source['Full URL'] == /https:\\/\\/example\\.com\\/government\\/elections\\//i.matcher(ctx._source['Full URL']).replaceAll('')",
					"lang": "painless",
					"caused_by": {
						"type": "null_pointer_exception",
						"reason": null
					}
				}
			},
			{
				"shard": 0,
				"index": "cloudflare-2018.10.31",
				"node": "pPGsP09BSNeNANYO6A9KEQ",
				"reason": {
					"type": "script_exception",
					"reason": "runtime error",
					"script_stack": [
						"ctx._source['Full URL'] == /https:\\/\\/example\\.com\\/government\\/elections\\//i.matcher(ctx._source['Full URL']).replaceAll('')",
						" ^---- HERE"
					],
					"script": "ctx._source['Full URL'] == /https:\\/\\/example\\.com\\/government\\/elections\\//i.matcher(ctx._source['Full URL']).replaceAll('')",
					"lang": "painless",
					"caused_by": {
						"type": "null_pointer_exception",
						"reason": null
					}
				}
			}
		]
	},
	"status": 500
}

(Nathan Reese) #2

I would recommend doing this on ingest. Scripted fields can be resource intensive.

If you are not able to update the ingest pipeline, you can do this without a regex by using a simple indexOf call instead. Painless syntax closely follows Java. More information about the available features can be found in the documentation: https://www.elastic.co/guide/en/elasticsearch/painless/6.4/painless-api-reference.html


(Walker) #3

I don't know Painless or Java, so I'm still looking for an answer. I can adjust the pipeline to do what I need, but I have past data that I need to adjust and I'm running up against a deadline that a reindex doesn't really allow for.


(Walker) #4

Found an example and tailored it, but I get the same behavior as with the regex statement I was trying earlier. Not sure what's wrong here; is it just syntax?

def path = ctx._source['Full URL'].value;
if (path == "http://example.com/elections/") {
    int lastSlashIndex = path.lastIndexOf('/');
    if (lastSlashIndex > 0) {
        return path.substring(lastSlashIndex+1);
    }
}
return ""

(Walker) #5

Also...what is the ctx in the front of the field referencing? I haven't found an explanation of what that is and maybe it should be something different based on my setup?


(Nathan Reese) #6

I cannot write the script for you. Please read the documentation and experiment with different ideas. If you run into problems, please ask questions in the forum.


(Walker) #7

.....I have two examples of things I've "written" that do not work. Instead of telling me where I'm wrong, or explaining what the error means, you're saying "Go learn a programming language". I'm not necessarily looking for someone to "do it for me", but for someone to show me where I'm wrong.

Thanks for the help :roll_eyes:


(Nathan Reese) #8

what is the ctx in the front of the field referencing

ctx is short for context and is the document that your script is run against.

Try building out the regex in a regex playground to get that part working first: https://regexr.com/
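
To check the pattern outside Kibana, here is a minimal sketch in plain Java (not a Painless script) that applies the same expression from the first post with `java.util.regex`. The class name `RegexStrip`, the method `strip`, and the sample URL are made up for illustration. In a Java string literal the forward slashes need no escaping, only the dot does.

```java
import java.util.regex.Pattern;

class RegexStrip {
    // Same expression as the regex in the first post, written as a Java string.
    static final Pattern PREFIX =
        Pattern.compile("https://example\\.com/government/elections/");

    // Returns the input with every match of PREFIX removed.
    // Input that does not contain the pattern comes back unchanged.
    static String strip(String url) {
        return PREFIX.matcher(url).replaceAll("");
    }

    public static void main(String[] args) {
        // Hypothetical sample value, not taken from the real index.
        String sample = "https://example.com/government/elections/results/2018";
        System.out.println(strip(sample)); // prints "results/2018"
    }
}
```

If this prints what you expect, the regex itself is fine and the problem lies in how the script reads the field or hands back its result.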

I still think it would be easier to not use a regex and just use indexOf to find where the substring starts and then use substring(resultsOfIndexOf) to remove the fixed string. Check out the Java String docs for details: https://docs.oracle.com/javase/7/docs/api/java/lang/String.html
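
A minimal sketch of the indexOf/substring idea in plain Java, assuming the fixed string is the one from the regex in the first post. The names `PrefixStrip` and `stripPrefix` and the sample URL are hypothetical. The offset passed to `substring` has to be the index returned by `indexOf` plus the length of the fixed string, otherwise the fixed string itself is kept.

```java
class PrefixStrip {
    // Fixed string to remove, taken from the regex in the first post.
    static final String PREFIX = "https://example.com/government/elections/";

    // Returns everything after the first occurrence of PREFIX,
    // or the input unchanged when PREFIX does not occur in it.
    static String stripPrefix(String url) {
        int start = url.indexOf(PREFIX);
        if (start < 0) {
            return url; // indexOf returns -1 when there is no match
        }
        // Skip past the match: start of match + length of the fixed string.
        return url.substring(start + PREFIX.length());
    }

    public static void main(String[] args) {
        // Hypothetical sample value.
        System.out.println(
            stripPrefix("https://example.com/government/elections/results/2018"));
        // prints "results/2018"
    }
}
```

The `start < 0` check matters: without it, a URL that lacks the prefix would be cut at the wrong position instead of being passed through.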


(Christian Dahlqvist) #9

I do not think you can modify the _source field in a scripted field, so I suspect you need to return the value from the script which will create the field you named in the UI. If you are looking to modify the source you will need to do so through a script with the reindex API.
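
To illustrate the difference between assigning to the document and returning a value, here is a rough sketch in plain Java. The document is modelled as a `Map`, which is an assumption made only for this example, and `ComputedField` and `shortUrl` are hypothetical names. It also guards against a missing field, since the error in the first post ends in a null_pointer_exception.

```java
import java.util.HashMap;
import java.util.Map;

class ComputedField {
    static final String PREFIX = "https://example.com/government/elections/";

    // Computes the value of the new field from the document and returns it.
    // The document itself is never modified.
    static String shortUrl(Map<String, Object> doc) {
        Object raw = doc.get("Full URL");
        if (!(raw instanceof String)) {
            return ""; // field missing or not a string: avoid a NullPointerException
        }
        String url = (String) raw;
        return url.startsWith(PREFIX) ? url.substring(PREFIX.length()) : url;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("Full URL", "https://example.com/government/elections/results");
        System.out.println(shortUrl(doc));         // prints "results"
        System.out.println(doc.get("Full URL"));   // original value is untouched
        System.out.println(shortUrl(new HashMap<>()).isEmpty()); // prints "true"
    }
}
```

A scripted field would follow the same shape: read the existing value, compute the new one, and return it as the result of the script rather than writing it back.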


(Walker) #10

I'm not trying to modify the _source field. I just assumed I was entering the path to the field, and it looks like all fields are under the _source field, at least when I look at the JSON for the event. I highlighted build as an example of what I believe is a subfield of _source; I'm not targeting it for any purpose.

(image: screenshot of the event JSON with the build field highlighted)


(Walker) #11

That's the site I use for regex since I'm not the greatest at building them. I'm not saying your suggestion is incorrect, it's just something I don't know how to do and frankly can't make heads or tails of most examples out there. I'll take a look at the documentation you linked here.

