Working on "values" in field to return result

I'm trying to build a search script that returns true or false based on a strange criterion. One of my fields is a JSON array of strings. For particular records, the first string represents an integer which is a bitfield. I need to 1) get the first element as a string, 2) convert it to an integer, and 3) compare it with a bit index by shifting or masking. Can I do all of this within a Painless script embedded in my query? The API documentation makes me think this is possible, but I can't find a single example that even gets me started.
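For reference, steps 2 and 3 can be sketched in plain Java, whose syntax Painless largely follows. This is only a sketch under assumptions: the class and method names (BitfieldCheck, isBitSet) are made up for illustration, bit 0 is taken to be the least significant bit, and whether each call is permitted inside a Painless script should be checked against the Painless API reference.

```java
final class BitfieldCheck {
    // Hypothetical helper: returns true when bit `bitIndex` (0 = least
    // significant) is set in the integer written in decimal in `decimal`.
    static boolean isBitSet(String decimal, int bitIndex) {
        int bits = Integer.parseInt(decimal.trim()); // step 2: string -> int
        return ((bits >> bitIndex) & 1) == 1;        // step 3: shift, then mask
    }

    public static void main(String[] args) {
        // 14765 is 0b11100110101101, so bit 0 is set and bit 1 is not.
        System.out.println(isBitSet("14765", 0)); // true
        System.out.println(isBitSet("14765", 1)); // false
    }
}
```

Inside a script, the body of isBitSet would presumably be written inline, with the bit index passed in through the script's params.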

The "compiler" chokes on the first line of my attempt:

def jsonArray = new JSONArray(doc['Tunings.Data'].value);

It doesn't like JSONArray, which I thought would be available. If I try something like:

def vals = new String[] { doc['Tunings.Data'].value };

This passes the compiler, but I can't figure out what I actually have captured in the vals variable. I can't figure out how to inspect it. Is it a string? Has it been interpreted as an array? Nothing I try to compare it to equates to true.

So, just as the first step, how would I get the first value of a JSON-array-formatted field like this?

"["14765"]"

Note that this evaluates to true, and returns documents:

def vals = doc['Tunings.Data'].value;
return vals == \"[\\\"3\\\"]\";

So I'm looking for something like this, but the exact syntax eludes me:

def vals = new String[] { doc['Tunings.Data'].value };
return vals[0] == \"3\";

I apparently don't have JSONArray... Maybe I can just gsub the brackets away, and deal with a single string (for my particular case)...

This works:

def val = doc['Tunings.Data'].value;
def str = val.replace('[', '').replace(']', '').replace('\"', '');
return str == '3';

So now I understand that Elasticsearch is doing no interpretation of double quotes as indicating strings. Is there any way to get Elasticsearch (within or without Painless) to understand a string field formatted as follows as an array of strings? Like:

["0"],["1"],["2"]

The data returned by .value is already a list, so you can access the first element via .get(0). You can use Integer.valueOf() to convert it to an int, but then you need to check whether the code you want to execute is allowed within Painless (it has a restrictive whitelisting and blacklisting mechanism). You might want to check the Painless docs to see whether you can call the methods you want, see https://www.elastic.co/guide/en/elasticsearch/painless/7.1/painless-api-reference.html

When I try to use doc.get('Tunings.Data').value or doc.get('Tunings.Data').get(0), my later code chokes with an error like:

"lang" : "painless",
"caused_by" : {
    "type" : "number_format_exception",
    "reason" : "For input string: \"[\"3\"]\""
}

So it doesn't seem like either one of those is interpreting my string as an array or list. I mean, I'm assuming that I've formatted my JSON string arrays correctly, with escaped double-quotes surrounding the values. An example line in one of my imported docs looks like this:

"Data": "[\"00\",\"01\",\"02\",\"03\",\"04\",\"05\"]",

What should that line look like that would allow Elastic to correctly interpret it as an array of strings, so that doc.get('Tunings.Data').get(0) gives me the string, "00"?

Providing a fully fledged example for others to reproduce would be super helpful here; otherwise everything else is just guesswork.

A full example would be tough. A typical document is 20 MB, and the index schema definition is hundreds of lines long.

Ultimately, my question is pretty simple: Given a keyword field which contains a JSON representation of an array of strings, how do I get the first element of that array with Painless?
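One way to approach this without a JSON parser is plain string slicing: take whatever sits between the first pair of double quotes. The sketch below is ordinary Java, not tested Painless; the names FirstElement and firstElement are hypothetical, and it assumes the elements never contain escaped quotes. It uses only String.indexOf and String.substring, which should be checked against the Painless API reference before relying on them in a script.

```java
final class FirstElement {
    // Hypothetical helper: returns the text between the first pair of double
    // quotes in `jsonArrayText`, or null when no quoted element is present.
    // Assumption: elements contain no escaped double quotes.
    static String firstElement(String jsonArrayText) {
        int open = jsonArrayText.indexOf("\"");
        if (open < 0) {
            return null; // no opening quote, e.g. "[]"
        }
        int close = jsonArrayText.indexOf("\"", open + 1);
        if (close < 0) {
            return null; // unbalanced quotes
        }
        return jsonArrayText.substring(open + 1, close);
    }

    public static void main(String[] args) {
        // The field content from the example document, as the script sees it.
        String data = "[\"00\",\"01\",\"02\",\"03\",\"04\",\"05\"]";
        System.out.println(firstElement(data)); // 00
    }
}
```

Unlike stripping every bracket and quote with replace, this also works when the array holds more than one element, because it stops at the first closing quote.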

The two functions I've tried just seem to return the whole field, and the JSON-based types and functions I would typically use for this in Java don't seem to be available in the API.
