Painlessly turning a string of multiple numbers into multiple fields of numbers

Hey all,

I'm ingesting JSON formatted logs from nginx. The JSON is all ECS style nested fields.

I've currently got a need to visualize the upstream response time, http.upstream.response.time. Fundamentally this is a float/double (0.166, 10.0, etc). In some simple tests I succeeded with the following.

PUT logs-my-datastream/_mapping
{
  "runtime": {
    "http.upstream.response.time_int": {
      "type": "double",
      "script": {
        "source": """
        if (doc.containsKey('http.upstream.response.time')) {
          if (doc['http.upstream.response.time'].size() != 0 && doc['http.upstream.response.time'].value.length() > 0) {
            emit(Double.parseDouble(doc['http.upstream.response.time'].value));
          }
        }
        """
      }
    }
  }
}

The problem I've got is that http.upstream.response.time is quite literally a string; not just a text representation of a number, but an actual arbitrary string.
It is expected behavior for nginx to try more than one upstream; in such cases it returns a comma-separated list of the time taken for each upstream tried, e.g. "10.000, 0.166" (where 10.000 is the timeout communicating with the first upstream).

My only sensible idea here for a meaningful value is to convert each number and then add them all up, as that is the total time the client sees, but I'm struggling to understand how to do that. My best attempt so far results in:

class_cast_exception: Cannot cast from [java.lang.Double] to [double].
if (doc.containsKey('http.upstream.response.time')) {
  if (doc['http.upstream.response.time'].size() != 0 && doc['http.upstream.response.time'].value.length() > 0) {
    String[] times = doc['http.upstream.response.time'].value.split(", ");
    def total=0.0;
    for(Double time:times){
      total = total + (double)time;
    }
    emit(total)
  }
}
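For what it's worth, the same loop can be tried in plain Java (which Painless is based on). `split()` returns a `String[]`, so the loop variable holds strings that must be parsed, not cast; this is a sketch of that idea, not the exact Painless semantics:

```java
// Sketch: summing a comma-separated list of durations in plain Java.
// split() yields String[], so each token is parsed with Double.parseDouble
// rather than cast, which is what trips up the Painless version.
public class SumTimes {
    static double sumTimes(String value) {
        String[] times = value.split(", ");
        double total = 0.0;
        for (String time : times) {
            total += Double.parseDouble(time); // parse the String token
        }
        return total;
    }
}
```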

I am in no way a Java developer at all; today is literally the first day I have written any, so please don't be too horrified at my potentially horrible Java.
I've tried various combinations of Double, double, (double), String, and nothing; the best I can do is reverse the order of [java.lang.Double] and [double] in the cast exception.

If someone has a better idea for doing this I'd love to hear it; if not, I'd be super appreciative of help fixing the Java/Painless.

Thanks
Mike

Hi,
here is something you can do. I do not guarantee that it is painlessly readable, but it should do the trick:

GET logtest/_search
{
  "runtime_mappings": {
    "yournewfield": {
      "type": "double",
      "script": {
        "source": """
        if (doc.containsKey('http.upstream.response.time')) {
          if (doc['http.upstream.response.time.keyword'].size() != 0 && doc['http.upstream.response.time.keyword'].value.length() > 0) {
            def res = Arrays.asList(doc['http.upstream.response.time.keyword'].value.splitOnToken(",")).stream()
                  .map(String::trim)
                  .mapToDouble(x -> Double.parseDouble(x))
                  .sum();
            emit(res);
          }
        }
        """
      }
    }
  },
  "fields": [
    "yournewfield"
  ]
}

Reading your code again, you were quite close; you were just missing an explicit cast.

Thanks Vincent.

I have to say I don't understand your code at all, but with some tiny modifications the below worked great.

if (doc.containsKey('http.upstream.response.time')) {
  if (doc['http.upstream.response.time'].size() != 0 && doc['http.upstream.response.time'].value.length() > 0) {
    def res = Arrays.asList(doc['http.upstream.response.time'].value.splitOnToken(",")).stream()
      .map(String::trim)
      .mapToDouble(x -> Double.parseDouble(x))
      .sum();
    emit(res);
  }
}

Using .keyword just doesn't work with this data set; it always fails somewhere. http.upstream.response.time is mapped as a keyword, but can contain numbers as a string, a single "-", or an empty string; maybe that throws off Elasticsearch? Whatever, the above works. Thanks.
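Since the field can hold a "-" placeholder or an empty string, `Double.parseDouble` would throw on those tokens. One way to guard against that is to skip them before parsing; this is an illustrative Java sketch (the helper name and skip behavior are assumptions, not from the thread):

```java
// Sketch: tolerant summing that skips nginx's "-" placeholder and blanks,
// which would otherwise make Double.parseDouble throw.
public class SafeSum {
    static double safeSum(String value) {
        double total = 0.0;
        for (String token : value.split(",")) {
            String t = token.trim();
            if (t.isEmpty() || t.equals("-")) {
                continue; // nginx emits "-" when no upstream time was recorded
            }
            total += Double.parseDouble(t);
        }
        return total;
    }
}
```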

You mentioned I was quite close, just missing an explicit cast. Out of curiosity, what cast was I missing?

Cheers
Mike

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.