Sum / Aggregations - Kibana/Logstash

I am working on a project where log data is used to measure each user's daily performance. The users' work is converting physical documents to digital ones.

I have successfully loaded the data into Elasticsearch via Logstash, but I am facing a small problem.
To measure a user's performance there are 3 main fields we're targeting: pageCount, docCount and imageCount (pages, docs and images scanned by a user for a given job).

The logs record the doc count and the page count as one field each. However, the image count is broken down into 6 fields: colorFront, colorRear, blackandwhiteFront, blackandwhiteRear, grayscaleFront and grayscaleRear (it just categorizes the scanned images by their physical state, so if an image is color, it'll be logged under the color front and rear).

What I am trying to do is create a new field, ImageCount, which adds the values of all 6 sub-fields into one.

I have looked into scripted fields to tackle this, but with no success. How can I add a new field ImageCount which adds all 6 fields for a given record in the log?

For Reference:
I tried the method from "Adding a new column to a data table from a calculation based on two different columns in the same row" in a scripted field, but when I do this, all other fields disappear in the Discover tab.

Can you show exactly what you are doing? Summing up your 6 fields into one using a scripted field sounds like the right approach and it shouldn't cause other fields to disappear.

A common error is not accounting for missing values, which can cause exceptions when querying data.

Please copy/paste the mapping of your index and the code of your scripted field.

Also, which version of the stack are you running?

The count of these 3 variables (page, image, doc) can be used to measure an operator's performance.

    {
      "kcpvx-2020.09.28" : {
        "mappings" : {
          "properties" : {
            "@timestamp" : {
              "type" : "date"
            },
            "@version" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "batchName" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "blackwhiteFrontImgs" : {
              "type" : "long"
            },
            "blackwhiteRearImgs" : {
              "type" : "long"
            },
            "colorFrontImgs" : {
              "type" : "long"
            },
            "colorRearImgs" : {
              "type" : "long"
            },
            "deletedImgs" : {
              "type" : "long"
            },
            "docCounts" : {
              "type" : "long"
            },
            "documentID" : {
              "type" : "long"
            },
            "grayScaleFrontImgs" : {
              "type" : "long"
            },
            "grayScaleRearImg" : {
              "type" : "long"
            },
            "host" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "imageID" : {
              "type" : "long"
            },
            "jobName" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "jobRootDirectory" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "logtimestamp" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "message" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "operationID" : {
              "type" : "long"
            },
            "operationMsg" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "operator" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "pageCounts" : {
              "type" : "long"
            },
            "pageID" : {
              "type" : "long"
            },
            "path" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "workStationName" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        }
      }
    }

I used the following script based on the reference I have posted.

    def imageCount = doc['blackwhiteFrontImgs'].value + doc['blackwhiteRearImgs'].value + doc['colorFrontImgs'].value + doc['colorRearImgs'].value + doc['grayScaleFrontImgs'].value + doc['grayScaleRearImgs'].value;
    return imageCount;

Also I am using v7.9.2 of the stack

In your script you are accessing grayScaleRearImgs, but in the mapping it's called grayScaleRearImg - is this a typo?

If fixing that doesn't help:

Can you explain what exactly is happening (preferably with screenshots)? What do you see without the scripted field? What do you see with the scripted field? What do you expect to see?

In your script you are not checking whether the fields you are accessing actually have values - if there is just a single document with one of these fields missing, your search will fail. To prevent it, make sure in your script you are only accessing values that contain data: if (doc['grayScaleRearImgs'].size() != 0) { ...
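As a minimal sketch of that guard (an alternative that I am assuming suits a numeric scripted field; it has not been run against your index), the Painless script below loops over the six field names from your mapping and adds only the values that are present, so a missing field counts as 0 instead of failing the search:

    // Sketch only: field names are taken from the mapping posted above.
    // A missing or empty field contributes 0 to the total.
    long total = 0;
    for (String f : ['colorFrontImgs', 'colorRearImgs', 'blackwhiteFrontImgs', 'blackwhiteRearImgs', 'grayScaleFrontImgs', 'grayScaleRearImg']) {
        if (doc.containsKey(f) && doc[f].size() != 0) {
            total += doc[f].value;
        }
    }
    return total;

Whether a missing counter should count as 0 or should leave the whole field empty is a design choice; treating it as 0 keeps every document usable in sum aggregations.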

I am sorry about the typo, I must have made that when I was making the post. I didn't save the script in my index yet

So without the scripted field I see all the operations that are carried out by the users when indexing documents. What I am trying to do is, as I can see a doc or page count for each operation, I want to be able to aggregate the image count by combining the 6 fields into one. I am afraid I can't share a lot of details but the following is a log example so you can see at the end of a batch it records how many documents, pages and images were scanned

2020-09-30 09:36:59;8195;Close batch;***;Capture_1;****;***;***;***;***;***;ColorFrontImgs=0,ColorRearImgs=0,BlackwhiteFrontImgs=17,BlackwhiteRearImgs=2,GrayScaleFrontImgs=0,GrayScaleRearImg=0,DocCounts=1,PageCounts=17,DeletedImgs=0

Also, I will try that and see if it works.
Note: the variable names are slightly different because they're changed during Logstash processing.

OK, so I got the scripted field working perfectly. After creating the field, I verified it by searching imagecount>0, and the results are correct.

I used the following code, which adds imageCount:

    if (!doc.containsKey('colorFrontImgs') || doc['colorFrontImgs'].empty
        || !doc.containsKey('colorRearImgs') || doc['colorRearImgs'].empty
        || !doc.containsKey('blackwhiteFrontImgs') || doc['blackwhiteFrontImgs'].empty
        || !doc.containsKey('blackwhiteRearImgs') || doc['blackwhiteRearImgs'].empty
        || !doc.containsKey('grayScaleFrontImgs') || doc['grayScaleFrontImgs'].empty
        || !doc.containsKey('grayScaleRearImg') || doc['grayScaleRearImg'].empty) {
        return ""
    }
    else {
        return doc['colorFrontImgs'].value + doc['colorRearImgs'].value + doc['blackwhiteFrontImgs'].value + doc['blackwhiteRearImgs'].value + doc['grayScaleFrontImgs'].value + doc['grayScaleRearImg'].value;
    }

Glad to hear it worked out! Next time you edit a scripted field, you can click the "Get help" link above the save button and switch to the "Preview results" tab to test-run your script before saving.
