Kibana - Query difficulties in Canvas

Hello,

Right now I'm facing an issue creating a vertical bar element. The values I'm looking to use are quite simple to obtain through Dev Tools by making a query, however Canvas doesn't allow to query that way, which I find quite odd to be honest.

So I have data in an index that is a percentage of time of a process per day (process time / total time in day). My goal is to make a query in any of the forms available for Canvas that allows me to group the data by process, sum all the values from that percentage data field for each day and finally giving me the average percentage of the day taken grouped by process, having 25 processes, my goal is for the query to only present a table with the 25 processes and respective averages. My current query can do this but not quite, since I can't find a way to make the average of the time per process in the query, only when selecting the type of data in the Display->Measure. This presents a seemingly correct result, but it isn't because the query is only returning the max 65535 document count and I have many more documents for this metric. The value is different from the one using ElasticSearch query, which grabs all documents.

Is there any suggestion to make this work? I'd really appreciate it!

Kind regards,

Luís

@LeeDr can we get some inputs here please?

Thanks,
Bhavya

I'm not sure it can do your query, but Canvas does have a data source called Elasticsearch documents which has a field for query. Is that what you're trying?

The other 2 ways are Elasticsearch SQL and Timelion queries.

Hello,

Thank you for your input. Elasticsearch documents doesn't allow aggregations which are necessary for what I'm looking for, which is basically an aggregation by process, followed by an aggregation by date_histogram with a 1 day interval, followed by a sum of all values from a specific field corresponding to each day and each process. At the end I want to do the average of those sums from all the days queried, per process. Perhaps it will be clearer by presenting the query I have in JSON format:

{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ProcessName": {
      "terms": {
        "field": "processname.keyword",
        "order": {
          "_key": "asc"
        },
        "min_doc_count": 1,
        "size": 100
      },
      "aggs": {
        "date_hist": {
          "date_histogram": {
            "field": "@timestamp",
            "calendar_interval": "day",
            "time_zone": "+01:00",
            "min_doc_count": 0
          },
          "aggs": {
            "daytimepercentage": {
              "sum": {
                "field": "orchestrator.job.daytimetakenpercentage"
              }
            }
          }
        },
        "avg_daytimepercentage": {
          "avg_bucket": {
            "buckets_path": "date_hist>daytimepercentage"
          }
        }
      }
    },
    "avg_all": {
      "avg_bucket": {
        "buckets_path": "ProcessName>avg_daytimepercentage"
      }
    }
  }
}

It could be good to have the option to create elements in canvas which gather data through Advanced JSON as I've seen in the Machine Learning module for creating the jobs if I'm not mistaken.

I'm still asking a few folks on options, but meanwhile, have you looked at Vega? It's a type of visualization in Kibana that provides tremendous flexibility. Of course that also comes with a learning curve. There are sample Vega visualizations in the sample data sets.

Hello LeeDr,

I appreciate your help!

Yes, I've looked at Vega and from what I've used in other situations I think it would work, however my goal would be for the chart to be part of a Canvas along with other metrics and for it to be changed according to Time Filter and Dropdown filter.

I should note that I'm working with Kibana v7.9.2.

Btw this is the query I have so far that gives me percentage of time per process per day:

SELECT  processname.keyword, DATETIME_FORMAT("orchestrator.job.realstarttime",'yyyy-MM-dd') as Date, SUM("orchestrator.job.daytimetakenpercentage") as PercTaken FROM "uipath*" GROUP BY "processname.keyword", DATETIME_FORMAT("orchestrator.job.realstarttime",'yyyy-MM-dd')

Only thing left is to get the average of the percentages by processname.keyword, but I can't figure out how since subqueries are limited. This would be the full query I had in mind that doesn't work:

SELECT processname.keyword, AVG(PercTaken) FROM(SELECT  DATETIME_FORMAT("@timestamp",'yyyy-MM-dd') AS Date, processname.keyword, ISNULL(SUM("orchestrator.job.daytimetakenpercentage"),0)/100 as PercTaken FROM "uipath*" GROUP BY DATETIME_FORMAT("@timestamp",'yyyy-MM-dd'), processname.keyword) GROUP BY processname.keyword

@Luis_P

I think you can accomplish this by doing some post processing in Canvas using the ply function.

Here's a simple example using the demodata function. ply splits it into datatables by username. staticColumn adds a column that has the rowCount for that table in it. Then we return the sum of the price divided by that rowCount.

filters
| demodata
| ply by="username" 
expression={ staticColumn name="count" value={rowCount} | math expression="divide(sum(price), mean(count))"}

Hopefully you can adapt something similar that will work for what you are trying to accomplish

Hi @corey.robertson

Thank you for your input!

The issue with doing some post processing in Canvas is that the query already has the 65535 rows limit, while I significantly more with my currently ESSQL query:

SELECT  processname.keyword, DATETIME_FORMAT("orchestrator.job.realstarttime",'yyyy-MM-dd') as Date, SUM("orchestrator.job.daytimetakenpercentage") as PercTaken FROM "uipath*" GROUP BY "processname.keyword", DATETIME_FORMAT("orchestrator.job.realstarttime",'yyyy-MM-dd')

This query together with selecting "Average" in the "Display" tab in the chart for the appropriate y-axis values (PercTaken in the query above), already produces the desired graph, but without the full time range, because the query returns a table limited by the 65535 rows. From what I read this is a limitation when querying to Canvas through ESSQL, so your method using ply would be applied after this limitation, not before, and in this way still having the limitation of row count.

@corey.robertson @LeeDr No updates regarding this issue? I've notice that canvas has some objects called esaggs, aggBucketAvg, aggDateHistogram, but I don't find any documentation or discussion on how to use it, I saw them as potential solutions to my issue, doing it all by hand on the Expression editor

Yeah, those functions are currently used internally by some other solutions, so they won't quite work in Canvas yet.

Would you be able to create the visualization you are looking for in Lens or another Visualization and then embed it into the Canvas Workpad? (Or was that functionality not yet available in 7.9.2?)

I can but will they be dependent of time and dropdown filter or is there a way to do it? Because from what I can tell it has an independent time filter which is not ideal...