How to build a chart based on an advanced query?


(Rodolphe Aoustin) #1

Hi,

Is it possible to build a chart based on any query, assuming it is valid and produces output when you test it in the Kibana Dev Tools tab?

I'm not talking about a basic query (a filter); I'm talking about a complex query that includes aggregations and calculates ratios between buckets.

It's not something I can do by configuring a chart, or even with Timelion (e.g. Timelion does not support .divide with a split); the UI just doesn't allow this setting.
I thought it would be easy for Kibana to build a chart from a copy/paste of a query, but I didn't see any "query" input field (except the filter attribute, which is limited to basic queries; aggregations are not supported).

If this is not possible out of the box, is it on a roadmap, or is there some complexity that prevents it? Is there a plugin?

thanks,
Rod
_______ UPDATE _________
I added a full working sample and the ES query in a message below, to make it easier to understand the query I would like to visualize as a chart.


(Mark Walkom) #2

If you can do things with scripted fields you might be able to do some of this. There is also the advanced JSON input you can add to each visualisation.

Otherwise Kibana is built on the use of the aggregations you can select when building visualisations.

What sort of thing are you looking to create exactly?


(Rodolphe Aoustin) #3

Can I add aggs/buckets to the JSON input field, or only basic things like "script": "doc['grade'].value * 1.2"?

Well, I'm not sure I will be able to explain my need, but I'll give it a try.
Let's say I do collect logs from my applications.
All applications have a version number (v1, v2, etc)
I want to track how many times my application crashes, and I want to see that per version.
Well, this is basic: I just filter my events (the APP_CRASH event) and aggregate per version number.

But now I have another requirement: I want to make sure that the number of crashes decreases with newer versions, which would mean the quality of my deliveries is improving.

I can't rely on the number of crashes alone: maybe v3 has only 200 crashes while v2 has 400 (so v3 seems to crash less), but maybe I have only 1,000 deployments of v3 and 200,000 deployments of v2. In that case v2 is actually much better!
So I need to calculate a ratio (number of crashes per version / number of deployments per version), and I need it per day.
Based on my example, I need to see the following per day and per version:
v2: 400/200000
v3: 200/1000
Note: of course the number of deployments can change on a daily basis.
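The metric described above is just crashes divided by deployments per version. As a quick sketch, here it is in a few lines of Python, using the example figures from this post (not real data):

```python
# Crashes per deployment, per version, using the example numbers above.
crashes = {"v2": 400, "v3": 200}
deployments = {"v2": 200_000, "v3": 1_000}

ratios = {v: crashes[v] / deployments[v] for v in crashes}
for version, ratio in sorted(ratios.items()):
    print(f"{version}: {ratio:.2%} crash rate")
```

This confirms the point in the post: v3 has fewer crashes in absolute terms, but a much higher crash rate than v2.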

One of the issues was knowing how many deployments per day we have. So we updated the application so that every instance logs one entry per day (e.g. 'APP_VERSION,v1'). Counting these events per version and per day gives us that information.

We spent days and days figuring out how to visualize this. We built the ES query (with aggs, sub-aggs and buckets), but there is no way to do it with any chart, nor with Timelion, nor with any other Kibana plugin we tested.
With Timelion we can do it, but only for one version at a time: you can't do a split and then a divide if the split has multiple groups. So we would need to build the same diagram every time there is a new version.
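For reference, the one-version-at-a-time Timelion workaround would look roughly like this (a sketch using the field names from the sample data in this thread; it would have to be duplicated for each version):

```
.es(q='EventCode:APP_CRASH AND Version:v1').divide(.es(q='EventCode:APP_VERSION AND Version:v1')).multiply(100).label('v1 crash %')
```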

On the other hand, what I described above is a simplified view of my world. I'm tracking the version number, but I'm also tracking other attributes, and they grow much faster than the version number, so building a report for each attribute is not a viable solution.

Tomorrow I will build data based on the fake example I described, along with an ES query, so it will be easier to explain my problem and for someone to help.

thanks,
Rod


(Romain Xie) #4

Yeah, I have the same problem in my job.
We collect network traffic data from our machines, and we want to display a table listing in/out traffic (not bytes) and other information for each machine, ordered by traffic.
In the Data Table, we cannot calculate the traffic. In the Top N, we cannot show the other information. In the Markdown, we cannot order the data.
Any ideas?

Tomorrow I will try editing the Saved Objects directly, if there is no easier way.


(Rodolphe Aoustin) #5

OK, I made some very basic data so you can understand better.
Here is the data that I imported with Filebeat (date, EventCode, Version):

2017/10/03 07:07:02,APP_VERSION,v1
2017/10/03 07:08:02,APP_VERSION,v1
2017/10/03 07:11:02,APP_VERSION,v2
2017/10/03 07:13:02,APP_CRASH,v1
2017/10/03 07:15:02,APP_CRASH,v2

2017/10/04 07:00:02,APP_VERSION,v1
2017/10/04 07:01:02,APP_VERSION,v1
2017/10/04 07:02:02,APP_VERSION,v2
2017/10/04 07:03:02,APP_VERSION,v2
2017/10/04 07:04:02,APP_CRASH,v1
2017/10/04 07:05:02,APP_CRASH,v1
2017/10/04 07:06:02,APP_CRASH,v2

Two days of data:
on the 3rd: 2 instances of v1 with 1 crash (so 50%), 1 instance of v2 with 1 crash (so 100%);
on the 4th: 2 instances of v1 with 2 crashes (so 100%), 2 instances of v2 with 1 crash (so 50%).
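As a sanity check, those percentages can be recomputed from the sample lines with a short Python sketch (plain dict counting, no Elasticsearch involved):

```python
from collections import Counter

# The sample log lines from this post: "date time,EventCode,Version"
lines = """\
2017/10/03 07:07:02,APP_VERSION,v1
2017/10/03 07:08:02,APP_VERSION,v1
2017/10/03 07:11:02,APP_VERSION,v2
2017/10/03 07:13:02,APP_CRASH,v1
2017/10/03 07:15:02,APP_CRASH,v2
2017/10/04 07:00:02,APP_VERSION,v1
2017/10/04 07:01:02,APP_VERSION,v1
2017/10/04 07:02:02,APP_VERSION,v2
2017/10/04 07:03:02,APP_VERSION,v2
2017/10/04 07:04:02,APP_CRASH,v1
2017/10/04 07:05:02,APP_CRASH,v1
2017/10/04 07:06:02,APP_CRASH,v2
""".splitlines()

counts = Counter()
for line in lines:
    timestamp, event, version = line.split(",")
    day = timestamp.split(" ")[0]
    counts[(day, version, event)] += 1

# percentage = crashes / deployments * 100, per day and per version
percentages = {
    (day, version): 100 * counts[(day, version, "APP_CRASH")]
                        / counts[(day, version, "APP_VERSION")]
    for (day, version, _) in counts
}
```

The result matches the figures above: 50% / 100% on the 3rd, 100% / 50% on the 4th, which is exactly what the bucket_script in the ES query below computes.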

This is the ES query that works:

POST filebeat-*/_search
{
  "size":0,
  "aggs": {
    "per_day": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "day"
      },
      "aggs": {
        "per_Version": {
          "terms": {
            "field": "Version.keyword"
          },
          "aggs": {
            "crash": {
               "filter": {
                "term": {
                  "EventCode.keyword": "APP_CRASH"
                }
               },
			         "aggs": {
                "nb_crash": {
                  "value_count": {
                    "field": "EventCode.keyword"
                  }
                 }
               }
            },
            "app_deployed": {
              "filter": {
                "term": {
                  "EventCode.keyword": "APP_VERSION"
                }
              },
		      	  "aggs": {
                "nb_app": {
                  "value_count": {
                    "field": "EventCode.keyword"
                  }
                }
              }
            },
            "percentage": {
              "bucket_script": {
                "buckets_path": {
                  "b_crash": "crash>nb_crash",
                  "b_app_deployed": "app_deployed>nb_app"
                },
                "script": "(params.b_crash / params.b_app_deployed) * 100"
              }
            }
          }
        }
      }
    }
  }
}

and the output that I would like to see as a chart:

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 10,
    "successful": 10,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 12,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "per_day": {
      "buckets": [
        {
          "key_as_string": "2017-10-03T00:00:00.000Z",
          "key": 1506988800000,
          "doc_count": 5,
          "per_Version": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "v1",
                "doc_count": 3,
                "app_deployed": {
                  "doc_count": 2,
                  "nb_app": {
                    "value": 2
                  }
                },
                "crash": {
                  "doc_count": 1,
                  "nb_crash": {
                    "value": 1
                  }
                },
                "percentage": {
                  "value": 50
                }
              },
              {
                "key": "v2",
                "doc_count": 2,
                "app_deployed": {
                  "doc_count": 1,
                  "nb_app": {
                    "value": 1
                  }
                },
                "crash": {
                  "doc_count": 1,
                  "nb_crash": {
                    "value": 1
                  }
                },
                "percentage": {
                  "value": 100
                }
              }
            ]
          }
        },
        {
          "key_as_string": "2017-10-04T00:00:00.000Z",
          "key": 1507075200000,
          "doc_count": 7,
          "per_Version": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "v1",
                "doc_count": 4,
                "app_deployed": {
                  "doc_count": 2,
                  "nb_app": {
                    "value": 2
                  }
                },
                "crash": {
                  "doc_count": 2,
                  "nb_crash": {
                    "value": 2
                  }
                },
                "percentage": {
                  "value": 100
                }
              },
              {
                "key": "v2",
                "doc_count": 3,
                "app_deployed": {
                  "doc_count": 2,
                  "nb_app": {
                    "value": 2
                  }
                },
                "crash": {
                  "doc_count": 1,
                  "nb_crash": {
                    "value": 1
                  }
                },
                "percentage": {
                  "value": 50
                }
              }
            ]
          }
        }
      ]
    }
  }
}

and the Logstash config:

filter {
  grok {
    match => { "message" => "(?<Time>[^,]*),(?<EventCode>[^,]*),(?<Version>[^,]*)" }
  }
  date {
    locale => "fr"
    match => ["Time", "YYYY/MM/dd HH:mm:ss"]
    timezone => "Europe/Vienna"
    target => "@timestamp"
    add_field => { "debug" => "timestampMatched" }
  }
}
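That grok pattern can be checked offline with an equivalent Python regex (a hypothetical helper, not part of the Logstash pipeline; the capture groups mirror Time, EventCode and Version above):

```python
import re

# Same three capture groups as the grok pattern above
GROK = re.compile(r"(?P<Time>[^,]*),(?P<EventCode>[^,]*),(?P<Version>[^,]*)")

m = GROK.match("2017/10/03 07:13:02,APP_CRASH,v1")
fields = m.groupdict()
print(fields)
```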

thanks for your help,
Rod


(Yohann Nicolas) #6

Hi @warkolm,

Any additional insight on the issue discussed in this thread?
I am struggling with exactly the same problem.

Thanks
Yohann


(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.