Show column as percentage of another column

I am trying to build a visualization with an aggregate percentage, and I can't find a way to show the percentages.
Each document has three fields that are relevant to my visualization:
day : date
number_of_tokens : int
tokens_matched: list of strings

I'd like to have a table with one row per day, the sum of number_of_tokens for all documents on that day, and the percentage of tokens_matched out of the sum of number_of_tokens.

For example, for these 2 documents:
{day = 2017-12-25, number_of_tokens = 10, tokens_matched = ["a", "big", "cat"]}
{day = 2017-12-25, number_of_tokens = 5, tokens_matched = ["nice", "dog"]}

The table would look like:
2017-12-25, 15, 33.3%

I can't figure out how to get the percentage column, I can get the actual number (in this case 5) by summing the size of the lists, but I can't get it as a percentage of the total column.

Any help would be greatly appreciated!
Thanks :slight_smile:

In the example result you gave, there's a column calculated as 33.3% and it looks like that comes from:

  • sum of tokens: 15 (10 + 5)
  • number of tokens matched: 5 (3 + 2)
  • percentage: 33.3 (5 out of 15)

In order to get the inputs for that calculation, I think your index needs a field for number_of_tokens_matched which will be an integer, because you won't be able to just use the length of the tokens_matched field, as far as I know.

I just wanted to give you that advice, but I'll keep exploring this question a bit to see what you can do once you have a "numerator" field for the percentage.

Hi again,

Unfortunately it looks like this might not be feasible to do as a table with the tools available in Kibana, but I think you would be able to achieve it as a line chart in either Time Series Visual Builder or Timelion.

In TSVB, you'll set up the 2 metrics to do the ratio on, and then use a Bucket Script aggregation:

For Timelion, you can refer to this discussion: Basically you'll want to use an expression like:

.es(*, timefield="day", metric=sum:number_of_tokens_matched).divide(.es(*, timefield="day", metric=sum:number_of_tokens))

Hi Tim,
Thanks for the quick response! Both options worked well, so I will go with one of them.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.