How split function works in Timelion chart

ElasticQuestion_1234 · March 18, 2021, 11:01pm

I am trying to understand how the split function in Timelion chart works.

.es(....,split=rows:256,...)

For example, if I have 10K rows and I split by 10K, Kibana times out right away. When I split by 256 or fewer, it works fine. I am OK with not viewing every point.

So my question is: when the split size is smaller than the total number of documents, does each point on the Timelion chart reflect an average of a group of actual points (10K/256 per group)? Or are the 256 points just random points picked from the 10K total rows? Thanks.

markov00 · March 22, 2021, 9:09am

Hi, the split= operation is a terms aggregation: you define the field to split on and the size, i.e. how many buckets (terms) should be returned.
In your case, it looks like you want to split by a field called rows and get the top 256 terms of that field. For each term, Timelion will create a separate time series, which is probably not what you are looking for.
If you don't specify the split parameter, Timelion aggregates your 10K points according to your time range and date histogram interval.
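
To make this concrete, here is a minimal Python sketch of roughly what a terms split followed by a date histogram does. It is a simulation under stated assumptions, not Timelion's or Elasticsearch's actual code: the function name `split_like_terms_agg`, the `value` field, the sample documents, and the numeric time values are all made up for illustration, and the tie-break by ascending term is an assumption about the default ordering.

```python
from collections import Counter


def split_like_terms_agg(docs, field, size, metric_field, interval, time_field="time"):
    """Simulate split=<field>:<size> with a max metric (illustrative only)."""
    # Count documents per term, as a terms aggregation would.
    counts = Counter(doc[field] for doc in docs)

    # Keep the top `size` terms by document count.
    # Assumption: ties are broken by ascending term value.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top_terms = [term for term, _ in ranked[:size]]

    # One time series per kept term: {term: {bucket_start: max_value}}.
    series = {term: {} for term in top_terms}
    for doc in docs:
        term = doc[field]
        if term not in series:
            continue  # documents of all other terms are dropped entirely
        bucket = doc[time_field] - doc[time_field] % interval
        current = series[term].get(bucket)
        value = doc[metric_field]
        series[term][bucket] = value if current is None else max(current, value)
    return series


# Hypothetical data: 10,000 documents, each with a unique `rows` value.
docs = [{"rows": i, "time": i % 120, "value": i * 2} for i in range(1, 10001)]
series = split_like_terms_agg(docs, "rows", 256, "value", interval=60)
print(len(series))
```

With one document per term, every term has a count of 1, so in this sketch the 256 kept terms are simply the first 256 in sort order. Nothing is averaged across terms and nothing is sampled at random; each plotted point is the metric for one term in one time bucket, and the remaining terms are not shown at all.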

ElasticQuestion_1234 · March 23, 2021, 9:26pm

Hi Marco, thanks for the reply. When I don't specify the split parameter, the values in rows do not show up. When I set the split size to 256, it shows 256 points. As you mentioned, these are 256 buckets. I don't understand how each bucket relates to the original 10K data points. Are the original 10K data points first evenly divided into 256 buckets? And then, within each bucket, is the mean value calculated or the top value picked?

markov00 · March 23, 2021, 10:08pm

Hi, can you please share the Timelion script you are using and a sample of your documents so I can understand what is going on?

ElasticQuestion_1234 · March 25, 2021, 8:49pm

Hi Marco, sorry for the slow response. The Timelion script I use looks as follows:

.es(index="index_10k",timefield="time",metric="max:volume",split=ID:256).label().color("rgb(128,128,128)").points(show=true,fill=10,fillColor=gray)

I am trying to plot the values in the volume field individually as points. I have another field called ID, which ranges from 1 to 10,000. I use ID to split the values so that I can see multiple points. If I don't split, I only see the max volume value. I am using 256 in split because anything more than that causes a time-out error. I am trying to understand whether these 256 points are a reasonable representation of my whole data set's distribution. How were those 256 points picked?

I hope this makes sense. If I describe my data in data frame terms, I have two columns, ID and volume, with 10K rows. Thanks for the help.

markov00 · March 29, 2021, 8:50am

Actually, Kibana and most of its visualizations are not meant to display single data points, but aggregations of them.
Instead of Timelion, I suggest using Vega. This tool lets you fetch the raw data directly, without applying a terms aggregation on a field of such high cardinality.
Rendering a scatterplot there is relatively easy if you follow the current Vega guide (Vega | Kibana Guide [7.12] | Elastic) and the example here: Scatterplot | Vega-Lite
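
As a rough starting point, the Python sketch below builds a Vega-Lite scatterplot spec as JSON that could be pasted into the Kibana Vega editor. Treat it as an untested sketch under assumptions: the helper name `scatter_spec` is made up, the index and field names are taken from the script above, the schema version is a guess for 7.12, and `size=10000` assumes the default result window of the index is enough to return every document.

```python
import json


def scatter_spec(index, time_field, value_field, size=10000):
    """Build a Vega-Lite spec that plots raw hits instead of aggregations."""
    return {
        # Assumption: Vega-Lite v4 schema; adjust to your Kibana version.
        "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
        "data": {
            "url": {
                # Apply the dashboard time range and filters to the query.
                "%context%": True,
                "%timefield%": time_field,
                "index": index,
                # Fetch raw documents, only the two fields we plot.
                "body": {"size": size, "_source": [time_field, value_field]},
            },
            # The documents live under hits.hits in the search response.
            "format": {"property": "hits.hits"},
        },
        "mark": "point",
        "encoding": {
            "x": {"field": f"_source.{time_field}", "type": "temporal"},
            "y": {"field": f"_source.{value_field}", "type": "quantitative"},
        },
    }


spec = scatter_spec("index_10k", "time", "volume")
print(json.dumps(spec, indent=2))
```

Because no terms aggregation is involved, every returned document becomes one point, so the chart shows the actual distribution rather than a subset of IDs.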

ElasticQuestion_1234 · March 29, 2021, 2:26pm

Thanks Marco. It makes sense to me now. I am fine for my use case, but I will take a look at Vega as well.

