I am working on a large dataset with 20,000 rows and 40 columns. I want to group my data by 10 categorical columns, then create a histogram and a percentiles graph to compare the top 2 groups by count.
My understanding is that you are after creating a dashboard containing a couple of panes:
- A table of 10 columns
- A histogram
- A percentiles graph
Is that correct?
NB: You can sort your buckets (groups) by the top count using the Sort property in the bucket's options.
Thank you for your response. Yes, I want to create a dashboard with two visualizations:
- Histogram
- Percentiles Graph
But for each visualization, I want to group my data by 10 categorical columns (region, country, city, usage_cat, etc.). After grouping, I want to pick the 2 groups with the highest count and plot a grouped histogram of the data in these 2 groups.
Similarly for percentiles: after picking the 2 highest-count groups, I want to show the percentile difference between these two groups.
I understand now! Thanks for explaining! I don't think the core visualizations in Kibana allow you to do that smart column selection at the moment. For this type of advanced handling of the data, you might want to take a look at Canvas and their expression language or Vega.
Q: By highest count, do you mean the highest number of documents? If so, wouldn't "country" always be selected over city or region? Or are you referring to the highest number of categories in that group (in which case it would always be the smallest region)? I'm interested in this type of dynamic aggregation and want to know more.
To answer your question: by highest count, I mean the group containing the highest number of documents. There are 10 categorical columns that I want to group by, starting from region (North/South), then country, city, usage category, brand, subbrand, configuration, sub configuration, etc. For example, if the group-by yields 20 groups, I want to pick the 2 groups with the highest document counts and plot a histogram to compare the data of these two groups.
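To make the selection logic above concrete, here is a minimal sketch in plain Python of what "group by all categorical columns, keep the two largest groups, then bin them on shared edges" could look like outside Kibana. It is only an illustration, not a Kibana or Vega feature: the function names, the `value` column, and the bin count are all hypothetical.

```python
from collections import defaultdict

def top_two_groups(rows, group_cols, value_col):
    """Group rows by the tuple of categorical columns; return the two largest groups."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in group_cols)  # one key per unique combination
        groups[key].append(row[value_col])
    # Rank by document count, largest first; ties are broken by key for determinism.
    ranked = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return ranked[:2]

def shared_histogram(values_a, values_b, n_bins=5):
    """Count both groups over the same equal-width bins so the bars are comparable."""
    lo = min(min(values_a), min(values_b))
    hi = max(max(values_a), max(values_b))
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width when all values are equal

    def counts(values):
        out = [0] * n_bins
        for v in values:
            idx = min(int((v - lo) / width), n_bins - 1)  # the maximum goes in the last bin
            out[idx] += 1
        return out

    return counts(values_a), counts(values_b)
```

Each row is assumed to be a dict, for example `{"region": "North", "country": "A", "city": "X", "value": 1.0}`, and `group_cols` would list all 10 categorical columns. The two count lists returned by `shared_histogram` can then be drawn side by side as a grouped bar chart.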
I also want one visualization (bar plot) to compare the 10th, 50th, and 90th percentiles of these two groups. For this plot, I am not able to find anything in Vega-Lite to compute the nth percentile in an aggregate.
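As a rough illustration of the percentile comparison described above, the sketch below computes the 10th, 50th, and 90th percentiles of two groups and their differences in plain Python. It assumes linear interpolation between the closest ranks, which is only one of several percentile definitions (Elasticsearch's percentiles aggregation, for instance, returns approximate values), and the function names are hypothetical.

```python
def percentile(values, p):
    """Return the p-th percentile (0-100), interpolating linearly between closest ranks."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0      # fractional position in the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)   # stay in range when rank is the last index
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac

def percentile_comparison(group_a, group_b, probs=(10, 50, 90)):
    """Build one record per percentile with both values and their difference (a minus b)."""
    result = []
    for p in probs:
        a = percentile(group_a, p)
        b = percentile(group_b, p)
        result.append({"percentile": p, "a": a, "b": b, "difference": a - b})
    return result
```

The resulting records map directly onto a grouped bar plot with one pair of bars per percentile. On the Vega-Lite side, versions 4 and later document a `quantile` transform that takes `probs` and `groupby`; whether it is usable depends on the Vega-Lite version bundled with your Kibana release.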