The following Timelion query:
.es(index='9x',timefield='SampleTime',metric='avg:volume_nfs_write_data',split='volume_instance_name:10',q='volume_instance_name:x123').add(0.0001).fit('average').subtract(0.0001).divide(1024).divide(1024).lines(3,2).title('Volume Write Data (MB/sec)')
shows this chart:
However, when I keep only the split and drop the q filter, for the exact same time frame:
.es(index='9x',timefield='SampleTime',metric='avg:volume_nfs_write_data',split='volume_instance_name:10').add(0.0001).fit('average').subtract(0.0001).divide(1024).divide(1024).lines(3,2).title('Volume Write Data (MB/sec)')
I get a completely different chart:
I would expect the split to show the results in descending order, including something similar to the first chart.
What am I missing here?
In your first expression, you have a split on volume_instance_name, and also a q='volume_instance_name:x123' - so the split is pointless there.
In the second expression, try adding a label to see what volume_instance_name values there are. I would guess that x123 isn't seen as one of the top 10 records, according to the average volume_nfs_write_data over time.
The first query, which filters on a specific volume_instance_name, is there to show that one volume has high volume_nfs_write_data values. When I remove that filter and keep only the split, I don't see anything that resembles those values. The peaks in the second graph belong to another volume and are much lower than the ones in the first graph.
The rest of the values are 0.
I expected this volume to appear in the second graph, as it obviously has values much higher than 0.
With that explanation, any ideas?
Could you try doing a similar query (without all the calculations) in Visualize, where you can then inspect the ES query?
Try an area chart with:
x axis: date histogram on SampleTime field
series split: terms aggregation on volume_instance_name field
metric: average on volume_nfs_write_data field
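For comparison with whatever the inspector shows, here is a rough sketch of the kind of aggregation request that setup corresponds to: a terms aggregation on volume_instance_name ordered by the average, with a date histogram inside each bucket. The aggregation names (`per_volume`, `avg_overall`, `over_time`, `avg_value`) and the `1h` interval are made up for illustration, and the `interval` key is an assumption about the Elasticsearch version (newer releases use `fixed_interval` or `calendar_interval`). The request Kibana actually generates may be shaped differently, so treat the inspected query as the reference.

```python
import json

def build_agg_body(time_field, split_field, metric_field, size=10, interval="1h"):
    """Build a request body: top `size` terms by overall average,
    each with a date histogram of the per-bucket average."""
    return {
        "size": 0,  # only aggregations are needed, no hits
        "aggs": {
            "per_volume": {  # hypothetical aggregation name
                "terms": {
                    "field": split_field,
                    "size": size,
                    # rank the terms by the sibling avg sub-aggregation
                    "order": {"avg_overall": "desc"},
                },
                "aggs": {
                    "avg_overall": {"avg": {"field": metric_field}},
                    "over_time": {
                        # "interval" is an assumption; newer versions
                        # expect fixed_interval or calendar_interval
                        "date_histogram": {"field": time_field, "interval": interval},
                        "aggs": {"avg_value": {"avg": {"field": metric_field}}},
                    },
                },
            }
        },
    }

body = build_agg_body("SampleTime", "volume_instance_name", "volume_nfs_write_data")
print(json.dumps(body, indent=2))
```

The part worth comparing is the `order` on the terms aggregation, since it decides which 10 volumes make it into the chart.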
If I do this in Kibana Visualize it works. However, there are a few reasons why I want it in Timelion:
It is easier to do calculations on values (instead of using scripted fields).
Timelion loads much faster than a Kibana visualization.
Timelion charts look much better than Kibana's (some users actually care about that).
It is easier to set static values in Timelion (with the static function) than in Kibana, since I can control the color and it doesn't show a static line per value in the chart (as Kibana does). I need that to draw 'zones' (green/yellow/red).
Do you have more than 10 volume instances in your data? If so, maybe try splitting by at least the number of volume instances you have, instead of just 10.
I'm not sure how Timelion determines which terms (volume instances) count as the top 10 when using an average metric (I think it's just driven by Elasticsearch internally), but it looks like the instance you see in the first screenshot is just not in that top 10.
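To make that concrete, here is a small standalone sketch with made-up numbers (not your data) showing how the "top N" can change depending on what the terms are ranked by. Which ranking Timelion or Elasticsearch actually applies here is not something I've confirmed; the point is only that a volume with one large burst and many zero samples can fall out of a top N by document count or by average, yet lead a top N by max.

```python
from collections import defaultdict

def top_terms(samples, n, rank_by):
    """Return the n term names ranked highest by 'count', 'avg' or 'max'."""
    values = defaultdict(list)
    for name, value in samples:
        values[name].append(value)
    scorers = {
        "count": len,                       # number of documents per term
        "avg": lambda v: sum(v) / len(v),   # average metric value per term
        "max": max,                         # largest metric value per term
    }
    score = scorers[rank_by]
    ranked = sorted(values, key=lambda name: score(values[name]), reverse=True)
    return ranked[:n]

# Hypothetical samples: two steady volumes and one bursty volume (x123)
samples = (
    [("vol_a", 5.0)] * 120                      # steady, avg 5
    + [("vol_b", 4.0)] * 110                    # steady, avg 4
    + [("x123", 0.0)] * 99 + [("x123", 300.0)]  # one burst, avg 3
)

for criterion in ("count", "avg", "max"):
    print(criterion, top_terms(samples, 2, criterion))
```

With these numbers, x123 is missing from the top 2 by count and by average, and only shows up when ranking by max.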
A couple things you can try:
Use a max aggregation instead of avg
Change the time selection to focus more on the area where there's high data you're interested in
Make multiple charts, each with its own query that selects a group of 10 volume instances, and put all the charts on a dashboard so you can see all the volume instances at once. You may want to find a way to do this with automation. Since you have a few hundred, you will probably need more than one dashboard though.
If it is anomalies in the data that you're interested in, check out the Machine Learning feature in X-Pack
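For the multiple-charts idea, a minimal sketch of the automation could look like this. It takes a list of volume instance names and emits one Timelion expression per group of 10, reusing the expression from the question and restricting each chart with a q filter. The helper names (`chunked`, `timelion_expressions`) and the `vol0`, `vol1`, ... names are hypothetical, and it assumes the names are simple tokens that need no escaping in the query string; how you obtain the full list of names is left out.

```python
def chunked(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def timelion_expressions(volume_names, group_size=10):
    """Build one Timelion expression per group of volume names."""
    expressions = []
    for group in chunked(volume_names, group_size):
        # Assumes names need no quoting or escaping in the query string
        query = "volume_instance_name:(" + " OR ".join(group) + ")"
        expressions.append(
            ".es(index='9x',timefield='SampleTime',"
            "metric='avg:volume_nfs_write_data',"
            f"split='volume_instance_name:{len(group)}',"
            f"q='{query}')"
            ".add(0.0001).fit('average').subtract(0.0001)"
            ".divide(1024).divide(1024).lines(3,2)"
            ".title('Volume Write Data (MB/sec)')"
        )
    return expressions

# Hypothetical volume names, for illustration only
names = [f"vol{i}" for i in range(25)]
for expression in timelion_expressions(names):
    print(expression)
```

Each printed line can be pasted into its own Timelion chart. Because every chart is limited to its own group by the q filter, the split size only needs to match the group size.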