I was trying to do some scripting on the new 'histogram' datatype that comes with 7.6, but histograms don't support scripting.
I thought I might be able to use arrays instead, but have not had any luck because arrays are sorted when accessing the _doc object. I know I could use _source, but that will be painful in the future when I have a large number of documents.
The histogram field indeed does not support scripting. I'm curious about your use case: why do you need to access the counts through a script? Are you able to share it?
I store pre-aggregated data in Elasticsearch. I want to generate heatmaps from this data, but Kibana doesn't support heatmaps from pre-aggregated data, so instead I generate these using Grafana.
In the past I've done this by defining queries per-swimlane with data like:
And just doing a query on each field for each row in the heatmap.
I was hoping to do this with an Elasticsearch histogram instead, because it should be more space-efficient and would mean I don't need to hand-craft each query in Grafana, but I can't find a way to extract the value at a specific index.
I tried substituting my key/value pairs with arrays, but that doesn't work due to array sorting.
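To make the sorting problem concrete, here is a small Python simulation. It is not Elasticsearch code: `as_doc_values` is a hypothetical helper that just assumes numeric values come back in ascending order, as described above, and the bucket data is made up.

```python
# Simulation only: mimics numeric arrays being returned in sorted order.
def as_doc_values(values):
    """Hypothetical helper: return the values in ascending order."""
    return sorted(values)

bucket_bounds = [10, 20, 30, 40]  # upper bound of each bucket (made-up data)
bucket_counts = [7, 0, 5, 2]      # observations per bucket (made-up data)

seen_bounds = as_doc_values(bucket_bounds)
seen_counts = as_doc_values(bucket_counts)

# The bounds were already ascending, so they are unchanged,
# but the counts are reordered and no longer line up with their buckets.
print(seen_bounds)  # [10, 20, 30, 40]
print(seen_counts)  # [0, 2, 5, 7]
```

Index 0 of the counts now reads 0 instead of 7, so pairing the two arrays by position gives the wrong answer.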
Instead I've currently chosen a Prometheus-esque storage solution:
I can also generate a linearized 'max' approximation via:
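    def c = doc['data.counts'];  // counts per bucket (cumulative, Prometheus-style)
    def v = doc['data.values'];  // bucket values
    // Return the value of the first bucket after which the count stops growing.
    for (int i = 0; i < c.length - 1; i++) {
        if (c[i] == c[i + 1]) {
            return v[i];
        }
    }
    // The count grew all the way to the end, so the last bucket holds the max.
    return v[-1];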
This solution is actually less space-efficient than the original key/value solution, but it does make it easier to do various manipulations like the one above.
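For anyone wanting to check the logic outside Elasticsearch, here is a rough Python sketch of the same idea. It assumes "Prometheus-esque" means each bucket stores a cumulative count; the function names and sample data are mine, not part of any API. Cumulative counts never decrease, so sorting leaves them in place, which is presumably why this layout survives the array sorting. The second function is a variant of my own, not the script above: it compares against the final total so that an empty bucket below the maximum does not end the search early.

```python
def to_cumulative(counts):
    """Turn per-bucket counts into running totals (assumed storage layout)."""
    total, out = 0, []
    for c in counts:
        total += c
        out.append(total)
    return out

def approx_max_first_plateau(cum_counts, bounds):
    """Mirror of the script: first bucket after which the count stops growing."""
    for i in range(len(cum_counts) - 1):
        if cum_counts[i] == cum_counts[i + 1]:
            return bounds[i]
    return bounds[-1]

def approx_max_by_total(cum_counts, bounds):
    """Variant (my assumption): first bucket that already holds the final total."""
    total = cum_counts[-1]
    for i, c in enumerate(cum_counts):
        if c == total:
            return bounds[i]
    return bounds[-1]

bounds = [10, 20, 30, 40, 50]              # made-up bucket upper bounds
dense = to_cumulative([7, 3, 5, 0, 0])     # [7, 10, 15, 15, 15]
gappy = to_cumulative([7, 0, 5, 0, 0])     # [7, 7, 12, 12, 12]

print(sorted(dense) == dense)                   # True: sorting changes nothing
print(approx_max_first_plateau(dense, bounds))  # 30
print(approx_max_first_plateau(gappy, bounds))  # 10 (stops at the empty bucket)
print(approx_max_by_total(gappy, bounds))       # 30
```

If your data can have empty buckets in the middle of the range, the first-plateau version will under-report the max, so the total-based comparison may be the safer choice.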