Here is my index structure: type, times.
The type field can be either "error" or "valid".
I would like to have the following datatable:
+--------------------+--------------------------+--------------+
| @timestamp per day | type.keyword: Descending | Sum of times |
+--------------------+--------------------------+--------------+
| data... | data... | data... |
+--------------------+--------------------------+--------------+
But I would like to have only 2 lines in that datatable: the error and the valid lines for the latest day of the Date Histogram aggregation over @timestamp field with a Daily Interval.
Here is my current Data configuration.
Metrics:
Aggregation: Sum
Field: times
Buckets:
Split Rows
Aggregation: Date Histogram
Field: @timestamp
Interval: Daily
JSON Input: {"order": {"_key": "desc"}}
Split Rows
Sub Aggregation: Terms
Field: type.keyword
Order By: Term
Order: Descending / Size: 5
Actually, I would like something like "size": 1 to be added to the JSON Input of the Date Histogram... How could I do that?
The Date histogram aggregation in Elasticsearch doesn't support size the way the Terms agg does, instead you have to filter the range you are querying.
To do this in Kibana, you'll want to use the time filter at the top right of Kibana to control the time range being displayed.
The problem with this workaround is that as I don't know the last @timestamp per day that will be available in my data in advance, I'm not able to pre-select such date in the top right Kibana time filter when displaying my dashboard...
Can't I do that in another way with another Buckets configurations?
If you know the start, you can use the Date Math aggregation instead of the Date Histogram, and do something like this:
This will give you the last 2 days, and only the last 2 days, rounded to the nearest day. The problem with that is then you can't control the range, so if you look at it at 12:01am, it'll show the current, new day, with almost no data, and the data will change depending on when you look at it/run the query. But that will give you the grouping you want, at least for the immediate 2 day range.
I don't know the last @timestamp per day that will be available in my data in advance
I don't understand what you mean by that. Even if the data is coming in after the fact, seeng the historical data seems like it would smooth over that problem. I probably just don't fully understand your data and what you're trying to do with it. It sounds like this is more or less what you want though:
Actually, I've asked another question to compute a % ratio between the valid data and error ones and first wanted to display it in a Datatable before seeing in that forum that this was impossible (to create a column with computed information).
So I've created this graph and as you can see in the following monthly graph (aggregated on days), I don't have data every days:
So for the moment, the workaround I've found in order to have the values displayed in a table, is to do the following Data configuration:
Aggregation: Sum
Field: times
Buckets:
Split Rows
Aggregation: Terms
Field: from_host.keyword
Order By: Term
Order: Descending / Size: 30
Custom Label: From Host
Split Table (Rows)
Aggregation: Date Histogram
Field: @timestamp
Interval: Daily
JSON Input: {"order": {"_key": "desc"}}
Custom Label: Day
Split Table (Columns)
Sub Aggregation: Filters
Filter 1 - Valids: type.keyword:valid
Filter 2 - Errors: type.keyword:error
Order: Descending / Size: 5
I now have the following table:
+-----------------------------------------------------+
| 2018-02-07: Day |
+--------------------------+--------------------------+
| Valids: filters | Errors: filters |
+-----------+--------------+-----------+--------------+
| From Host | Sum of times | From Host | Sum of times |
+-----------+--------------+-----------+--------------+
| host1 | 54 | host1 | 1 |
| host2 | 72 | host1 | 8 |
| ... | ... | ... | ... |
+-----------+--------------+-----------+--------------+
| 2018-02-06: Day |
+--------------------------+--------------------------+
| Valids: filters | Errors: filters |
+-----------+--------------+-----------+--------------+
| From Host | Sum of times | From Host | Sum of times |
+-----------+--------------+-----------+--------------+
| ... | ... | ... | ... |
My ideal would be to simply display a % ratio between valids and errors and only display the data for the latest day for which I have data... it seems impossible for the ratio, but perhaps it is for the "keep only latest day for which we have data" requirement?
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.