Kibana is a useful tool for monitoring applications and services to ensure they are operating within specified service level objectives. Service level indicators (SLIs) are measurable aspects of a service, such as error codes and latency. Service level objectives (SLOs) define how an application or service is expected to perform as measured by the SLIs, and in a way, set service uptime and availability goals. As logging and metrics data generated by applications grow, so do the demands on the Elasticsearch cluster for processing aggregations over SLI data.
If you have ever assembled an SLO dashboard for a highly dense metrics dataset, you might already know how taxing SLI visualizations backed by millions of events can be on a cluster as each dashboard visualization performs one or more aggregations against the backing indices. One such aggregation, for example, might show the number of HTTP errors grouped by response code over a specified time interval. Another might aggregate proxy logs to show backend request latency over time.
Enter Elasticsearch Transforms. Transforms can be used to pre-aggregate SLI metrics, such as HTTP response codes, for SLO dashboards. Transforms query over existing indices then write summarized data to smaller indices that can be used by visualizations, allowing for fast retrieval of aggregated data without searching against the entire dataset.
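To make the idea of pre-aggregation concrete, here is a minimal Python sketch of what a pivot conceptually does: it groups raw events into hourly buckets and stores only the counts per status class. The event list and the `summarize_by_hour` helper are hypothetical illustrations, not part of Elasticsearch; the real transform runs inside the cluster.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw events; field names mirror the sample web logs.
events = [
    {"@timestamp": "2022-12-01T12:05:00", "response": "200"},
    {"@timestamp": "2022-12-01T12:40:00", "response": "404"},
    {"@timestamp": "2022-12-01T13:10:00", "response": "503"},
    {"@timestamp": "2022-12-01T13:15:00", "response": "200"},
]


def summarize_by_hour(raw_events):
    """Group events into hourly buckets and count responses per status class."""
    buckets = defaultdict(
        lambda: {"2xx": 0, "3xx": 0, "4xx": 0, "5xx": 0, "total": 0}
    )
    for event in raw_events:
        ts = datetime.fromisoformat(event["@timestamp"])
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0).isoformat()
        status_class = event["response"][0] + "xx"
        if status_class in buckets[hour]:
            buckets[hour][status_class] += 1
        buckets[hour]["total"] += 1
    return dict(buckets)


summary = summarize_by_hour(events)
for hour, counts in sorted(summary.items()):
    print(hour, counts)
```

A dashboard reading `summary` touches one small record per hour instead of every raw event, which is the saving a transform destination index provides.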
Here, we will show you how to set up and use a transform with the sample web logs provided in Kibana. The steps below were performed on version 8.5.2 of the Elastic Stack running in Elastic Cloud.
Loading The Sample Data
- Follow the Kibana Quick Start guide to add sample web logs data.
- Use Discover to gain some familiarity with the web log data and fields.
1. Configuration
We will be creating a pivot transform.
- Open the main menu, then go to Stack Management > Transforms > Create Transform.
- Choose Kibana Sample Data Logs as the data source.
- Be sure Pivot is selected.
- Set Group by to @timestamp. Click the pencil icon and set the Interval to 1h.
- Next, we will define the aggregations we want to execute and send to the transform destination index. Click Add an aggregation …, then type response to filter the selection box. Click filter(response).
- Fill in the filter aggregation property details provided in the table. Add a range query in the should boolean clause as shown below, then click Apply.
Property | Value |
---|---|
Aggregation name | response.2xx |
Field | response.keyword |
Aggregation | filter |
Filter query | bool |

{ "must": [], "must_not": [], "should": [ { "range": { "response.keyword": { "gte": "200", "lt": "300" } } } ] }
- Continue by adding three more parent aggregations for the 3xx, 4xx, and 5xx response status codes. Be sure to select Add an aggregation … for each group of status codes.
- Let's add one more aggregation for all response codes. Use a value count aggregation on the response.keyword field and name the aggregation response.total.
- With our five aggregations grouped by date, the transform preview should contain six fields. The preview shows a sample of the data that will be indexed to the destination transform index when the transform executes. If the preview looks good, click Next.
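For readers who prefer to see the configuration as data, the wizard steps above correspond roughly to a request body for the create transform API (`PUT _transform/<transform_id>`). The sketch below builds that body in Python without sending it. The destination index name is an assumption (it reuses the data view name shown later in this article), and the `status_class_filter` helper is hypothetical; the body Kibana actually generates may differ in detail.

```python
import json


def status_class_filter(lower, upper):
    """Filter aggregation with a range query on response.keyword."""
    return {
        "filter": {
            "bool": {
                "should": [
                    {"range": {"response.keyword": {"gte": lower, "lt": upper}}}
                ]
            }
        }
    }


transform_body = {
    "source": {"index": "kibana_sample_data_logs"},
    "pivot": {
        # Group by @timestamp in one-hour buckets.
        "group_by": {
            "@timestamp": {
                "date_histogram": {"field": "@timestamp", "calendar_interval": "1h"}
            }
        },
        # Four filter aggregations plus one value count: five in total.
        "aggregations": {
            "response.2xx": status_class_filter("200", "300"),
            "response.3xx": status_class_filter("300", "400"),
            "response.4xx": status_class_filter("400", "500"),
            "response.5xx": status_class_filter("500", "600"),
            "response.total": {"value_count": {"field": "response.keyword"}},
        },
    },
    # Assumed destination index name; use whatever you enter in the wizard.
    "dest": {"index": "transform_sli_data_log_responses"},
}

print(json.dumps(transform_body, indent=2))
```

Because `response.keyword` is a keyword field, the range bounds are compared as strings; that works here only because all HTTP status codes have three digits.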
2. Transform Details
- Provide a name for the transform in the Transform ID box, an optional description, and a destination index.
- Click Next.
- Click Create and start.
Transform Status
The Transforms management page should show the transform as started. Click the arrow next to the transform ID, and select Stats to check its progress.
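The same start and status checks are available through the transform REST API (`POST _transform/<transform_id>/_start` and `GET _transform/<transform_id>/_stats`). The sketch below only builds the two requests and prints them; the cluster URL and transform ID are hypothetical placeholders, and a real call would also need authentication for your deployment.

```python
import urllib.request

ES_URL = "https://localhost:9200"   # hypothetical cluster endpoint
TRANSFORM_ID = "sli-log-responses"  # hypothetical transform ID


def transform_request(action, method):
    """Build (but do not send) a request for a transform API endpoint."""
    url = f"{ES_URL}/_transform/{TRANSFORM_ID}/{action}"
    return urllib.request.Request(url, method=method)


start_request = transform_request("_start", "POST")
stats_request = transform_request("_stats", "GET")

for request in (start_request, stats_request):
    print(request.get_method(), request.full_url)
```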
Visualizing
The aggregated transform data can now be used for visualizations. Open Discover and select the data view (aka, index pattern) for the transform destination index, then inspect a sample document. Be sure to set the time picker far enough back to view the data set.
The web server was not very busy during the 12:00 hour, only serving 6 requests. A single Lens visualization can show the SLO target.
Create A Visualization
Add a Lens visualization with the following configuration to a new or existing dashboard:
Configuration | Values |
---|---|
Visualization type | Lens/Area |
Data view | transform_sli_data_log_responses |
Horizontal axis | Functions: Date histogram; Field: @timestamp; Minimum interval: 1h; Drop partial intervals: disabled |
Vertical axis (I) | Success Rate > Data; Method: Formula; Formula: (sum(response.2xx) + sum(response.3xx) + sum(response.4xx))/sum(response.total); Appearance > Name: Success Rate; Value format: Percent |
Vertical axis (II) | Failure Rate > Data; Method: Formula; Formula: sum(response.5xx)/sum(response.total); Appearance > Name: Failure Rate; Value format: Percent |
Reference lines | transform_sli_data_log_responses; Vertical left axis > Method: Static value; Reference line value: 0.95; Icon decoration: Alert; Line: 2px; Color: #F70E0E |
Left axis | Axis title > Custom "Request Rate" |
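As a sanity check on the two Lens formulas, the sketch below applies the same arithmetic in Python to a couple of hourly documents and compares the success rate to the 0.95 reference line. The documents and their counts are hypothetical, shaped the way the transform output is assumed to look (one nested `response` object per hour); the `rates` helper is illustrative only.

```python
SLO_TARGET = 0.95  # same value as the reference line in the visualization

# Hypothetical hourly documents shaped like the transform output.
hourly_docs = [
    {"@timestamp": "2022-12-01T12:00:00",
     "response": {"2xx": 5, "3xx": 0, "4xx": 1, "5xx": 0, "total": 6}},
    {"@timestamp": "2022-12-01T13:00:00",
     "response": {"2xx": 40, "3xx": 2, "4xx": 3, "5xx": 5, "total": 50}},
]


def rates(doc):
    """Return (success_rate, failure_rate) for one hourly document."""
    counts = doc["response"]
    total = counts["total"]
    if total == 0:
        return None  # no requests in this hour, so no rate to report
    # As in the Lens formula, 2xx, 3xx, and 4xx all count as success.
    success = (counts["2xx"] + counts["3xx"] + counts["4xx"]) / total
    failure = counts["5xx"] / total
    return success, failure


results = {doc["@timestamp"]: rates(doc) for doc in hourly_docs}
for hour, (success, failure) in results.items():
    status = "meets SLO" if success >= SLO_TARGET else "below SLO"
    print(f"{hour} success={success:.1%} failure={failure:.1%} {status}")
```

Note that the formulas treat 4xx responses as successful, since client errors are usually not counted against service availability; only 5xx responses count as failures.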