This post is also available in Portuguese.
The Challenge at the North Pole
"Ho ho... oh no!" exclaimed Santa Claus on a frosty morning. "There are only two days until Christmas! How can I keep track of the wishes of billions of children without proper monitoring?"
For centuries, Santa's workshop relied on magic dust and elf intuition to track toy production. But in 2024, even the North Pole needs a tech upgrade! Today, we’ll help Santa modernize his factory by implementing a real-time monitoring solution with the Elastic Stack.
Requirements
Mrs. Claus was clear about what we need:
- Track toy production across all factory lines
- Monitor elf performance (keeping the Christmas spirit high!)
- Ensure quality control (no child should receive broken toys!)
- Monitor factory conditions (elves need ideal temperatures for their hot cocoa!)
Magical Prerequisites
Before we start enchanting the factory, make sure you have:
- Python 3.8+ (tested by elves, approved by Santa)
- Access to Elastic Cloud (or local Elastic Stack 8.16)
Factory Structure
First, let’s organize our factory like Santa organizes his gift list. Clone the repository from GitHub:
git clone https://github.com/salgado/santa_advent_monitoring.git
cd santa_advent_monitoring
By the end, your folder structure will look like this:
santa_advent_monitoring/
├── manage_simulation.sh
├── toy_workshop_simulator.py
│
├── var/
│   └── log/
│       └── workshop/
│           └── production/
└── README.md
The Factory Example
We created a Python script that simulates Santa’s workshop, generating JSON logs with toy production metrics. The simulation is managed by a Bash script, responsible for starting, stopping, and monitoring production.
The Python code generates realistic data on toy types, production rates, quality scores, and environmental conditions. The Bash script manages the simulation lifecycle, rotates logs, and provides status updates. Think of it as a miniature version of a real production line—but instead of physical toys, it produces structured data to feed the Elastic Stack in real time.
Factory JSON Example
{
  "@timestamp": "2024-12-11T16:54:36.980882",
  "toy_type": "robot",
  "production_line": "line1",
  "production_rate": 101,
  "quality_score": 95,
  "elf_id": "elf_20",
  "errors_detected": 0,
  "temperature": 22.8,
  "humidity": 47.3,
  "shift": "evening",
  "machine_status": "normal",
  "toys_completed": 25,
  "version": "8.16.1"
}
This log contains production information: toy type, production line, rate, quality, environmental conditions, and more.
Field Descriptions
| Field | Description | Examples | Importance |
|---|---|---|---|
| toy_type | Type of toy produced | robot, doll, car, puzzle, board_game | Defines complexity and production rate |
| production_line | Production line | line1, line2, line3 | Locates production origin |
| quality_score | Quality score | 0-100 | Critical: must be ≥ 90 |
| elf_id | Elf identifier | elf_1 to elf_20 | Tracks performance |
| errors_detected | Errors detected | 0-5 | Critical if > 3 |
| machine_status | Machine status | normal, error | Indicates production health |
| toys_completed | Completed units | 15-50 | Measures productivity |
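To make the table concrete, here is a minimal Python sketch that builds one event using the value ranges listed above. It is not the code from toy_workshop_simulator.py: the `make_event` helper and the rule that more than 3 errors flips `machine_status` to "error" are assumptions made for illustration only.

```python
import json
import random
from datetime import datetime

TOY_TYPES = ["robot", "doll", "car", "puzzle", "board_game"]
LINES = ["line1", "line2", "line3"]


def make_event(rng: random.Random, now: datetime) -> dict:
    """Build one production event using the ranges from the field table."""
    errors = rng.randint(0, 5)
    return {
        "@timestamp": now.isoformat(),
        "toy_type": rng.choice(TOY_TYPES),
        "production_line": rng.choice(LINES),
        "quality_score": rng.randint(0, 100),
        "elf_id": f"elf_{rng.randint(1, 20)}",
        "errors_detected": errors,
        # Assumption: more than 3 errors puts the machine into "error".
        "machine_status": "error" if errors > 3 else "normal",
        "toys_completed": rng.randint(15, 50),
    }


# A seeded generator keeps the sketch reproducible between runs.
event = make_event(random.Random(2024), datetime(2024, 12, 11, 16, 54, 36))
print(json.dumps(event))
```

Each call returns a plain dictionary, so writing it out with `json.dumps` gives one JSON object per line, the same shape as the sample log entry.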
Initial Factory Configuration
configs/simulator/production_config.yml:
toy_types:
  robot:
    base_rate: 100
    complexity: 0.8
    min_quality: 85
    components: ["circuit_board", "motors", "sensors"]
  doll:
    base_rate: 150
    complexity: 0.6
    min_quality: 88
    components: ["fabric", "stuffing", "clothes"]
  # ... more toy types ...
production_lines:
  line1:
    efficiency: 0.95
    error_rate: 0.02
    maintenance_schedule: "0 */4 * * *"
  # ... more production lines ...
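Before starting the simulator, it can help to sanity-check these values. The sketch below mirrors the configuration as a Python dictionary and checks a few ranges. The `validate` and `maintenance_hours` helpers are hypothetical, and two things are assumed rather than confirmed by the repository: that complexity, efficiency, and error_rate are fractions between 0 and 1, and that maintenance_schedule uses standard five-field cron syntax.

```python
from typing import List

CONFIG = {
    "toy_types": {
        "robot": {"base_rate": 100, "complexity": 0.8, "min_quality": 85},
        "doll": {"base_rate": 150, "complexity": 0.6, "min_quality": 88},
    },
    "production_lines": {
        "line1": {
            "efficiency": 0.95,
            "error_rate": 0.02,
            "maintenance_schedule": "0 */4 * * *",
        },
    },
}


def validate(config: dict) -> List[str]:
    """Return a list of human-readable problems (empty when all is well)."""
    problems = []
    for name, toy in config["toy_types"].items():
        if toy["base_rate"] <= 0:
            problems.append(f"{name}: base_rate must be positive")
        if not 0 <= toy["complexity"] <= 1:  # assumed range
            problems.append(f"{name}: complexity outside 0-1")
        if not 0 <= toy["min_quality"] <= 100:
            problems.append(f"{name}: min_quality outside 0-100")
    for name, line in config["production_lines"].items():
        for key in ("efficiency", "error_rate"):
            if not 0 <= line[key] <= 1:  # assumed range
                problems.append(f"{name}: {key} outside 0-1")
        if len(line["maintenance_schedule"].split()) != 5:
            problems.append(f"{name}: schedule is not five cron fields")
    return problems


def maintenance_hours(schedule: str) -> List[int]:
    """Expand the hour field of a cron schedule ('*', '*/N', or a number)."""
    hour_field = schedule.split()[1]
    if hour_field == "*":
        return list(range(24))
    if hour_field.startswith("*/"):
        return list(range(0, 24, int(hour_field[2:])))
    return [int(hour_field)]


print(validate(CONFIG))
print(maintenance_hours("0 */4 * * *"))
```

Read as cron, "0 */4 * * *" means minute 0 of every fourth hour, so line1 would pause for maintenance six times a day.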
Implementing the Factory Simulator
Let’s get Santa’s factory up and running!
The Production Simulator
Ensure you’re in the correct directory:
cd santa_advent_monitoring
Management Script
manage_simulation.sh
Starting the Simulation
To start toy production:
./manage_simulation.sh start
This starts the simulator in the background, creates the log directory structure, and runs toy_workshop_simulator.py, which generates JSON events simulating production. The process PID is saved in simulator.pid for future management, and all logs go to toys.log. We’re powering up our virtual factory!
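The bookkeeping described above (log directory, PID file, one JSON event per line) can be sketched in a few lines of Python. This is an illustration, not the contents of manage_simulation.sh: the `prepare_run` and `append_event` helpers are hypothetical, and placing simulator.pid in the project root is an assumption.

```python
import json
import os
import tempfile
from pathlib import Path


def prepare_run(base_dir: Path) -> Path:
    """Create the log directory, record this process's PID, return the log path."""
    log_dir = base_dir / "var" / "log" / "workshop" / "production"
    log_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the PID file sits in the project root.
    (base_dir / "simulator.pid").write_text(str(os.getpid()), encoding="utf-8")
    return log_dir / "toys.log"


def append_event(log_file: Path, event: dict) -> None:
    """Append one event as a single JSON line, ready to be tailed."""
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")


# Demo in a throwaway directory so nothing in the real project is touched.
with tempfile.TemporaryDirectory() as tmp:
    log_file = prepare_run(Path(tmp))
    append_event(log_file, {"toy_type": "robot", "toys_completed": 25})
    append_event(log_file, {"toy_type": "doll", "toys_completed": 30})
    lines = log_file.read_text(encoding="utf-8").splitlines()
    pid_text = (Path(tmp) / "simulator.pid").read_text(encoding="utf-8")

print(len(lines), "events written by PID", pid_text)
```

Keeping one event per line matters later: a log collector that reads the file line by line can then decode each line as a complete JSON object.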
Connecting Our Factory to the Cloud
With the simulator running, let’s connect it to Elastic Cloud using the Elastic Agent.
Elastic Agent Installation
Follow the instructions in Kibana under Fleet → Agents to install the Elastic Agent for your environment.
Configure the factory log path:
/your-full-path/santa_advent_monitoring/var/log/workshop/production/toys.log
Note: At the end of the installation, selecting the "Enroll in Fleet" option will also set up the Fleet Server.
Configuring Factory Log Collection
With logs being generated, we need to configure the Elastic Agent to collect them. Let’s adjust the "Custom Logs" integration.
Navigating to the Configuration
- In Kibana, go to Fleet
- Under Agent policies, select Agent policy 1
- Click Add integration
- Search for "Custom logs" and select it
Adjusting the Integration
1. Basic Settings
- Integration name: santa_workshop_logs
- Description: Santa's Workshop Pro
2. Log Path
/Users/username/your_path/santa_advent_monitoring/var/log/workshop/production/toys.log
3. Advanced Configurations
In "Custom configurations":
json:
  keys_under_root: true
  add_error_key: true
  overwrite_keys: true
decode_json_fields:
  fields: ["message"]
  target: ""
  process_array: false
Key points:
- json.keys_under_root: Places decoded fields such as @timestamp, toy_type, and production_rate at the root level of the event, which makes searches easier.
- json.add_error_key: Adds an error field if parsing issues occur, helping identify malformed entries.
- json.overwrite_keys: If a decoded key conflicts with a field the agent adds on its own, the value from the log line wins, so the event keeps the factory's own values.
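To see what the three json options do to a log line, here is a simplified Python model of the behavior described in the key points. It is a teaching sketch, not the Elastic Agent's actual code: the `decode_line` function and the `agent_fields` argument are hypothetical names, and the shape of the error field is an approximation.

```python
import json


def decode_line(line, agent_fields, keys_under_root=True,
                add_error_key=True, overwrite_keys=True):
    """Model how one log line becomes an event under the three json options."""
    event = dict(agent_fields)  # fields the collector adds on its own
    try:
        decoded = json.loads(line)
        if not isinstance(decoded, dict):
            raise ValueError("log line is not a JSON object")
    except ValueError as exc:  # json.JSONDecodeError is a ValueError
        if add_error_key:
            event["error"] = {"message": str(exc), "type": "json"}
        return event
    if not keys_under_root:
        # Without keys_under_root, decoded fields stay nested under "json".
        event["json"] = decoded
        return event
    for key, value in decoded.items():
        if key in event and not overwrite_keys:
            continue  # keep the collector's own value on conflict
        event[key] = value
    return event


agent = {"@timestamp": "agent-read-time", "host": "north-pole"}
line = '{"@timestamp": "2024-12-11T16:54:36.980882", "toy_type": "robot"}'
print(decode_line(line, agent))
print(decode_line("not json at all", agent))
```

With all three options enabled, the event carries the timestamp written by the factory rather than the time the line was read, which is what the dashboards later rely on.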
4. Dataset
- Dataset name: "generic" (or "santa_workshop" if preferred)
- Namespace: "default"
5. Finalizing
- Under "Where to add this integration?", confirm that Agent policy 1 is selected
- Click Save integration
Verifying the Configuration
After saving, wait a few moments. In Kibana, go to Discover and look for the logs-* index. You should see events arriving, showing toy types, production rates, quality, and more.
Building Santa’s Command Center
Let’s create a dashboard that will impress even the most technical elf.
Initial Data View Setup
- In Stack Management → Data Views → Create data view:
  - Name: santa-workshop
  - Index pattern: logs-*
  - Timestamp field: @timestamp
Creating the Dashboard
- In Kibana, go to Dashboards
- Click Create dashboard
Creating Visualizations
Use Create visualization for each one:
1. Total Toys Produced (Metric)
- Metric: Sum of toys_completed
- Title: "Total Toys Produced"
2. Quality Score (Metric)
- Metric: Average of quality_score
- Title: "Quality Score"
- Format: Percentage
3. Average Production Rate (Metric)
- Metric: Average of production_rate
- Title: "Average Production Rate (per hour)"
4. Distribution by Toy Type (Pie)
- Metric: Count
- Split slices: toy_type
- Title: "Production by Toy Type"
5. Line Performance (Vertical Bar)
- Y-axis: production_rate
- X-axis: production_line
- Title: "Production Line Performance"
6. Environmental Conditions (Line)
- Y-axis: temperature and humidity
- X-axis: @timestamp
- Title: "Workshop Environmental Conditions"
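If a panel shows a number you don't trust, you can recompute the same aggregations locally from a handful of log events. The sketch below is one way to do that in Python. The `summarize` helper is hypothetical, and it assumes the Line Performance chart uses the average of production_rate per line, which the steps above don't state explicitly.

```python
from collections import Counter, defaultdict
from statistics import mean


def summarize(events):
    """Recompute the dashboard's headline numbers from a list of events."""
    rates_by_line = defaultdict(list)
    for event in events:
        rates_by_line[event["production_line"]].append(event["production_rate"])
    return {
        "total_toys": sum(e["toys_completed"] for e in events),       # panel 1
        "avg_quality": mean(e["quality_score"] for e in events),      # panel 2
        "avg_rate": mean(e["production_rate"] for e in events),       # panel 3
        "count_by_toy_type": dict(Counter(e["toy_type"] for e in events)),  # panel 4
        # Assumption: panel 5 plots the average rate per production line.
        "avg_rate_by_line": {k: mean(v) for k, v in rates_by_line.items()},
    }


# The two sample events shown in this post, trimmed to the fields used here.
events = [
    {"toy_type": "robot", "production_line": "line1", "production_rate": 101,
     "quality_score": 95, "toys_completed": 25},
    {"toy_type": "puzzle", "production_line": "line1", "production_rate": 103,
     "quality_score": 69, "toys_completed": 25},
]
summary = summarize(events)
print(summary)
```

For these two events the totals are easy to check by hand: 50 toys completed, an average quality score of 82, and an average production rate of 102.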
Dashboard Layout
Organize visualizations:
- First row: Key metrics (Total, Quality, Rate)
- Second row: Toy type distribution and Line performance
- Third row: Environmental conditions
Resize as needed. The goal is to make the dashboard easy to interpret at a glance.
Proactive Monitoring with SLOs and Alerts
Ho ho ho! To ensure quality remains high, let’s configure a Service Level Objective (SLO).
Creating a Quality SLO
- In Observability → SLOs:
  - Click Create SLO
- SLI Definition:
  - SLI type: Custom Query
  - Data view: logs-*
  - Timestamp field: @timestamp
  - Query filter: machine_status:"normal"
  - Good query: quality_score >= 90
  - Total query: quality_score:*
  - Group by: production_line
- Objectives:
  - Time window: Rolling
  - Duration: 30 days
  - Budgeting method: Occurrences
  - Target / SLO (%): 94
- Description:
  - Name: "Quality Control SLO"
  - Description: "Monitoring production quality (goal: 94% with score ≥ 90)"
  - Tags: quality, production, christmas-2024
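With the occurrences budgeting method, the SLI is simply good events divided by total events. The Python sketch below applies the same filter, good query, total query, and grouping to a list of events, as a way to reason about what the SLO will report. The `sli_by_line` helper is hypothetical, and this is a local approximation rather than how Kibana evaluates the SLO internally.

```python
from collections import defaultdict

TARGET = 0.94  # the 94% objective configured above


def sli_by_line(events):
    """Good/total ratio per production line, using the SLO's queries."""
    good = defaultdict(int)
    total = defaultdict(int)
    for event in events:
        if event.get("machine_status") != "normal":
            continue  # query filter: machine_status:"normal"
        if event.get("quality_score") is None:
            continue  # total query: quality_score:*
        line = event["production_line"]
        total[line] += 1
        if event["quality_score"] >= 90:  # good query
            good[line] += 1
    return {line: good[line] / total[line] for line in total}


events = [
    {"production_line": "line1", "machine_status": "normal", "quality_score": 95},
    {"production_line": "line1", "machine_status": "normal", "quality_score": 85},
    {"production_line": "line1", "machine_status": "error", "quality_score": 69},
    {"production_line": "line2", "machine_status": "normal", "quality_score": 92},
]
sli = sli_by_line(events)
meets_target = {line: ratio >= TARGET for line, ratio in sli.items()}
print(sli, meets_target)
```

One thing this makes visible: because of the query filter, events with machine_status set to error never enter the calculation, so a low score logged during a machine error does not count against this SLO.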
Setting Up Alerts
We’ll set clear goals:
- Gift Quality SLO:
  - Target: 99% ≥ 90 quality
  - 24-hour window
  - 30-day evaluation
- Production Efficiency SLO:
  - Target: ≥ 95% of planned
  - Real-time monitoring
Creating an Alert Rule
- Observability → Alerts → Create rule
- Type: "SLO burn rate"
- Alert Configuration:
  - Name: "Low Quality Alert"
  - SLO: select the created SLO
  - Alert if it falls below the goal
  - Frequency: every 5 minutes
  - Notifications: Email to the Head Elf, production Slack channel, or webhook for tickets
- Alert Message:
  - Title: "Production Quality Alert"
  - Message: "Attention elves! Line {{production_line}} is below the quality target ({{current_value}}%). Check immediately!"
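The rule type is named after the burn rate, which compares the observed error rate to the error rate the SLO allows. The sketch below shows the arithmetic in Python for a 94% target. The function names are hypothetical, and the thresholds and look-back windows Kibana applies to the rule are not modeled here.

```python
def burn_rate(good: int, total: int, target: float) -> float:
    """Observed error rate divided by the error rate the target allows."""
    if total == 0:
        return 0.0  # no events, nothing burned
    error_rate = 1 - good / total
    error_budget = 1 - target
    return error_rate / error_budget


def days_until_budget_spent(rate: float, window_days: int = 30) -> float:
    """How long the budget lasts if the current burn rate holds."""
    if rate <= 0:
        return float("inf")
    return window_days / rate


# 88 good toys out of 100 against a 94% target:
# 12% errors against a 6% allowance, so the budget burns about twice as fast.
rate = burn_rate(good=88, total=100, target=0.94)
print(round(rate, 2), round(days_until_budget_spent(rate), 1))
```

A burn rate of 1 means the error budget is used up exactly at the end of the 30-day window. Anything above 1 means the elves will run out of budget early unless quality improves.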
Example Alert in Action
{
  "@timestamp": "2024-12-11T16:54:40.999778",
  "toy_type": "puzzle",
  "production_line": "line1",
  "production_rate": 103,
  "quality_score": 69,
  "elf_id": "elf_4",
  "errors_detected": 5,
  "temperature": 20.8,
  "humidity": 50.9,
  "shift": "evening",
  "machine_status": "error",
  "toys_completed": 25,
  "version": "8.16.1"
}
In this case, quality_score dropped to 69, with 5 errors detected and machine_status set to error. The alert helps the elves fix the issue quickly.
Configuring Alerts via Elasticsearch Query
- Stack Management → Rules → Create rule
- Type: "Elasticsearch query"
- Query: { "bool": { "must": [ { "range": { "quality_score": { "lt": 90 } } } ] } }
- Parameters:
  - Name: "Gift Quality Alert"
  - Indices: logs-*
  - Schedule: Every 5 minutes
  - Actions: Notify supervisors (email/Slack)
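To preview which events the rule would match, you can run an equivalent search yourself. The Python sketch below builds a search body that wraps the same range query with a five-minute time filter and prints it as JSON, so it can be pasted into a search against logs-*. The `low_quality_search` helper is hypothetical, and the sort order and result size are illustrative choices, not part of the rule.

```python
import json


def low_quality_search(threshold: int = 90, window: str = "5m") -> dict:
    """Build a search body matching recent events below the quality threshold."""
    return {
        "size": 10,  # illustrative: show only a handful of matches
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                # Same condition as the alert rule's query.
                "must": [{"range": {"quality_score": {"lt": threshold}}}],
                # Restrict to the rule's schedule window using date math.
                "filter": [{"range": {"@timestamp": {"gte": f"now-{window}"}}}],
            }
        },
    }


body = low_quality_search()
print(json.dumps(body, indent=2))
```

If the search returns nothing while the simulator is running, check the time filter and the data view first, since the quality condition itself is the same as in the rule.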
SLO Panel
Our SLO panel will show quality trends, maintaining the Christmas spirit and children’s happiness.
Conclusion: A Modern and Magical Workshop!
Our complete solution provides:
- Data collection for production
- Elastic Agent setup
- Visualization dashboard
- Proactive SLOs and alerts
Santa’s workshop now combines magic and technology! “This system has transformed gift production,” celebrates Bernard, Chief Operations Elf. “Now we ensure every child gets their perfect gift on time!”
Remember: The best gifts arrive at the right time, and the best monitoring system makes that possible!
#ElasticAdvent #ModernSanta #ObservabilityMagic #ElasticsearchForElves #Elastic