Structured logging has almost become a standard in the industry, allowing for easy understanding and parsing of logs. While the Elastic-Agent provides integrations for a number of applications, making it easy to ingest and parse their logs, when it comes to our own applications a bit of work needs to be done.
Elastic-Agent vs. Filebeat
- Elastic-Agent is a single, unified way to add monitoring for logs, metrics, and other types of data to a host.
- Filebeat is a lightweight shipper for forwarding and centralising log data.
Which one should I use? The Elastic-Agent provides the best experience: everything can be configured via Kibana, and you can see the Elastic-Agent's logs in Kibana as well as its health status.
If for some reason you cannot run the Elastic-Agent, Filebeat is still an option; we will cover its configuration at the end of the article.
0. Elastic-Agent basics
This article assumes you are familiar with the Elastic-Agent, already have one deployed, and understand concepts like integrations, policies, etc.
1. Log as JSON
That's it. If your logs are JSON objects, the Elastic-Agent can already parse them; it then becomes a matter of fine-tuning a few things to make sure the timestamp is ingested correctly and the fields have the correct types.
2. Some example logs
Let's use these logs as an example:
{"level":"info","time":"2022-11-28T12:00:00+01:00","message":"Starting Advent Calendar demo", "status_code": 200}
{"level":"info","time":"2022-11-28T12:00:32+01:00","message":"First line", "status_code": 300}
{"level":"debug","time":"2022-11-28T12:08:32+01:00","message":"Second line", "status_code": 400}
{"level":"error","time":"2022-11-28T12:09:32+01:00","message":"Third line", "status_code": 500}
{"level":"info","time":"2022-11-28T12:10:32+01:00","message":"Forth line", "status_code": 100}
{"level":"warn","time":"2022-11-28T12:11:32+01:00","message":"Fith line", "status_code": 200}
We have four fields there:
- level: the log level; we want to filter on it as a keyword (e.g. info, error, debug, etc.).
- time: the time when the log line was written; we need to tell the Elastic-Agent to use it as the time for the log entry instead of the time of ingestion.
- message: the message itself; we will consider it a free-text field.
- status_code: a numeric field simulating an HTTP status code.
3. Set up the integration
We will set up the Custom Logs integration. Aside from adding the paths to the files we want to harvest, we need to add two optional configurations:
- Processors: as the name suggests, they can enrich or modify our events.
- Custom configurations: well, they are custom configurations for our input.
Under the hood (at the time of writing) the Elastic-Agent runs a Filebeat instance, so the documentation for those optional configurations is Filebeat's documentation. The Custom Logs integration uses the Log input under the hood, so the documentation we are interested in covers the Log input, the timestamp processor, and the drop_fields processor.
Here is the configuration we need:
Processors
We will need two processors:
- timestamp: it will parse our timestamp and correctly set it on the final event.
- drop_fields: this one is optional, but there is no need to keep the time field if we have already correctly set @timestamp in the event.
- timestamp:
    field: time
    layouts:
      # Go reference-time layout, equivalent to RFC 3339
      - '2006-01-02T15:04:05Z07:00'
    test:
      # sample value that must parse successfully with the layouts above
      - '2022-08-31T12:07:32+02:00'
- drop_fields:
    fields:
      - time
The only caveat here is that the timestamp processor is still in beta; however, it is stable enough to be used. Keep that in mind when using it.
Custom configuration
The custom configuration tells the Log input to parse the data as JSON, overwrite any keys that already exist in the event, and, if there are errors, add an error key to the final event so we can see what is happening.
json:
  keys_under_root: true
  add_error_key: true
  message_key: message
  overwrite_keys: true
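For illustration only (this example is not from the original article), once the JSON parsing and the two processors have run, the first example line would end up in Elasticsearch roughly like this, ignoring the metadata fields the Elastic-Agent adds:
{
  "@timestamp": "2022-11-28T11:00:00.000Z",
  "level": "info",
  "message": "Starting Advent Calendar demo",
  "status_code": 200
}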
This is how everything will look in Kibana:
4. Test it
Save the integration, wait for the policy update to propagate to your Elastic-Agent, add some data to your log file, then look at the harvested logs in Kibana.
5. Mappings
Now that we have some data, let's make sure Elasticsearch understands it correctly; for that we need to set the mappings for our fields.
Head to Fleet > Agent Policies, click on the policy name, then on the integration name. On the next screen go to Change defaults > Advanced options; at the very bottom there is the Mappings section.
Click on the "edit" button (the little pencil) and add the following mappings:
Then click "Next" until the "Review" step, then click "Save component template".
6. Test it (again)
Head to Discover, search for some data, expand one of the documents and you will see the fields correctly mapped.
Now you can do queries like status_code >= 400.
7. How should I define my log keys?
The best way to define your log keys is to use the Elastic Common Schema (ECS). ECS is an open source specification, developed with support from the Elastic user community. ECS defines a common set of fields to be used when storing event data in Elasticsearch, such as logs and metrics.
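For illustration (this example is not part of the original article), if our logs followed ECS field names, a line might look something like this, using log.level and http.response.status_code instead of our custom level and status_code fields:
{"@timestamp":"2022-11-28T12:00:00+01:00","log.level":"info","message":"Starting Advent Calendar demo","http.response.status_code":200}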
8. What about a standalone Filebeat?
Well, it's pretty much the same idea, but instead of configuring an Elastic-Agent integration we will configure the input directly in filebeat.yml (Filebeat's configuration file).
When running a standalone Filebeat, it is better to use the filestream input. The concepts are all the same, but the configuration keys are slightly different. For the sake of brevity, here is the input configuration from filebeat.yml:
filebeat.inputs:
  - type: filestream
    id: advent-calendar-2022
    enabled: true
    paths:
      - /tmp/flog.log
    parsers:
      # parse each line as JSON; an empty target places the keys at the root of the event
      - ndjson:
          target: ""
          add_error_key: true
    processors:
      # parse the 'time' field (Go reference-time layout, RFC 3339) and set it as @timestamp
      - timestamp:
          field: time
          layouts:
            - '2006-01-02T15:04:05Z07:00'
          test:
            - '2022-08-31T12:07:32+02:00'
      # '@timestamp' is set, so the original 'time' field is no longer needed
      - drop_fields:
          fields: [time]
What about mappings?
They can also be edited in Kibana. Go to Stack Management > Index Management > Index Templates and search for filebeat. By default Filebeat creates a data stream named filebeat-<version>, so at the time of writing that is filebeat-8.5.2. Click on it, then on "manage", and "edit" in the menu that will appear. Click "Next" until section 4, Mappings, and set the mappings as in step 5.
The last step is to create a data view: go to Stack Management > Data Views, then click on "Create data view", give it a name, and for the "Index pattern" add filebeat-8.5.2*. If you do not want to use the timestamp processor, you can change the timestamp field here. Click on "Save data view to Kibana".
Here is how it looks:
Head back to "Discover", select the new data view (filebeat-8.5.2
in our case) and you will be able to see all your data: