I'd like to hear if there are any suggestions in the ECS framework on how to provide app-specific fields. Do you use your own namespace for that, or just put it in the global namespace? Any recommendations for field names and so on?
Our current logging metadata is quite messy, and I'd like to give it a bit more structure. Most applications just pick a field name (seemingly at random) and fill it, hoping that no other application happens to use the same name with a different field type.
In our team, we try to align our fields with the ECS format as far as possible and try to avoid custom namespaces. We name our fields according to the following priority list:
1. The field is defined in ECS: use the ECS field name (e.g. service.name).
2. The field exists in Beats/APM: use the Beats/APM name (e.g. service.environment is used in APM but not in ECS).
3. Similar fields/groups exist in ECS: follow their naming conventions (e.g. there is a field called http.request.body.content which stores the HTTP request body, so we added our custom field as http.request.body.raw).
4. No similar field/group exists and it is a single value: put it directly into the root.
5. No similar field/group exists and it is a group of values: create a custom namespace.
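The priority list above could be sketched as a small lookup, in case it helps to turn the wiki rules into something checkable. This is only a rough illustration: `ECS_FIELDS` and `BEATS_APM_FIELDS` are tiny hypothetical excerpts rather than the real field lists, `naming_rule` is a made-up helper, and "similar fields exist in ECS" is approximated as "some ECS field shares the same parent path", which is a simplification of what is really a human judgment.

```python
# Hypothetical, hand-picked excerpts. A real check would load the full
# ECS and Beats/APM field definitions for the version you run.
ECS_FIELDS = {"service.name", "http.request.body.content"}
BEATS_APM_FIELDS = {"service.environment"}


def naming_rule(name, is_group=False):
    """Return which rule of the priority list applies to a proposed field name."""
    # Rule 1: the field is defined in ECS.
    if name in ECS_FIELDS:
        return "ecs"
    # Rule 2: the field exists in Beats/APM.
    if name in BEATS_APM_FIELDS:
        return "beats_apm"
    # Rule 3 (simplified): an ECS field lives under the same parent path.
    parent = name.rpartition(".")[0]
    if parent and any(field.startswith(parent + ".") for field in ECS_FIELDS):
        return "ecs_sibling"
    # Rule 5: nothing similar exists and it is a group of values.
    if is_group:
        return "custom_namespace"
    # Rule 4: nothing similar exists and it is a single value.
    return "root"
```

For example, `naming_rule("http.request.body.raw")` falls under rule 3 because `http.request.body.content` shares its parent path, while a standalone `widget_id` ends up at the root.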
We started with Elastic 6.5, when the ECS schema did not exist yet, so we got the main application developers together and developed a shared schema, which is stored on a wiki page. All applications align with this schema. After migrating to 7.x we updated our schema to align with ECS, and we update it whenever we require new fields or a new ECS version is released.
For field names, I found the following passage from the ECS documentation helpful:
Adding a single custom field at the root of your event can lead to problems if you pick a concept we eventually add to ECS. For example you add custom field { "proxy": "some IP", ... }; and the next version of ECS adds a field set called proxy, then you have a leaf field and an object field of the same name to deal with in your upgrade. In the best case scenario you need to reindex old data to use new field names, if you're not careful and you update the index template and don't notice this, you may get mapping conflicts.
If instead you nest all custom fields under a proper name, you avoid all of these potential problems.
{ "mycompany": {"proxy": "some IP"}, ... }
But if the fields you add are not likely to be added to ECS, then the risk of problems is accordingly low. There's no problem with an arbitrary custom field, even at the root, like { "widget_id": "21bbeef" }.
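To make the quoted advice a bit more concrete, here is a minimal sketch of both sides: spotting a root-level leaf field that collides with a field set name, and nesting custom fields under a namespace instead. The helper names are made up for illustration, `mycompany` is the placeholder namespace from the quote, and the set of field set names passed in is an assumption (in particular, `proxy` is the hypothetical future field set from the quoted example, not a claim about current ECS).

```python
def find_root_conflicts(event, field_set_names):
    """Return root keys that hold a leaf value where the schema expects an object."""
    return sorted(
        key
        for key, value in event.items()
        if key in field_set_names and not isinstance(value, dict)
    )


def with_custom_fields(event, custom, namespace="mycompany"):
    """Return a copy of the event with custom fields nested under one namespace."""
    merged = dict(event)
    # Keep anything already stored under the namespace and add the new fields.
    merged[namespace] = {**merged.get(namespace, {}), **custom}
    return merged


# Assumed field set names; "proxy" stands in for a set added by a later ECS version.
field_sets = {"proxy", "service"}

# A custom leaf field at the root collides once "proxy" becomes a field set.
risky_event = {"proxy": "some IP", "service": {"name": "checkout"}}
conflicts = find_root_conflicts(risky_event, field_sets)

# The same value nested under the namespace no longer collides.
safe_event = with_custom_fields({"service": {"name": "checkout"}}, {"proxy": "some IP"})
no_conflicts = find_root_conflicts(safe_event, field_sets)
```

Running a check like `find_root_conflicts` against sample events before updating an index template is one way to notice the leaf-versus-object clash early, though it does not replace reviewing the mappings themselves.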
I had completely missed the Custom Fields link while searching for guidance on this. I guess I was searching for the wrong terms, such as "payload", "data", "app-specific" and so on. Hopefully this thread gets indexed and I'll find it again in the future.
Our approach has been similar to Wolfram's: we try to squeeze the data into a vaguely related ECS field, mostly because those fields have clear names. Starting a company wiki that documents the fields in use sounds like an excellent next step.