Dealing with log messages about source _and_ target

rsk0 · September 7, 2022, 2:28pm

Okay, so…
We’ve got a field that could be used to indicate whether a whole message is related to client or server activity, whether communication is inbound or outbound: network.direction
But in some situations we’re logging about activity that we’re doing both as client and server. Imagine logs for a proxy, for example.
I’m sure there’s no way to indicate per field whether it’s inbound or outbound, and I can’t imagine how that would even work, but that’s something I’m realizing would be really useful.
Any thoughts?

(Elastic Stack Community thread re this topic)

leandrojmp · September 7, 2022, 3:52pm

Can you provide more context and give some examples of your use case? What is the source of the log?

How you will populate the network.direction field depends on your use case, the monitoring context and what you want to see in your logs.

For example, I chose to classify the network direction based on the source and destination addresses, so we use inbound, outbound, internal and external.

My monitoring context is the private network of the company. Anything from an external address to a private IP is considered inbound, anything from an internal address to a public IP is considered outbound, any communication between private addresses is considered internal, and anything that passes through our network is considered external.
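The rule above could be sketched roughly like this in Python. The function name `classify_direction` is hypothetical, and the sketch assumes that "internal" simply means whatever the standard `ipaddress` module reports as private; a real deployment would more likely test against its own network ranges.

```python
import ipaddress


def classify_direction(source_ip: str, destination_ip: str) -> str:
    """Pick a network.direction value from the point of view of a private network.

    Assumption: "private" is whatever ipaddress considers private.
    """
    src_private = ipaddress.ip_address(source_ip).is_private
    dst_private = ipaddress.ip_address(destination_ip).is_private

    if src_private and dst_private:
        return "internal"   # private -> private
    if dst_private:
        return "inbound"    # public -> private
    if src_private:
        return "outbound"   # private -> public
    return "external"       # public -> public, only passing through
```

For instance, `classify_direction("8.8.8.8", "10.0.0.5")` gives `"inbound"` under these assumptions.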

rsk0 · September 7, 2022, 7:52pm

Let's take the case of user.id. In the context of a user management tool, user.id could refer to either the account accessing the tool or the customer account being worked on by the employee.

Or let's take the case of a microservice receiving a request from an external customer. In the processing of that request it relays its own request to an authentication service. When it logs a message here it has two effective contexts, one where it is the server, one where it is the client. We might want to record information about each context.

leandrojmp · September 7, 2022, 9:35pm

In this case, it would help if those were different fields. ECS even has a suggestion for cases like this: you could use user.target.id for the id of the user being worked on.
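As a rough illustration, a document for the user management tool might be built like this. ECS nests the target user under `user.target.*`, so the id of the account being worked on lands in `user.target.id`. The function name and the example ids are made up for this sketch.

```python
import json


def user_management_event(actor_id: str, target_id: str, action: str) -> dict:
    """Build a nested ECS-style document with both the acting and the targeted user."""
    return {
        "event": {"action": action},
        "user": {
            "id": actor_id,               # the employee using the tool
            "target": {"id": target_id},  # the customer account being worked on
        },
    }


# Hypothetical ids, for illustration only.
print(json.dumps(user_management_event("employee-17", "customer-42", "reset-password")))
```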

Or you could rely on one of the categorization fields or other fields in the document.

I see this differently. You have a microservice that receives external requests, so the traffic direction is relative to this microservice. Whether it needs to validate the request with another service before allowing it should not matter for the value of network.direction: it is still an external request to your service, which could be classified as inbound. The service will log the outcome of the external request and, depending on its logic, it may also write other logs related to this request; you may also consume the logs of the other service.

As I said, it depends more on what you want to see in your logs. The ECS fields and values are more of a guide on how you can populate them, but you could use anything if it helps you better understand what is happening.

rsk0 · September 14, 2022, 8:04pm

I'm taking as a premise here that the context of a program in some particular spot could be that the program is behaving as both a server and a client. Also, I'm assuming that it's useful to be able to log a (single) message about both these aspects.

I'm reading your second to last paragraph as saying it is not useful to log about both these things at the same time. Possibly even that a service should only log about the requests it serves, rather than the ones it makes? I think I may be reading wrong.

In any case, you do bring up how we can disambiguate users in our context by making use of user.target.*. This seems like exactly the kind of thing that might extend to disambiguating fields generally, and thus could be the solution. I'm going to read more about this. So far it looks like .target.* is only made available on specific field sets (user and service). That probably covers nearly all cases, but my instinct tells me there are likely others. Maybe we need generic sub-fields like target/effective or source/destination or origin/target or client/server that we could apply to virtually any field set?

The other alternative I was considering was using labels, as suggested by someone in the Elastic Slack. You could say something like {"labels":{"context:url.path":"client"}} to indicate that this message's url.path refers to the client part of your context. This conveys the semantics of your usage to someone reading the message, and to specialized programs that will process your message, but something like Kibana wouldn't make use of it.

leandrojmp · September 15, 2022, 1:12pm

The service should log anything you want or find useful, but I think that logging two completely different events in the same log message can make things more confusing.

For example, you have a client with IP A that makes a request to your service, which has the IP B. Your service then needs to make a request to an external service with IP C.

What will your source IP and destination IP be in this case? Are you going to consider IP B a source IP as well? What will your log line look like? The source and destination IP could be arrays, so you could build something like source.ip: [A, B] and destination.ip: [B, C] and classify your network direction or context based on the source and destination, but this would be much easier if you had different log lines.
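To make the "different log lines" option concrete, here is a minimal sketch that emits one event per connection. The function name is hypothetical, and the addresses in the usage line are placeholders standing in for A, B and C.

```python
def events_for_relayed_request(client_ip: str, service_ip: str, upstream_ip: str) -> list:
    """Return two separate events: A -> B (served) and B -> C (made by the service)."""
    served = {
        "source": {"ip": client_ip},
        "destination": {"ip": service_ip},
        "network": {"direction": "inbound"},
    }
    relayed = {
        "source": {"ip": service_ip},
        "destination": {"ip": upstream_ip},
        "network": {"direction": "outbound"},
    }
    return [served, relayed]


# Placeholder addresses for A, B and C.
for event in events_for_relayed_request("203.0.113.10", "10.0.0.20", "198.51.100.30"):
    print(event)
```

Each event then has exactly one source and one destination, so nothing about its direction is ambiguous.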

Can you give an example of your log in this case?

You can create the .target.* fields under other field sets as well. ECS is a guide on how to name and map your fields; if you follow the ECS naming convention, the pre-built dashboards and Kibana apps will populate pretty easily. But ECS is also limited to the use cases that Elastic has experienced. There are a lot of kinds of logs and use cases that have no equivalent ECS field, so you can create the fields yourself and suggest that Elastic implement them in a future version.

rsk0 · September 21, 2022, 4:26am

From the context of the logging application, A's request has a source of A and destination B, and the logging application's subsequent request has a source of B and a destination of C. If you are in the middle of serving A while you contact C, then both of these events are happening simultaneously.

You could consider them distinct events, and if you need to log something, use only one event or the other for any given message. I agree that this makes reading the messages much easier.

But maybe sometimes you want to share information about both events in a single message. My instinct was to say that this was too hard so don't do it, but I feel it's important to explore.

If user A is contacting your service B at path /inventory/delete, and you, service B, are contacting the auth service C at path /auth, you could log a message like:

{
  "url.path": "/auth",
  "http.response.status_code": 403,
  "labels": {
    "context:url.path":"client",
    "context:http.response.status_code":"server"
  }
}

This message means you (service B) gave your requester (user A) a 403 as a result of your querying the auth service (C) at path /auth. Or, "In the context associated with the url.path field, I am the client" and "In the context associated with the http.response.status_code field, I am the server."

For the scenario with IP addresses:

{
  "source.ip": "A",
  "destination.ip": "C",
  "labels": {
    "context:source.ip":"server",
    "context:destination.ip":"client"
  }
}

This means you are acting as a server with regard to the information in the source.ip field: the source.ip is your client. And you are acting as a client with regard to the destination.ip: the destination.ip is the server you're talking to.
So in this example,

    "context:source.ip":"server"

means that "in the context associated with the source.ip field, I am the server", and thus the source of the query in that context is the user at IP A.
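A specialized consumer could recover these roles with something like the following sketch. The `context:` prefix is just the convention used in this thread, not anything ECS defines, and `field_roles` is a made-up name.

```python
CONTEXT_PREFIX = "context:"  # convention from this thread, not part of ECS


def field_roles(message: dict) -> dict:
    """Map each annotated field name to the role ("client" or "server") the logger had."""
    labels = message.get("labels", {})
    return {
        key[len(CONTEXT_PREFIX):]: role
        for key, role in labels.items()
        if key.startswith(CONTEXT_PREFIX)
    }


message = {
    "source.ip": "A",
    "destination.ip": "C",
    "labels": {
        "context:source.ip": "server",
        "context:destination.ip": "client",
    },
}
print(field_roles(message))
```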

Another possibility is that you could list field names in a tags-like array:

{
  "context": {
    "server": ["source.ip", "http.response.status_code"],
    "client": ["destination.ip", "url.path"]
  }
}
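The two encodings carry the same information, so one can be derived from the other. A small sketch of going from the labels form to the grouped form, again assuming the `context:` prefix convention from this thread and a hypothetical function name:

```python
def labels_to_context(labels: dict) -> dict:
    """Group field names by role, turning "context:<field>" labels into role -> [fields]."""
    prefix = "context:"
    context = {}
    for key, role in labels.items():
        if key.startswith(prefix):
            context.setdefault(role, []).append(key[len(prefix):])
    return context


labels = {
    "context:source.ip": "server",
    "context:http.response.status_code": "server",
    "context:destination.ip": "client",
    "context:url.path": "client",
}
print(labels_to_context(labels))
```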

I think it's possible to encode multi-context information in a single log message. It's a lot of work and added complexity. I think you're right that it's probably best, generally, to just log two separate messages. But if you find yourself needing to do an aggregation based on something like status codes your service returns versus the path contacted on a downstream service, and thus need to create messages with those fields representing different contexts, there are at least these ways of doing it without leaving the fields ambiguous.
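For what it's worth, that kind of aggregation could be sketched in plain Python like this, outside of Elasticsearch. It assumes flat dotted keys and the `context:` labels exactly as in my example message; the function name is made up.

```python
from collections import Counter


def status_by_downstream_path(messages: list) -> Counter:
    """Count (downstream path, returned status code) pairs.

    Only messages whose labels say url.path is from the client context and
    http.response.status_code is from the server context are counted.
    """
    counts = Counter()
    for msg in messages:
        labels = msg.get("labels", {})
        if (
            labels.get("context:url.path") == "client"
            and labels.get("context:http.response.status_code") == "server"
        ):
            counts[(msg["url.path"], msg["http.response.status_code"])] += 1
    return counts
```

Messages without both labels are skipped rather than guessed at, which is the point of not leaving the fields ambiguous.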

system · October 19, 2022, 4:27am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
