A good place for "state of being a canary" field

Where do you think is a good place to indicate that a log message is from a canary?

orchestration.* doesn't seem appropriate.

Maybe something in the upcoming node field set? (Where do I find information about that?)

I’m imagining … maybe node.deployment_role?

Or maybe we need a deployment field set?

{
  "deployment": {
    "canary": true,
    "strategy": "blue/green",
    "step": "package download"
  }
}

Hi @rsk0, I pinged some internal ECS (Elastic Common Schema) folks. Let's see if anyone responds...

Seems like a good / interesting thought.

Hello @rsk0 !

I'd suggest the existing field service.environment. Per the docs, I think it captures the essence of what you're trying to achieve.

Thanks for the feedback, Kylie!

service.environment:

If the same service runs in different environments (production, staging, QA, development, etc.), the environment can identify other instances of the same service. Can also group services and applications from the same environment.

Hm. What I'm thinking about is a kind of deployment that happens within an "environment". Imagine you want to deploy a new version of your service to your production environment. Rather than completely replacing all instances at once, you deploy just one instance or a handful, confirm it's working, then proceed to replace the remainder. The old deployment and the current one both exist in the "production" environment. (Here's a write-up: "What is canary deployment?")

(There does happen to be another deployment strategy, "blue green deployments", that one might describe as having separate environments, like "production blue" and "production green". (I mixed the two strategies in my example JSON earlier.))

Being able to filter on canary status would be immensely useful. There are probably a couple other facts of deployment that would be useful to log.

I tried finding an appropriate field by going through all the field sets in the ECS reference, but I couldn't find any real matches. The most related set seemed to be service, but the semantics of that lead me to think that values in that set should be referring to the service as a whole, rather than the deployment status of a single instance of the service. service.state is very vaguely described, so it seems a possibility, but again doesn't seem appropriate for node-specific info.

There are mentions of a coming node field set, with reference to "node roles", but "role" here, with the examples of "master", "data", "background task", and "ui", does not seem to fit. A node.state might work, but I think node.state is probably better left to other uses (if not meant to be an array).

My best guess for the most appropriate place, unless an entire deployment-related field set is needed, would be node.deployment_role, with expected values like "normal" and "canary".
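To make the idea concrete, here is a minimal sketch of a service instance stamping its own log lines with that field, using Python's standard logging module. Note that node.deployment_role is only the name proposed in this thread, not an existing ECS field, and the class and helper names are made up for illustration.

```python
import json
import logging


class DeploymentRoleFormatter(logging.Formatter):
    """Render each record as JSON carrying a hypothetical node.deployment_role."""

    def __init__(self, deployment_role: str) -> None:
        super().__init__()
        self.deployment_role = deployment_role

    def format(self, record: logging.LogRecord) -> str:
        event = {
            "message": record.getMessage(),
            "log": {"level": record.levelname.lower()},
            # Assumption: this field is the proposal from this thread, not ECS.
            "node": {"deployment_role": self.deployment_role},
        }
        return json.dumps(event)


def render(message: str, deployment_role: str) -> str:
    """Format one INFO message the way the handler would emit it."""
    record = logging.LogRecord(
        "svc", logging.INFO, "example.py", 0, message, None, None
    )
    return DeploymentRoleFormatter(deployment_role).format(record)


if __name__ == "__main__":
    print(render("service started", "canary"))
```

With something like this in place, filtering on canary status downstream would be a simple term match on the one field.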

More thoughts on deployment-related information: I've figured out a number of pieces of deployment-related data that would be good to capture:

  • deployment
    • role (canary | regular (or normal) | green | ...)
    • strategy (canary | blue-green | simple | ...)
    • phase (scheduled | awaiting validation | validating | ...)
    • ID

How do you like these?
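As a rough sketch of how those four pieces might look when attached to an event, here is a small Python data class. The field names simply mirror the list above; none of them exist in ECS today, and the example values are only illustrations.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical deployment field set; every field is optional."""

    role: Optional[str] = None      # e.g. "canary", "regular", "green"
    strategy: Optional[str] = None  # e.g. "canary", "blue-green", "simple"
    phase: Optional[str] = None     # e.g. "scheduled", "validating"
    id: Optional[str] = None        # identifier of the deployment itself

    def to_event_fields(self) -> dict:
        """Return a nested dict holding only the fields that are known."""
        known = {k: v for k, v in asdict(self).items() if v is not None}
        return {"deployment": known} if known else {}


if __name__ == "__main__":
    print(Deployment(role="canary", strategy="canary").to_event_fields())
```

Leaving unknown fields out entirely, rather than logging them as null, keeps the event small when an instance only knows its own role.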

what can practically get logged

Service instances might not commonly have access to data like the phase of their deployment, and thus might not usually be able to log that, but hopefully can often know if they are canaries (or blues/greens, or whatever). This is very useful for observing deployments.

(Maybe these fields can get populated as the logs go through the pipeline and hit enrichment processors?)
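A sketch of what such an enrichment step could do, written as a plain Python function rather than any particular pipeline's processor. It assumes the pipeline has a lookup table from instance ID to role (hypothetical, supplied by whatever runs the deployment) and that events carry the instance ID in host.id. The deployment.role field is the one proposed in this thread.

```python
from typing import Mapping


def enrich_with_deployment(event: Mapping, roles_by_instance: Mapping[str, str]) -> dict:
    """Return a copy of event with deployment.role filled in when it is known.

    A role already logged by the instance itself is left untouched.
    """
    enriched = dict(event)
    instance_id = event.get("host", {}).get("id")
    role = roles_by_instance.get(instance_id)
    if role is not None:
        # Copy the nested dict so the caller's event is not mutated.
        deployment = dict(enriched.get("deployment", {}))
        deployment.setdefault("role", role)
        enriched["deployment"] = deployment
    return enriched


if __name__ == "__main__":
    roles = {"i-1": "canary"}  # hypothetical lookup table
    print(enrich_with_deployment({"message": "hello", "host": {"id": "i-1"}}, roles))
```

Events from instances that are not in the table pass through unchanged, so the step is safe to run on every log line.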

"canary" just means "test instance"

Maybe there's a more general term for the role an instance is serving when it's a canary or a, say, blue deployment? In all deployment strategies maybe what matters is whether an instance is part of the testing group? In "canary" strategy deployments we call these "canaries". In blue-green deployments we call them "blue" or "green" depending. In other deployments they may have other names. Aren't these all serving the same purpose? Maybe we need a single term to refer to them all? So now I'm thinking one of these two possibilities:

  • deployment
    • test_instance: true

or

  • deployment
    • role: test_instance
