Naming convention for ingest pipelines etc

rsk0 · July 27, 2023, 9:40pm

Elastic-Provided Naming Convention

Is there a naming convention for ingest pipelines, index templates, component templates, or any other such configuration objects?

I see in the docs [1,2,3,4] there are examples of ingest pipeline names and index/component template names:

logs-my_app-default
my-pipeline
my-pipeline-id
logs-my_app-default
one-pipeline-to-rule-them-all
template_1
template_with_2_shards
logs-my_app-settings
logs-my_app-template

Of course I wouldn't take these as declaring a "convention", but I do take them as partially describing the range of possibilities.

As for characters, I take the examples as effectively stating alphanumerics and dashes and underscores are fair game. That's all I need, really.

As for format, like <data_type>-<name>-<more_detail>, it seems far-fetched to think that these examples dictate any kind of format. I don't see any explicit format conventions in the docs. Could someone confirm?

Proposed Local Convention - Legal

Will these names be accepted with the characters they use and, more importantly, with the format they're in?

I believe the characters are all legal. I'm assuming that the format makes no difference at all.
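
The character assumption can at least be checked locally before creating anything. Below is a minimal sketch that tests only the character set described above (alphanumerics, dashes, underscores). The function name `uses_safe_characters` is made up for illustration, and the check does not reproduce whatever validation Elasticsearch itself applies.

```python
import re

# Assumed character set taken from the doc examples: letters, digits,
# dashes and underscores. This is NOT Elasticsearch's own validation rule.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def uses_safe_characters(name: str) -> bool:
    """Return True if the name contains only letters, digits, '-' and '_'."""
    return SAFE_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in [
        "acme-index_template--all_logs",
        "acme-ingest_pipeline--compute_ingest_delay",
        "acme index template",  # contains spaces, so it fails the check
    ]:
        print(candidate, uses_safe_characters(candidate))
```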

Format: <company>-<config object type>--<name>

Examples:

acme-index_template--all_logs
acme-component_template--high_ingest_rate
acme-component_template--long_retention-high_replication
acme-ingest_pipeline--compute_ingest_delay

Proposed Local Convention - Sensible

Really, this is the heart of my post:

Does this format seem like a good idea?

Again, the format is <company>-<config object type>--<name>.

Here's what I'm trying to do with it:

acme- Distinguish it clearly from any Elastic-provided built-in configuration object (like index templates) or any potential third-party object. Never collide. (The existence of the easy-to-collide-with built-in template "logs" is what started me down the path of coming up with a regularized name format.)
-index_template- Make obvious the config object type. No guessing which things are index templates versus component templates.
--all_logs Make obvious the particular object purpose/name. I feel using a double dash here really visually sets it off and makes it clear. I'd love to hear your thoughts.
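
To make the format concrete, here is a minimal sketch of building and splitting such names. The helper names (`ConfigName`, `build_name`, `parse_name`) are hypothetical, and the parser assumes that neither the company nor the object type contains a dash, which holds for the examples above but is not enforced by Elasticsearch.

```python
from typing import NamedTuple


class ConfigName(NamedTuple):
    company: str
    object_type: str
    name: str


def build_name(company: str, object_type: str, name: str) -> str:
    """Join the three parts as <company>-<config object type>--<name>."""
    return f"{company}-{object_type}--{name}"


def parse_name(full_name: str) -> ConfigName:
    """Split a name of the form <company>-<config object type>--<name>.

    Assumption: company and object type contain no dashes, so the first
    "--" separates the prefix from the name, and the first "-" separates
    the company from the object type. The name part may contain dashes.
    """
    prefix, sep, name = full_name.partition("--")
    company, dash, object_type = prefix.partition("-")
    if not (sep and dash and company and object_type and name):
        raise ValueError(f"not in <company>-<type>--<name> form: {full_name!r}")
    return ConfigName(company, object_type, name)


if __name__ == "__main__":
    print(build_name("acme", "index_template", "all_logs"))
    print(parse_name("acme-component_template--long_retention-high_replication"))
```

The double dash is what makes the split unambiguous here: single dashes can still appear inside the name part without confusing the parser.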

rsk0 · July 28, 2023, 3:58pm

Do other Elasticsearch users have naming schemes they use for templates or pipelines?

rsk0 · July 31, 2023, 7:27pm

Do you have a naming scheme for any config items?

Like templates, users, roles, snapshot repos, ILM policies?

leandrojmp · July 31, 2023, 8:02pm

I don't think there is any convention, so you should use what best fits your use case.

But as an example, for templates, mappings and pipelines I prefer to name things around the source of the data.

If the source of data is firewall logs for Fortigate, I will have a fortigate-template.json, a fortigate-mappings.json and a fortigate-settings.json, which are the template, mappings and settings for the indices related to this data. The *-mappings.json and *-settings.json files are component templates.

I also have global component templates that start with base-*.json, like base-mappings.json, base-src-dst-mappings.json, etc.

I barely use ingest pipelines, as I prefer to use Logstash. Since I also use Kafka as a broker, I adopted the terms producer and consumer, and I use a custom folder naming scheme, something like:

/opt/logstash/pipelines/fortigate/producer/*.conf and /opt/logstash/pipelines/fortigate/consumer/*.conf. The producer receives data from the network devices and sends it to Kafka; the consumer consumes the data from Kafka, transforms it and sends it to Elasticsearch.

For the pipeline files I use a numeric naming convention to split them into multiple files and sort the filters in the order I need, like 000-*.conf for inputs, 100-*.conf for parse filters, 200-*.conf for enrich filters like translate, memcached, etc., and 999-*.conf for outputs.
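
As a rough illustration of why the zero-padded prefixes work, the sketch below sorts some file names as plain strings and maps each prefix to a stage. The file names are invented, and the ranges between the stated prefixes (for example, treating 101 to 199 as parse filters) are an assumption, not something stated above.

```python
def stage_for(filename: str) -> str:
    """Map a file like '100-parse-kv.conf' to a stage by its numeric prefix.

    Assumed ranges: below 100 inputs, 100-199 parse filters,
    200-998 enrich filters, 999 and above outputs.
    """
    prefix = int(filename.split("-", 1)[0])
    if prefix < 100:
        return "input"
    if prefix < 200:
        return "parse filter"
    if prefix < 999:
        return "enrich filter"
    return "output"


if __name__ == "__main__":
    # Hypothetical file names for one consumer pipeline directory.
    files = [
        "999-output-elasticsearch.conf",
        "200-enrich-translate.conf",
        "000-input-kafka.conf",
        "100-parse-kv.conf",
    ]
    # Zero-padded prefixes mean a plain string sort gives the intended order.
    for name in sorted(files):
        print(name, "->", stage_for(name))
```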

rsk0 · August 1, 2023, 5:07am

Oh, sorry, I don't mean to refer to file names, but rather to the names of configuration objects within Elasticsearch.

For example:

% curl $API/_ingest/pipeline | jq keys
[
  "compute-ingest-delay",
  "acme-ingest_pipeline--logs_general",
  "xpack_monitoring_6",
  "xpack_monitoring_7"
]

I think managing configuration files somehow is less problematic than "configuration objects" in the API. (You don't have to worry about conflicting with files that ES comes with, for example.)

Managing all the ingest pipelines, index templates, component templates, users, roles, etc. seems to be made a little easier by having them named clearly.
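
One thing a fixed prefix buys is easy filtering of a listing like the one above. Here is a minimal sketch, assuming the names have already been fetched from the API; `split_by_prefix` is a hypothetical helper and not part of any Elasticsearch client.

```python
from typing import Iterable, List, Tuple


def split_by_prefix(names: Iterable[str], prefix: str = "acme-") -> Tuple[List[str], List[str]]:
    """Separate names that carry the local prefix from everything else."""
    names = list(names)
    ours = sorted(n for n in names if n.startswith(prefix))
    others = sorted(n for n in names if not n.startswith(prefix))
    return ours, others


if __name__ == "__main__":
    # The pipeline names from the listing above.
    pipelines = [
        "compute-ingest-delay",
        "acme-ingest_pipeline--logs_general",
        "xpack_monitoring_6",
        "xpack_monitoring_7",
    ]
    ours, others = split_by_prefix(pipelines)
    print("ours:", ours)
    print("others:", others)
```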

leandrojmp · August 1, 2023, 12:37pm

There is not much difference; in the end, every configuration object is a JSON document, which could just be a file that you use when creating it.

In the previous example, the component templates for mappings and settings would also be named fortigate-mappings and fortigate-settings, and the template would just be named fortigate.

For ingest pipelines I would follow a similar approach, starting the name with the source of the data followed by a description of the pipeline. In my case I have a couple in which I also use the suffix -final-pipeline, which is also an index setting (index.final_pipeline), so for fortigate it would be something like fortigate-final-pipeline.
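
Put together, the source-based scheme amounts to deriving every object name from one source string. A minimal sketch follows; the function name and dictionary keys are invented for illustration.

```python
from typing import Dict


def names_for_source(source: str) -> Dict[str, str]:
    """Derive object names from the data source, as described above."""
    return {
        "index_template": source,
        "mappings_component_template": f"{source}-mappings",
        "settings_component_template": f"{source}-settings",
        "final_pipeline": f"{source}-final-pipeline",
    }


if __name__ == "__main__":
    for kind, name in names_for_source("fortigate").items():
        print(f"{kind}: {name}")
```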

For ILM policies I just use a clear name that says how long an index will stay in each phase, like hot-7-warm-30. This would mean that an index will be in the hot phase for 7 days and then 30 more days in the warm phase before being deleted.
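
A name like that is easy to generate and read back mechanically. Below is a minimal sketch, assuming the name always alternates phase and number of days; both helper names are hypothetical, and nothing here talks to the ILM API.

```python
from typing import Dict


def policy_name(days_per_phase: Dict[str, int]) -> str:
    """Build a name like 'hot-7-warm-30' from phases in insertion order."""
    return "-".join(f"{phase}-{days}" for phase, days in days_per_phase.items())


def parse_policy_name(name: str) -> Dict[str, int]:
    """Read 'hot-7-warm-30' back into {'hot': 7, 'warm': 30}."""
    parts = name.split("-")
    if len(parts) % 2 != 0:
        raise ValueError(f"expected alternating phase/days pairs: {name!r}")
    # Even positions are phase names, odd positions are day counts.
    return {phase: int(days) for phase, days in zip(parts[0::2], parts[1::2])}


if __name__ == "__main__":
    print(policy_name({"hot": 7, "warm": 30}))
    print(parse_policy_name("hot-7-warm-30"))
```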

But as I said, there is no naming convention. The closest you get is the naming scheme for data streams, which is the one that Elastic uses with the Elastic Agent.

For example, for an Elastic Agent collecting audit logs from Google Workspace, you would have the following objects:

ingest pipelines starting with logs-google_workspace.*
index and component templates starting with logs-google_workspace.*

If you avoid names that start with logs-*, metrics-*, synthetics-* and profiling-*, you can choose any naming scheme you want.
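
That rule of thumb is simple enough to check automatically. Here is a minimal sketch covering only the four prefixes listed above; the helper name is hypothetical, and the list is not claimed to be a complete inventory of names Elastic reserves.

```python
# Prefixes mentioned above as used by Elastic's data stream naming scheme.
RESERVED_PREFIXES = ("logs-", "metrics-", "synthetics-", "profiling-")


def collides_with_reserved_prefix(name: str) -> bool:
    """Return True if the name starts with one of the listed prefixes."""
    # str.startswith accepts a tuple and checks each prefix in turn.
    return name.startswith(RESERVED_PREFIXES)


if __name__ == "__main__":
    for candidate in ["logs-my_app-default", "acme-index_template--all_logs"]:
        print(candidate, collides_with_reserved_prefix(candidate))
```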

rsk0 · August 1, 2023, 3:32pm

Got it. Then you've answered my question. There is no convention.

And you've provided some good additional details on the methods you use to name your config objects / files. I could probably add your methods into my scheme, even.

And you've provided a good reference to a related topic, the naming scheme for data streams, yet another example of name scheme development, even if not specifically for ES configuration objects.

Thank you so much for your time and expertise.
