Pipelines unintentionally become non_running in Logstash management

I add pipeline configurations from Kibana using Logstash management. This had been working fine, but after I added one more pipeline, the newly added pipeline unexpectedly ended up non-running.

The environment is as follows:

  • Elasticsearch 7.15.2 on a cluster of 5 nodes
  • Logstash 8.7.0 using JDBC plugin to ingest data from a database into Elasticsearch
  • All running on RHEL7
  • Licensed

I set up a local environment with the same versions and tested it, but I could not reproduce the issue there.

2024-06-07 12:30:31 [2024-06-07T03:30:31,106][INFO ][logstash.agent           ] Pipelines running {:count=>29, :running_pipelines=>[:".monitoring-logstash", :test01, :test02, :test03, :test04, :test05, :test06, :test07, :test08, :test09, :test10, :test11, :test12, :test13, :test14, :test15, :test16, :test17, :test18, :test19, :test20, :test21, :test22, :test23, :test24, :test26, :test27, :test28, :test29], :non_running_pipelines=>[]}

(The problem is that pipelines other than the intended one are also being moved to "non_running_pipelines".)

Here is what I have tried so far:
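To track which pipelines drop out after each change, the "Pipelines running" log line can be parsed and compared against the IDs you expect. The following is only a sketch: the helper names (`parse_pipeline_status`, `missing_pipelines`) and the shortened sample line are made up for illustration, and the regular expression assumes the log format shown above. In the log line quoted above, for example, test25 appears in neither list.

```python
import re


def parse_pipeline_status(line: str) -> dict:
    """Extract pipeline IDs from a Logstash 'Pipelines running' log line."""

    def ids(key: str) -> list:
        # Capture the bracketed list that follows e.g. ':running_pipelines=>'
        match = re.search(rf":{key}=>\[(.*?)\]", line)
        if not match:
            return []
        # Entries look like :test01 or :".monitoring-logstash"
        return [
            item.strip().lstrip(":").strip('"')
            for item in match.group(1).split(",")
            if item.strip()
        ]

    return {
        "running": ids("running_pipelines"),
        "non_running": ids("non_running_pipelines"),
    }


def missing_pipelines(line: str, expected: list) -> list:
    """Return expected IDs that appear in neither list of the log line."""
    status = parse_pipeline_status(line)
    seen = set(status["running"]) | set(status["non_running"])
    return sorted(set(expected) - seen)


if __name__ == "__main__":
    # Hypothetical, shortened log line in the same format as the one above
    sample = (
        "[INFO ][logstash.agent] Pipelines running {:count=>3, "
        ':running_pipelines=>[:".monitoring-logstash", :test01, :test03], '
        ":non_running_pipelines=>[]}"
    )
    print(missing_pipelines(sample, ["test01", "test02", "test03"]))
```

Running this against the log line emitted after each pipeline change should make it easier to see which ID disappeared.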

  • Copied the non_running pipeline and registered it with a different ID.
    • Another pipeline moved to non_running.
  • Added a simple pipeline (which just retrieves the date and outputs it to standard output) with the ID test.
    • Similarly, another previously running pipeline moved to non_running.
  • Made a non-impacting edit to the pipeline that moved to non_running and saved it.
    • It became running again, but another pipeline moved to non_running.
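One way to confirm which pipeline dropped out after each of these changes, independently of the log, is to ask the Logstash monitoring API which pipelines it has loaded. This is a rough sketch, assuming the API is reachable on the default `http://localhost:9600` and that `GET /_node/pipelines` returns a JSON object with a `pipelines` map keyed by pipeline ID; the function names are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_node_pipelines(base_url: str = "http://localhost:9600") -> dict:
    """Fetch the node's pipeline info (assumes the default API address)."""
    with urlopen(f"{base_url}/_node/pipelines", timeout=10) as resp:
        return json.load(resp)


def loaded_pipeline_ids(payload: dict) -> list:
    """Return the sorted pipeline IDs found under the 'pipelines' key."""
    return sorted(payload.get("pipelines", {}).keys())


def diff_pipelines(before: list, after: list) -> dict:
    """Compare two snapshots of pipeline IDs."""
    return {
        "added": sorted(set(after) - set(before)),
        "dropped": sorted(set(before) - set(after)),
    }


if __name__ == "__main__":
    before = loaded_pipeline_ids(fetch_node_pipelines())
    input("Apply the pipeline change in Kibana, wait for the reload, then press Enter...")
    after = loaded_pipeline_ids(fetch_node_pipelines())
    print(diff_pipelines(before, after))
```

Taking a snapshot before and after each edit shows whether a pipeline other than the one you touched was dropped.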

Although I failed to reproduce the issue, I have created a minimal environment with the same versions as the target environment and uploaded it to GitHub. Here is the link:
shibadog/sample-logstash-management (github.com)

Based on the testing so far, it seems likely that we are hitting some kind of limit related to resources or configurations.

Does anyone have insights on what might be causing this? We are looking for potential candidates for investigation.

Thank you for your assistance.

Welcome!

Mixing major versions is probably the cause here.

Upgrade everything to 8.14 and you should be in a much better position.

Thank you for your response.

I see your point.

We are currently considering an upgrade, but we cannot do it immediately. In the meantime, I am thinking of downgrading Logstash for testing.

If I manage to reproduce the issue, I will follow up here.

While you are at it, upgrade to 7.17.latest. There are plenty of bug and security fixes...

I see... I'll do my best. :’(

What do you have in the Logstash logs when this happens?

Thank you for your response.

No logs specific to this are being generated.

The closest thing I can point to is the following log output, which includes pipelines that were not added to non_running_pipelines.

I have also seen messages like the following.

Metric registration error: `input_throughput` could not be registered in namespace `[:stats, :pipelines, :{pipelines that were excluded}, :flow]`

All of them are INFO level logs, so I believe they indicate normal operation.

For troubleshooting purposes, I installed version 7.15.2, which is the same version as Elasticsearch, and deployed the same pipelines.

As a result, the issue did not occur! :)

As you pointed out, it seems to be an issue caused by version differences outside of the supported range... I will check the support list and update to the appropriate upper limit version (in this case, 7.17.latest) first.

Support Matrix | Elastic
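As a quick sanity check before consulting the support matrix, the major versions of the two products can be compared. This is only a sketch with a made-up helper name; it flags a major-version mismatch like the one in this thread and does not encode the actual support matrix, which remains the authoritative source.

```python
def major_versions_differ(elasticsearch_version: str, logstash_version: str) -> bool:
    """Return True when the two version strings have different major versions."""
    es_major = int(elasticsearch_version.split(".")[0])
    ls_major = int(logstash_version.split(".")[0])
    return es_major != ls_major


if __name__ == "__main__":
    # Versions from the environment described in this thread
    print(major_versions_differ("7.15.2", "8.7.0"))
    # Versions used in the test where the issue did not occur
    print(major_versions_differ("7.15.2", "7.15.2"))
```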

However, I am curious about the underlying reasoning behind this issue...