Stability maintenance release

Hi,

Following the development of Kibana since version 6.8.x, I have noticed increasing instability and unreliability with every new 7.x version.

Considering the more than 1000 open issues tagged as bugs, I ask myself whether the Kibana 7.x versions are really production ready.

Maybe a stability maintenance release for Kibana is needed instead of introducing new features.

Regards
Ron70

First and foremost, it sucks that you've been experiencing unreliability with Kibana. I see you've posted topics here and filed bugs in the past, and thanks for that! We really do appreciate your contributions on that front as it helps us make Kibana better for everyone.

Features are changes, and changes are a form of instability. They're also complex, and added complexity is always an opportunity for unexpected behaviors (bugs). With software like Kibana, stability is a balancing act - on one hand we want to minimize change to support stability, and on the other hand we want to enhance the product in the ways that help people.

To that end, we support two versions of Kibana today:

  • 6.8 exists for folks that require a highly stable environment above all else
  • 7.x exists for folks that want the newest production quality features available

If folks want to live on the bleeding edge, they can always take our nightly snapshots, but these aren't production quality.

You're right that a ton has changed between 6.8 and 7.3 (latest at the time of this writing). We've added a ton of new apps, new backend APIs, tools, and enhancements to existing apps. In technical terms, Kibana's codebase is 50% larger.

While 1000 open bugs seems like a ton (seriously, it's a daunting number for us), it's not 50% more bugs than we had when we released 6.8. If we were measuring overall stability by open bugs per feature, then 7.3 is one of the most stable releases we've ever had.
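
To make that arithmetic concrete, here is a rough sketch of the "bugs relative to size" comparison. The 6.8-era bug count below is a made-up placeholder, not an actual figure from the issue tracker, and relative codebase size stands in as a proxy for "per feature"; only the roughly 1000 open bugs and the 50% growth come from this thread.

```python
def bugs_per_unit(open_bugs: int, codebase_size: float) -> float:
    """Open bugs normalized by relative codebase size (a rough density metric)."""
    if codebase_size <= 0:
        raise ValueError("codebase_size must be positive")
    return open_bugs / codebase_size


# Hypothetical baseline: 800 is an illustrative placeholder, not a real count.
old_density = bugs_per_unit(open_bugs=800, codebase_size=1.0)

# About 1000 open bugs against a codebase that is 50% larger.
new_density = bugs_per_unit(open_bugs=1000, codebase_size=1.5)

# The raw count went up, but the density went down.
print(f"old density: {old_density:.0f}, new density: {new_density:.0f}")
```

With these placeholder inputs the raw bug count rises by 25% while the density falls, which is the shape of the argument above: bug counts that grow more slowly than the codebase imply fewer bugs per unit of product.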

Also, new features alone are not the only driver for more bug reports. Kibana has never had as many users nor as many contributors as it does today. It has never been used for as many unique use cases nor for as many mission-critical purposes as it is today. These things contribute to the number of bugs filed because they increase the likelihood of an impactful bug being encountered. By the way, the same is true for non-bug issues as well, and if you compare our ratio of non-bug issues to bug issues, you'll see a decrease in the percentage of bug issues since we released 6.8.

To be honest, even if the numbers might paint a better-than-expected picture, I don't think issue counts are a very valuable measurement here.

For me, as a production user of Kibana and a developer responsible for helping it stay that way, I want to know what steps are being taken to ensure the new features that get added are not going to make Kibana less reliable. Two efforts in particular stand out to me:

  • We are investing heavily in automated testing. We're writing more tests. We're writing better tests. We're running our tests on more operating systems and in more browsers. The increase in new test code between 6.8 and 7.3 greatly outpaces new feature source code, and every push to the repo now runs hundreds of thousands of lines of tests over a dozen hours or more (thank goodness for parallelization!). We've delayed exciting features that we felt didn't have thorough enough automated tests.
  • We're identifying and replacing legacy implementations that are fragile and hard to test. You can see this throughout our code for highly-visible features like graph, dashboard, and visualizations, but you can also see it in less visible systems like our core server capabilities and plugin service.

We know there are gaps, and a substantial number of the people working on Kibana are focusing all of their attention on closing those gaps and ensuring Kibana can be as stable and reliable as we can possibly make it. Not shipping features to people in the name of maximum stability is exactly what we're already doing for 6.8, so I don't think that's the right approach for 7.x. Instead, we must continue the efforts that are underway to scale Kibana development in a reliable way.

And of course we must also keep our eye on that bug list and fix things as quickly as we can. That's our source of truth here, and as long as people are reporting bugs we'll work tirelessly to squash them. Maybe not as fast as you report 'em, but as fast as we can knowing any bug might be having a huge impact on someone in production.


Thanks for this in-depth answer and explanation.

I wasn’t aware of the fact that:

  • 6.8 exists for folks that require a highly stable environment above all else
  • 7.x exists for folks that want the newest production quality features available

But thank you for pointing this out.
