Progress for Logstash contributions

It seems progress on outside pull requests to the Logstash Git repositories is very slow. Or am I looking in the wrong place? If not, what's the plan to improve the situation? I have a laundry list of things to look into, but when not even documentation cleanups get any kind of attention, I'm likely to turn my attention elsewhere.


We're working on the LS docs at the moment; expect some ongoing changes and improvements there :slight_smile:

Let me ask the LS team to comment on the other part though!

My favorite slogan is "If a user has a bad time, it's a bug!" - let's apply this here!

In this case, the bug is that PRs and issues are lying unattended or forgotten. I agree this is a bug. What follows are some of my thoughts on debugging this and, hopefully, on what we can do to resolve it.

First, let me apologize. I do feel bad about having let contributions lie without any communication from us. We're going to work on improving that.

Second, since this is going to be a long post, I want to summarize briefly before I get into data and action items. We acknowledge that we are not adequately attending to many PRs and issues filed within the community. I appreciate you bringing this to our attention more vocally! We will work towards fixing it. The community is what makes Logstash and our other projects so powerful, so we can't accidentally neglect important things!


Call to Action

We need your help!

Right now, most of the PR and issue review is done by our small Logstash team here at Elastic. This isn't scaling well. I think we really need help from the community here.

There are many aspects to an issue or pull request, in no particular order:

  • It's totally OK to ask what the status of a ticket is. It's also OK to ask how you can help! Commenting on an issue signals to us that you are interested in it.
  • If it's a bug, can anyone reproduce it? How? Can we at least confirm multiple people experience it? Across which versions?
  • If it's a new idea, can anyone agree to it? If it's not your idea, do you like the idea? Could you improve the idea? Do you dislike the idea?
  • If it's a patch or pull request, has anyone tested it? Can you review it for functionality or behavior (not just code review)? Does the PR have tests? Can you confirm that the PR fixes the problem as described? Does the commit message for the PR have enough context and information that you understand the change?
  • If it's a new feature, does it break backwards compatibility?
  • If it's a new feature, are the configuration names good? (subjective, can you think of a better name for this setting?). Are there docs for it? Do you understand the docs? Are the docs missing something?
  • Is it your pull request? Did you sign the CLA?

Put more directly, if all of the following conditions are met on any PR, I'd be happy to merge a patch without much of any review from our staff:

  • two or three people reviewed the idea, design, implementation, tests, and docs, and all gave a thumbs-up.
  • no backwards incompatibilities introduced
  • at least one person had actually tested the patch in some kind of staging/prod environment and confirmed it worked.

We can build great things with small investments from many people, so even just saying “I like this idea” to a feature proposal, or saying “I tested this patch and it fixed the problem for me” is of tremendous value!

The Logstash dev team's role in this process should be to merge code and publish releases. Everyone is invited to help with anything outside of that.

In case it's not clear, all contributions are extremely valuable - testing, docs review, idea feedback etc!

(continued on next comment)


Where are we today?

I'll try to detail as much as I can think of for the sake of transparency, in the hope that I can frame the situation in a way that lets us move forward positively :smile:

First, staffing: Logstash is staffed by a small team of dedicated software engineers. We are flanked by two wonderful product managers (Tanya and Alvin) who help us make sure we stay on track in terms of what we can deliver, what features you (users) care most about, early adopter feedback, first-time user experience, etc. Supporting us further are tech writers (Paul and Deb), customer support teams (so many nice humans!), and developer advocates :heart:.

Second, sources of workload: The whole Logstash suite currently has 1282 open issues and 383 open pull requests. Those numbers include the main logstash repo, all logstash-plugins repos, and logstash-forwarder. We have other sources of work (teaching training, speaking at conferences, participating in webinars, etc), but the main work I'll focus on in this post is the "write code, fix bugs, merge patches" side of things - meaning mostly software development.

Third, sources of stalls: All changes going into any Elastic project must have been authored by someone who has signed our CLA. I would estimate roughly 70% of all PRs to Logstash projects are contributed by folks who haven’t signed the CLA.

Let's look at our velocity over the past few weeks:

Here's a graph of issue creation and closure over time, grouped by week (Yay Kibana 4!)

The top graph is "closed issues per week" and the bottom is "created issues per week". Hand-wave estimates:

  • About 45 issues and pull requests (combined) are closed each week.
  • About 90 issues and pull requests (combined) are created each week.

Unless something improves, we will never catch up as new tickets are created twice as fast as we close them.
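To make that arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted in this post, and it assumes (which may well not hold) that the weekly rates stay constant and cover the same set of repos as the open-item counts. The function name `projected_backlog` is made up for illustration.

```python
# Rough figures quoted in this post (hand-wave estimates, not exact data).
OPEN_ISSUES = 1282
OPEN_PULL_REQUESTS = 383
CLOSED_PER_WEEK = 45
CREATED_PER_WEEK = 90


def projected_backlog(weeks: int) -> int:
    """Open issues + PRs after `weeks`, assuming both weekly rates stay constant."""
    net_growth_per_week = CREATED_PER_WEEK - CLOSED_PER_WEEK
    return OPEN_ISSUES + OPEN_PULL_REQUESTS + net_growth_per_week * weeks


if __name__ == "__main__":
    for weeks in (0, 4, 12, 52):
        print(f"after {weeks:>2} weeks: {projected_backlog(weeks)} open items")
```

Under those assumptions the backlog grows by about 45 items every week, so simply holding it flat would require roughly doubling the closure rate.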

We recognize that not all issues and pull requests have the same cost to resolve. For example, we have tickets discussing new execution models as well as message persistence, which take us months to design (not including any development, testing, and documentation time). Other issues are much simpler and take only minutes to resolve or review. Over time, we hope to become better at responding quickly to items that take only a short time to resolve.


Summarizing: We (the Logstash dev team) need to do better at making sure issues don't fall through the cracks, and we also need your help (the community as a whole) to achieve this.

I'm hoping to publish our "logstash issues" dashboard live soon (next few weeks, as time permits).

Hopefully this post has helped clarify that we believe the slowness on PR merging is a problem and that we are working towards improving things :wink:

(Collaborators on this post include Suyog Rao, Tanya Bragin, and Alvin Chen)


Thanks for the detailed write-up. Suggestions that I don't think you've covered:

  • Have an internal rotation where you screen all new PRs and issues and take appropriate actions right away, like labeling (or even reviewing) low hanging fruit PRs so that the obvious ones can be merged quickly. Also, it seems issues are often used for questions (people tend to assume that every time a piece of software doesn't work for them it's a bug). Just bounce them to discuss.elastic.co!
  • Close stalled PRs so they're no longer blocking the view. This could include CLA-less PRs after two months of inactivity and other PRs after four months of inactivity.
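A rough sketch of that closing rule, in Python. The function name `should_close` is hypothetical, and "two months" and "four months" are approximated here as 60 and 120 days, which is my assumption rather than anything agreed on.

```python
from datetime import date, timedelta

# Assumed approximations of the thresholds suggested above.
CLA_LESS_LIMIT = timedelta(days=60)   # "two months" for PRs without a signed CLA
DEFAULT_LIMIT = timedelta(days=120)   # "four months" for all other PRs


def should_close(last_activity: date, cla_signed: bool, today: date) -> bool:
    """Return True if a PR has been inactive longer than its allowed limit."""
    limit = DEFAULT_LIMIT if cla_signed else CLA_LESS_LIMIT
    return today - last_activity > limit
```

For example, a PR last touched on 2015-01-01 and checked on 2015-03-15 (73 days later) would be closed if the CLA is unsigned, but kept open if it is signed.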