When it’s done

There are two popular ways to manage software releases: feature-driven or timeboxed. In a feature-driven project, each version commits to a predetermined set of changes and the release is done when everything has been finished. In a timeboxed project, releases are done based on calendar dates without committing to any particular set of features. The latter method is used, for example, by modern web browsers and Ubuntu.

We more or less followed the timebox method during stable releases 1.9.10–1.14, with a roughly three-month release cycle for each version. However, during the 1.15 unstable builds the schedule was first extended and then continually delayed. It eventually became clear that we had abandoned the notion of timeboxing completely.

Timeboxed releases require the correct mindset: one has to accept that any given release is not perfect and will have known bugs (and many unknown bugs yet to be reported). It can only be as good as one can make it inside the given time window. The only crucial requirement is that each release is an improvement over the previous one; regressions cannot be tolerated. Correct prioritization of work thus becomes very important, since there is only a fixed amount of time available in each cycle.

One should not mix and match the feature-driven and timeboxing methods. A project may certainly switch its release process to match its needs, but not fully committing to either one leads to prioritization conflicts. This is essentially what has ailed Doomsday in the past: we were nominally doing stable releases every three months, accumulating a decent amount of progress in each release, but failed to set up any incentive for bug fixing. During the three months we got too deeply into the “flow”, and then had to invest a lot of effort into finalizing things for release, fighting to avoid delaying the cycle too much. It was impossible to do this and fix additional, unrelated bugs. One either needs to go all in with the feature-driven mindset and really write code according to what’s “Right” at every opportunity (owning up to the fact that the developer is being prioritized over the end user), or strictly follow timeboxing, not trying to push any particular piece of code into a release if it isn’t ready. Timeboxing relieves the stress that would otherwise hinder the making of stable releases, but it creates stress of its own when one has to manage multiple long-running work branches in a granular fashion.

I would argue that our three-month cycle was too long because it allowed one to get lulled into thinking that a particular feature or fix must be completed before the next stable release can be made. A shorter one-month cycle, for instance, would have made it impossible to finish large tasks during a single cycle, forcing a clearer divide between unfinished work and the master branch, and encouraging the division of tasks into smaller, more easily digestible parts.

One has to accept that it is impossible to fix all the issues for the next stable release. Even if you fix all the known bugs, taking months or years in the process, there are still the unknown bugs and regressions that can only be discovered with prolonged use. Also, when you fix one bug there’s a chance that you inadvertently introduce two new ones. Thus withholding stable releases only increases the amount of code that has not been subjected to widespread use and testing, which in turn increases the risk of regressions.

The feature-driven method has its own advantages. One has the luxury to implement ambitious plans and handle all the “domino pieces” refactoring cases where one change necessitates changes elsewhere. Not having to compromise feels good. If constrained by a timebox, these cases would have to be handled in a lengthy side branch or simply worked around with various kinds of ugly temporary kludges. The feature-driven model of working is attractive to a hobbyist because it allows one to fully immerse oneself in the tasks that seem most interesting, following the most elegant flow of progression, without time pressures. Naturally releases still have to be planned, but any work can be justified if it serves the objectives of the next release. However, a problem arises when the work becomes so indirectly related to the next release that progress toward it is effectively halted. This easily leads to years-long stagnation between stable releases.

Feature-based releases are non-agile by nature. Especially when you’re in the middle of implementing a large set of changes, it can be difficult to switch tracks to address other concerns like regressions in the latest stable release. It can even be laborious in practice to switch back to a stable branch for debugging and testing. From the users’ perspective this can understandably be very frustrating: here’s a known, game-breaking bug and the developers are not doing anything about it (yet)! The conflict is rooted in the prioritization imposed by the release model, which favors the developer over the end user. It can be worked around with proper engineering discipline, but a hobbyist might not have the time or the energy for it.

On the other hand, timeboxing is an agile process. It supports the notion that prioritization should be fluid and that topics should be re-evaluated periodically as circumstances change. Keeping the boxes shorter makes the process more flexible.

Then there’s the aspect of fixing known bugs. In a project with real-world users, this should be given high priority. In a hobby project, however, it conflicts with the developers’ own interests and motivations, particularly if the bugs in question can’t be reproduced (easily) or are outside the usage scenarios deemed most relevant. While it can be satisfying to solve a difficult bug, it can feel daunting and confusing beforehand. A hobbyist is motivated by the positive feelings gained from the work, and tends to avoid facing the negative ones. Who would deal with off-putting stuff in their free time without due compensation?

How do bugs ever get fixed, then? Based on my observations, many fixes start from random occurrences: seeing a particularly colorful comment from a user, noticing something odd in the code while working on other things, or coming across a “low-hanging fruit” kind of bug report. This leads me to believe that, in a hobby project, bug fixing is fundamentally serendipitous. While you can set out to plan fixes for certain known bugs, this usually involves heavy doses of the “feature-driven” thinking of doing the Right Thing, and with it more refactoring and redesigning than a pure bug fix would require. After all, hobbyists can afford to be idealistic and to be more interested in improving the status quo as a whole than in addressing one small detail.

In the past I have been grudgingly allocating some time for bug fixing before each stable release, fully knowing that this eats into my motivational reserves. This has not really worked out well for us, particularly now that the gaps between stable releases have widened significantly. Unless bug fixing is somehow naturally supported by the release process, it simply will not have high enough priority to match the users’ expectations.

The only process I can see innately supporting bug fixing is the timeboxed one. Just like the regular and frequent unstable builds help keep the master branch in check when it comes to the latest work, regular automated builds from the stable branch should be done so that the stable variant has a pulse. The symmetry is beautiful.

Of course, the stable branch should be treated with due deference: regressions of any kind in the stable releases are unacceptable. The safeguard here is that the stable branch is nothing more than an older, adequately tested version of the master. (Hot-fix patches notwithstanding.) Normally the stable branch is not tampered with directly, protecting it from the accidental breakage that the master is always subject to.

In the feature release mindset, each version needs a compelling raison d’être: a marketing pitch, essentially, for why the user should upgrade (as if upgrading were a chore). Otherwise, why make a release when nothing has really changed? This makes sense for traditional commercial software but is less suitable for open source hobby projects. What’s worse, this line of thinking ignores bug fixing completely, relegating it to some future point in time when other important tasks have been addressed. Worse still, bug fixes become “features” to be scheduled and marketed, taking away their urgency. Instead, getting rid of bugs should always have a very high priority for the sake of respecting the end users, and should occur outside the normal release cycle.

I believe that the correct way to tackle bug fixes is to have a short stable release cycle, for instance four weeks, and do fixes when the opportunity presents itself. Since long-term work is unlikely to be finished in each month’s release, there are many releases where there literally would be no changes unless the developer commits a bug fix or two. This creates a healthy psychological pull to commit pure fixes. The end users also have a clearer understanding of the availability of patches: at least once per month. Also, this method enforces the correct mindset for doing the fix: the stable branch is no place for refactoring, but one-liner corrections are fine and in many cases are enough for working around a crash or some other malfunction. Naturally this doesn’t mean everything is perfect after the fix: long-term work may still be required to properly address the root causes of the bug. Bugs are often merely symptoms, after all. You take medicine to help with an ache but surgery may still be needed down the road.
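To make the mechanics of such a cycle concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not a description of Doomsday’s actual build tooling: the `Change` record, the function names and the example start date are all hypothetical, and the four-week cadence is simply the example figure from above.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed cadence: the "four weeks" figure used as an example in the text.
CYCLE = timedelta(weeks=4)


@dataclass
class Change:
    """Hypothetical record of one piece of work sitting on the master branch."""
    title: str
    ready: bool  # finished and adequately tested


def release_dates(first: date, count: int) -> list[date]:
    """Return `count` stable release dates, spaced exactly one cycle apart."""
    return [first + i * CYCLE for i in range(count)]


def select_for_release(changes: list[Change]) -> list[Change]:
    """Pick what ships: whatever is ready. Unfinished work waits for a later box."""
    return [c for c in changes if c.ready]


if __name__ == "__main__":
    pending = [
        Change("Renderer refactoring", ready=False),  # long-term work, not done
        Change("One-line crash fix", ready=True),     # small fix, ready to go
    ]
    for day in release_dates(date(2016, 1, 4), 3):
        print(day.isoformat())
    print([c.title for c in select_for_release(pending)])
```

The point of the sketch is that the dates never depend on the contents: the release goes out on schedule, and a cycle in which only the one-line fix is ready still produces a release containing just that fix.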

For now, though, with the old three-month cycle gone, Doomsday lacks a credible release process. The next stable version will be available when it’s done.