Feature Flags (I)

This is the first in a series of posts about feature flags. The series will cover feature flag definition, creation, implementation, testing, rollout, and cleanup, and finally a proposal for a DevOps pipeline using feature flags.

Intro

Feature flags (aka toggles, flips, gates, or switches) are a software delivery concept that separates feature release from code deployment. In plain terms, they are a way to deploy a piece of code to production while restricting access, through configuration, to only a subset of users. They offer a powerful way to put new code in front of real users quickly without breaking anything in the meantime.

The fundamental concept behind feature flags—choosing between two different code paths based on some configuration—has probably been around almost as long as software itself.

In the 1980s, techniques like #ifdef preprocessor macros were commonly used as a way to configure code paths at build time. They were primarily used for supporting compilation to different CPU architectures, but were also commonly used as a way to enable experimental features that would not be present in a default build of the software in question.

Although these preprocessor techniques supported only the selection of code paths at build time, other mechanisms like command-line flags and environment variables have been used for decades to support runtime feature flagging.

It was common for teams to have feature launches tied to code releases. Long-lived feature branches were used, and when they were merged into the main branch (trunk) and pushed to production, new features were launched to customers, i.e. deploy and release were the same action.

Around 2010, a software development philosophy called Continuous Delivery (CD) was beginning to gain traction, centered on the idea that a code base should always be in a state in which it could be deployed to production. Enterprise software teams were embracing Agile methodologies and using CD concepts to support incremental deployment, which made it possible to deploy to production multiple times a day.

To achieve these incredibly aggressive release cadences, teams were throwing away the rulebook, abandoning concepts like long-lived release branches and moving toward trunk-based development. Feature flagging was a critical enabler for this transition. Teams working on a shared branch needed a way to keep the code base in a stable state while still making changes. They needed a way to prevent a half-finished feature going live to users in the next production deployment. Feature flags provided that capability, and became a standard part of the CD toolbox.

Around the same time a parallel revolution emerged in product management. The Lean Startup methodology brought a heavy focus on rapid iteration (powered by CD, in fact) along with a scientific approach to product management using A/B testing.

Software delivery teams quickly realized that they could use the same feature flagging techniques that they had been using for release management to power their A/B tests. This was a great enabler for rapid product development.

Today, feature flagging systems are seen primarily as a tool for feature management.

What Is A Feature Flag

Adoption of feature flags has grown over the past few years as more and more engineering organizations discover that feature flags allow faster and safer delivery of features to their users by decoupling code deployment from feature release. Feature flags can be used for operational control, enabling kill switches that can dynamically reconfigure a live production system in response to high load or third-party outages. Feature flags also support continuous integration/continuous delivery (CI/CD) practices via simpler merges into the main software branch.

Feature flags introduce conditional branches into your code: certain code paths are turned on and off through configuration, which separates the release of new features from the actual deployment of the code.

This branching of the code is a great software development practice for two reasons. First, developers can hold back the release of work-in-progress features until they are ready, ensuring that other finished features are not delayed. Second, developers can work on a feature directly in the main branch without changing the behavior of the existing code.

When a feature is finished and all of its changes have been deployed, turning on the feature toggle releases the feature, just like a switch.

In the simplest terms, feature flags are “if/else” statements that create a conditional branch or pathway in your code:

if(featureFlag) {
    //new feature
} else {
    //current logic
}

By wrapping new feature code blocks with feature flags, developers can merge new code into the main trunk without affecting the release as a whole. New features can be activated selectively and their effect on the overall platform can be monitored. So, deployment (when changes are installed into the production environment) is detached from release (when these changes are activated by enabling the flag).
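
To make the split between deployment and release concrete, here is a minimal Python sketch. The `FLAGS` dictionary, the `new_checkout` flag name and the `checkout` function are all hypothetical, chosen only for illustration; a real system would read the flag from configuration or a flag service.

```python
# Hypothetical in-memory flag store (an assumption for this sketch); a real
# system would load flag values from configuration or a flag service.
FLAGS = {"new_checkout": False}


def is_enabled(name: str) -> bool:
    """Return the flag's state; unknown flags default to off."""
    return FLAGS.get(name, False)


def checkout(cart_total: float) -> str:
    if is_enabled("new_checkout"):
        return f"new checkout flow: {cart_total:.2f}"  # new feature
    return f"current checkout flow: {cart_total:.2f}"  # current logic


# Deployed but not released: the new code is in production, switched off.
before = checkout(10)

# Release is a configuration change, not a new deployment.
FLAGS["new_checkout"] = True
after = checkout(10)
```

The same deployed code serves both paths; only the flag value differs between `before` and `after`.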

However, a feature flag can be more than a boolean. A feature flag can have percentages, segments, and be as complex as needed to help not only with release management (canary releases, dark launches or progressive rollouts) but also with long-term control (access level control rights, different customer one-offs, and behavioral control).
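
As a rough illustration of a flag that is more than a boolean, the sketch below combines a percentage rollout with a segment override. It assumes users are identified by a string ID and that hashing the flag name together with the user ID is an acceptable way to bucket users; the function names and parameters are hypothetical and not taken from any particular flag library.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str, percentage: int,
               segments: frozenset = frozenset(),
               user_segment: str = "") -> bool:
    # A segment match wins, e.g. internal users always get the feature.
    if user_segment in segments:
        return True
    # Otherwise the user is included when their bucket is below the threshold.
    return bucket(flag, user_id) < percentage
```

Because the bucket depends only on the flag name and the user ID, a user who is included at 10% stays included when the rollout is raised to 50%, which is what makes a progressive rollout predictable.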

Feature flags, if used correctly, can superpower development, allowing developers, ops, QA, product and customer support to bring better features to market, faster.

What’s more, feature flags enable a culture of continuous experimentation to determine what new features are actually desired by customers. For example, feature flags enable A/B testing, showing different experiences to different users and allowing for monitoring to see how those experiences affect their behavior.

Feature Flags Improve Software Development

Feature flags have revolutionized software development by allowing features within a software product to be individually enabled or disabled at any point, even after they’ve already been rolled out to customers. Let’s look at some of the most important ways in which they can improve your ability to develop, test, and deliver new features while minimizing risk throughout the process:

Enable Controlled Rollouts: New features can be released gradually (for example, to a subset of users).
Centralize Visibility and Control: Using the same feature flag system across the entire organization immediately gives you a level of safety and resilience.
Enable Testing in Production: sometimes testing a feature in staging is not possible. Since feature flags allow continuous deployment of code directly into production, you can perform usability testing in production as well, enabling the new feature for beta testers initially, verifying the new code in production before a single customer is impacted by the changes.
Give You a Kill Switch: If a problem is detected, anybody on the team can deactivate a feature flag. This is not a rollback and no code review is required; it is just an off switch.
Separate Code Deployment from Release: Code deployment and software rollout do not need to occur simultaneously anymore.
Improve Decision Making: It is human nature to make bad decisions while under stress. The brain goes into “fight or flight” mode, flooding the frontal cortex with hormones. This is good if one needs to quickly run away from a lion. This is bad if one needs to calmly walk through a complicated decision tree. With feature flags, we can ideally turn off a bad feature and, at our leisure, figure out what went wrong.
Allow Product to Be Responsible for When a Feature Goes Live: Developers are relieved of the stress of doing another code deployment to release their features. Instead, product owners can log into their feature flag tool and, with the push of a button, turn the feature on.
Enable Localization and Internationalization: If localization and internationalization matter for your platform, you can build that into your feature flag structure and only enable features in the appropriate regions, serve up language variations for your platform, or solve any other demographic-specific need.
Simplify Migrations: Do not push code changes at the same moment you are ready to switch over to a new database or backend service. With feature flags, you can deploy the change ahead of time and test it in production first.
Enable Experimentation: Feature flags also enable experimentation, like A/B testing.
Make Your System More Resilient: Feature flags can be used to react to peak load (for example by reducing functionality to the minimum required to maintain the service) or to help investigate an issue (by changing the logging level to obtain more information).

Feature Flag Categories

Feature toggles can be categorized across two major dimensions: how long the feature toggle will live and how dynamic the toggling decision must be.

Let’s consider various categories of toggle through the lens of these two dimensions and see where they fit.

Release

These feature flags allow in-progress features to be checked into the main branch, while still allowing the branch to be deployed to production at any time. Release toggles allow incomplete and un-tested codepaths to be shipped to production as latent code which may never be turned on.

By definition they are transitional. Once a feature has been released (activated) and confirmed to be stable, the feature flag logic should be removed, leaving only the new logic.

Experiment

They are very similar to Release feature flags. Experiment toggles are used to perform multivariate or A/B testing.

An Experiment needs to remain in place with the same configuration long enough to generate statistically significant results. Depending on traffic patterns that might mean a lifetime of hours or weeks.
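
Keeping an experiment's configuration stable usually also means that each user keeps seeing the same variant for as long as the experiment runs. One possible way to get that, sketched here under the assumption that users have a stable string ID, is to derive the variant from a hash. The experiment name, variant names and `assign_variant` helper are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Sequence


def assign_variant(experiment: str, user_id: str, variants: Sequence[str]) -> str:
    """Pick a variant deterministically so a user always sees the same one."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[index]


variants = ["control", "treatment"]

# Rough check of how 10,000 hypothetical users are split between the variants.
counts = Counter(
    assign_variant("checkout_button_color", f"user-{i}", variants)
    for i in range(10_000)
)
```

No assignment has to be stored: calling `assign_variant` again with the same experiment and user returns the same variant, and the split across many users comes out roughly even.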

Ops

These flags are used to control operational aspects of our system’s behavior. A system may have a set of Ops toggles, which allow operators of production environments to degrade non-vital system functionality when the system is enduring unusually high load.

These flags may be left in place for operators almost indefinitely.
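
A small sketch of an Ops toggle used as a kill switch for non-vital functionality. The flag name, the `product_page` function and the recommendations feature are assumptions made up for this example; in practice operators would flip the flag through a flag service or configuration store at runtime.

```python
# Hypothetical ops flags; in a real system operators would change these at
# runtime through a flag service, not by editing code.
OPS_FLAGS = {"recommendations_enabled": True}


def compute_recommendations(product_id: str) -> list:
    # Stand-in for an expensive, non-vital call.
    return [f"{product_id}-related-1", f"{product_id}-related-2"]


def product_page(product_id: str) -> dict:
    page = {"product": product_id, "recommendations": []}
    # The page still renders when the switch is off; only the extra work is skipped.
    if OPS_FLAGS.get("recommendations_enabled", False):
        page["recommendations"] = compute_recommendations(product_id)
    return page
```

The core page is built either way, so switching the flag off degrades the experience instead of taking the service down.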

Permissioning

These flags are used to change the features or product experience that certain users receive. For example, we may have a set of premium features, which we only toggle on for our paying customers. Or perhaps we have a set of alpha features, which are only available to internal users.

It is better to implement this kind of functionality in another layer of the application (authorization, roles…).

This post is focused on the first category described: release feature flags.

Risks

Ambiguous/Reused Flag Names

A flag should have a clear, well-understood name. A flag named “user_control” is a good candidate for misunderstanding. A backend team once thought that a given flag controlled the functionality they were using. However, unknown to them, another frontend team had also reused the flag to gate some of their own functionality. The two teams started flipping the flag based on where they wanted it to be. Like two people controlling a light switch, the flag was never in the right state.

Check the case of Knight Capital, a trading firm that lost more than $400 million in about 45 minutes in 2012 because of a failed deployment that activated the code of an old feature behind a reused flag. The firm did not survive as an independent company.
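
One possible guard against accidental reuse is to make every flag be registered once, with an owner, and to reject a second registration under the same name. The `FlagRegistry` class and the flag and team names below are hypothetical; this is a sketch of the idea, not a description of any existing tool.

```python
class FlagRegistry:
    """Hypothetical registry that refuses to register the same flag name twice."""

    def __init__(self) -> None:
        self._owners = {}

    def register(self, name: str, owner: str) -> None:
        if name in self._owners:
            raise ValueError(
                f"flag {name!r} is already owned by {self._owners[name]!r}"
            )
        self._owners[name] = owner


registry = FlagRegistry()
registry.register("checkout_new_payment_form", owner="backend-team")

try:
    # A second team trying to reuse the name is stopped at registration time.
    registry.register("checkout_new_payment_form", owner="frontend-team")
    reuse_rejected = False
except ValueError:
    reuse_rejected = True
```

A descriptive name that includes the area and the feature also makes it harder for a second team to assume the flag is theirs.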

More Feature Flags Equals Complex Code

Too many feature flags can quickly turn your code into a feature flag hell. The more feature flags you create, the more code paths there are, which increases the complexity of your code and makes testing harder. A higher number of code paths can cause a mess and put the functionality of your code at risk.

It is important to work with a limited number of toggles.
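
To see how quickly the number of configurations grows, this short sketch enumerates every on/off combination for three independent boolean flags. The flag names are hypothetical.

```python
from itertools import product

flags = ["new_checkout", "new_search", "dark_mode"]

# Every on/off combination is a distinct configuration that could reach production.
combinations = [
    dict(zip(flags, states))
    for states in product([False, True], repeat=len(flags))
]

total = len(combinations)  # 2 ** 3 combinations for three boolean flags
```

Each additional boolean flag doubles the total, which is why testing every combination stops being practical after only a handful of flags.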

Overlapping

When two or more feature flags overlap, the developer has to decide how to handle the overlap. Checking dependencies (whether one feature is required in order to deliver the other) may help in deciding the approach.

Testing will also be affected by the approach the developer takes.

A high number of in-progress feature flags increases the probability of collisions among them.
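
One way to handle a dependency between two flags, sketched here as an assumption and not as the only option, is to declare the prerequisite explicitly and treat a flag as off until everything it depends on is on. The `FLAGS` and `REQUIRES` tables and the flag names are hypothetical.

```python
# Hypothetical flag states and prerequisites: the promo feature builds on
# the new checkout, so it must not be active without it.
FLAGS = {"new_checkout": False, "new_checkout_promo": True}
REQUIRES = {"new_checkout_promo": ["new_checkout"]}


def is_enabled(name: str) -> bool:
    """A flag is effectively on only if it and all its prerequisites are on.

    This sketch assumes the prerequisites contain no cycles.
    """
    if not FLAGS.get(name, False):
        return False
    return all(is_enabled(dep) for dep in REQUIRES.get(name, []))
```

With this approach the dependent flag can be switched on early without any effect, and it only takes effect once its prerequisite has been released.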

Technical Debt

Like other sources of technical debt, feature flags are cheap and easy to add in the short term. But the longer that they are left in the code, the more that they will end up costing you.

If not managed properly, feature flags can be very destructive technical debt.

For a short-term flag it is very important to know when it is complete or obsolete. Ideally, we should be able to see whether a flag is on for 100% of the users and should therefore be retired. In addition, if a flag is not being called by anyone, it is also worth following up on why a stale code path is still in place. In both situations, the flag is a candidate for cleanup.

Not cleaning up flags (and the unused code behind them) may introduce a broken windows culture in the team, where actively managing technical debt does not seem worth the investment, creating a vicious cycle.
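
The two cleanup signals described above could be detected by counting flag evaluations. The sketch below assumes that every flag check is reported to a small tracker; the `FlagUsage` class and the flag names are hypothetical.

```python
from collections import Counter


class FlagUsage:
    """Hypothetical tracker that counts how often each flag evaluates on or off."""

    def __init__(self) -> None:
        self._results = {}

    def record(self, flag: str, enabled: bool) -> None:
        self._results.setdefault(flag, Counter())[enabled] += 1

    def cleanup_candidates(self, known_flags) -> dict:
        candidates = {}
        for flag in known_flags:
            counts = self._results.get(flag)
            if counts is None:
                # Nobody calls the flag: a stale code path worth following up on.
                candidates[flag] = "never evaluated"
            elif counts[False] == 0:
                # Everyone gets the new path: the flag can be retired.
                candidates[flag] = "on for every evaluation"
        return candidates


usage = FlagUsage()
for _ in range(3):
    usage.record("new_checkout", True)
usage.record("new_search", True)
usage.record("new_search", False)

candidates = usage.cleanup_candidates(["new_checkout", "new_search", "old_banner"])
```

Here `new_search` is still being rolled out, so it is not reported, while the other two flags are listed with the reason they look ready for cleanup.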

Log

Over time, it can become difficult to keep track of the changes that are made to a feature flag. Therefore, it is important to track changes through logging, especially when working in teams. Using logging, you can keep track of who created the feature flag, who made a change and at what time. Use proper commit messages (with a defined format) when creating a new feature flag and when removing it (cleaning it up).
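
As a minimal illustration of change tracking, the sketch below records who changed a flag, the old and new values, and when. It assumes flag changes go through a single store; the `AuditedFlags` class and its fields are hypothetical, and a real setup would persist the log somewhere durable.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FlagChange:
    flag: str
    actor: str
    old_value: object
    new_value: object
    at: datetime


class AuditedFlags:
    """Hypothetical flag store that appends a log entry on every change."""

    def __init__(self) -> None:
        self._values = {}
        self.log = []

    def set(self, flag: str, value: bool, actor: str) -> None:
        change = FlagChange(
            flag, actor, self._values.get(flag), value, datetime.now(timezone.utc)
        )
        self._values[flag] = value
        self.log.append(change)

    def get(self, flag: str) -> bool:
        return self._values.get(flag, False)


flags = AuditedFlags()
flags.set("new_checkout", False, actor="alice")  # flag created, switched off
flags.set("new_checkout", True, actor="bob")     # flag released
```

Reading `flags.log` afterwards answers the questions from the paragraph above: who created the flag, who changed it, and at what time.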

Visibility

By using feature flags to control functionality, we are creating multiple states of our system. We have to ensure that all user groups (development, QA, operations, support, product, …) know these flags exist.

Ideally, flags should be in a centralized, visible place where they can be controlled independently of code releases and reachable by all user groups.

Key Takeaways

We are responsible for producing quality software predictably, quickly and in a repeatable way. Feature flags can help the team deliver more features faster and with less risk. We have to make sure that we treat our feature flags as they deserve, not as an afterthought. We also have to ensure that they are manageable, visible, actionable and trackable. Feature flags can be an integral part of modern software development, so we have to give their management first-class support.

References

The Split Blog
Feature Toggles (aka Feature Flags)
A Practical Guide to Testing in DevOps, by Katrina Clokie
Feature Flag Best Practices, by Pete Hodgson & Patricio Echagüe
Managing Feature Flags, by Adil Aijaz and Patricio Echagüe
