Flipt: Continuous Configuration

Overview

Flipt is an open-source feature flag tool that we use for backend feature flags. For clarity, we refer to this capability as 'continuous configuration' or 'Flipt flags' rather than 'feature flagging'. This distinguishes Flipt, which does not involve userId/organizationId determinations, from PostHog, which we use for frontend feature flags based on the actual user and organization context.

Continuous configuration is useful for dark deployments and canary deployments, where a new or revised code path in a service is either entirely disabled or enabled for only a certain percentage of service instances. This is a crucial capability for us because we practice continuous deployment: every PR is deployed to production as soon as it is merged.
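To make "enabled for only a certain percentage of service instances" concrete, here is a minimal sketch of deterministic percentage bucketing. It is not Flipt's actual bucketing algorithm. The simple string hash, the 100-bucket scheme, and the names `stringHash`, `bucketFor`, and `isEnabledFor` are all assumptions made for illustration.

```typescript
// Illustration only: this is NOT how Flipt buckets entities internally.

// Simple polynomial string hash (multiply by 31), kept as an unsigned 32-bit integer.
function stringHash(input: string): number {
  let hash = 0;
  for (let i = 0; i < input.length; i++) {
    hash = (hash * 31 + input.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Map a (flag, instance) pair to a stable bucket in the range [0, 100).
// The same pair always lands in the same bucket.
function bucketFor(flagKey: string, instanceId: string): number {
  return stringHash(flagKey + ":" + instanceId) % 100;
}

// An instance is part of the rollout when its bucket is below the percentage.
// 0 disables the code path everywhere; 100 enables it everywhere.
function isEnabledFor(flagKey: string, instanceId: string, percentage: number): boolean {
  return bucketFor(flagKey, instanceId) < percentage;
}
```

Because the bucket depends only on the flag key and the instance identifier, raising the percentage only ever adds instances to the rollout. An instance that already runs the new code path stays on it.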

Continuous configuration is distinct from environment variable configuration, which holds the long-term config that a service needs in order to start at all, such as database connection secrets. Flipt should not be used for values that a service needs in order to run, nor for secrets. It exists specifically to change service behaviour on the fly, without restarting or redeploying a service.

A key use-case is rollbacks. Instead of actually rolling back a service's container image, we can simply toggle the Flipt 'rollout' for a given 'flag' to false, and suddenly the service has rolled back to its previous behaviour before the (seemingly buggy) new code was introduced.

Similarly, continuous configuration allows us to separate deployment of new code to production from deployment of new logic to production. As long as developers properly use Flipt to contain major new service behaviour behind a Flipt flag, we can continuously deploy code without necessarily activating that new code path at the same time.
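As a rough sketch of what containing new behaviour behind a flag can look like in a service, consider the following. The `FlagReader` interface is a hypothetical stand-in rather than the real Flipt SDK API, and the flag key `normalizer-v2` and both `normalize` functions are invented for the example. The fallback to the old path reflects the point above that a service should never depend on Flipt in order to run.

```typescript
// Hypothetical stand-in for whatever flag client the service uses.
// This is not the Flipt SDK's actual interface.
interface FlagReader {
  // Returns the current value of a boolean flag; may throw if it cannot be read.
  getBoolean(flagKey: string): boolean;
}

// Read a flag, falling back to a safe default if evaluation fails,
// so a flag lookup problem never breaks the request.
function flagOrDefault(flags: FlagReader, flagKey: string, fallback: boolean): boolean {
  try {
    return flags.getBoolean(flagKey);
  } catch {
    return fallback;
  }
}

// Existing behaviour: trim surrounding whitespace only.
function normalizeV1(input: string): string {
  return input.trim();
}

// New behaviour, deployed dark: also lowercases the input.
function normalizeV2(input: string): string {
  return input.trim().toLowerCase();
}

// Both code paths ship in the same deployment; the flag picks one at runtime.
function normalize(flags: FlagReader, input: string): string {
  const useNewPath = flagOrDefault(flags, "normalizer-v2", false);
  return useNewPath ? normalizeV2(input) : normalizeV1(input);
}
```

With this shape, turning the flag off restores the old behaviour without touching the deployed image, and the old path can be deleted once the new one has proven itself.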

Flipt has its own docs, which are a good starting point: although we run Flipt ourselves on ECS, we haven't forked the source code, and we run it exactly as it comes out of the box.

Services can use the Go and Node SDKs to integrate Flipt.

API

From the consuming service standpoint, the Flipt API is exactly as documented in the Flipt docs.

For developers looking to configure Flipt rules for their service, it's key to understand that our deployment of Flipt does not use a database, so we don't use the Flipt API to configure our flags. Instead, Flipt polls a dedicated S3 bucket and caches the values returned in memory.
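The "poll a source, cache the values in memory" pattern can be sketched roughly as follows. This is not Flipt's implementation. The `PolledFlagCache` class and its `load` callback (standing in for reading and parsing the YAML files from S3) are hypothetical, and keeping the last good snapshot on a failed poll is an assumption of the sketch, not a documented Flipt guarantee.

```typescript
// Illustration of the polling-cache pattern only; not Flipt's actual code.
type FlagSnapshot = Readonly<Record<string, boolean>>;

class PolledFlagCache {
  // Most recent successfully loaded set of flag values.
  private snapshot: FlagSnapshot = {};

  // Run one poll cycle. `load` stands in for fetching and parsing the source.
  // On failure the previous snapshot is kept and false is returned.
  refresh(load: () => FlagSnapshot): boolean {
    try {
      this.snapshot = load();
      return true;
    } catch {
      return false;
    }
  }

  // Reads are served from memory and never touch the source.
  // Unknown flags resolve to the caller's fallback.
  get(flagKey: string, fallback: boolean): boolean {
    const value = this.snapshot[flagKey];
    return typeof value === "boolean" ? value : fallback;
  }
}
```

In a real poller, `refresh` would be called on a timer (every 15 seconds in our setup), while `get` would be called on every evaluation.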

The YAML files in that bucket conform to the Flipt configuration file schema and are pushed to the bucket from the pivot-internal GitHub repository using GitHub Actions. Our Flipt instance(s) poll that S3 bucket every 15 seconds, so a commit to main that modifies our Flipt configuration files takes effect in Flipt within about 45 seconds (including roughly 30 seconds of GitHub Actions time). Services then poll our Flipt instance(s) every 15 seconds, bringing the total time from commit to a change in service behavior to roughly 60 seconds. This means we can roll back globally in about a minute for many types of issues, with no rolling restarts required and in fact no role for the ECS control plane at all.
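The timing above is simple addition, and can be written out as a small sketch. The stage durations come from the paragraph above. The `Stage` shape and function names are invented for the example, and the average-case figure assumes a change lands at a uniformly random point within each poll interval, which is an assumption rather than a measurement.

```typescript
// One step in the path from a commit on main to a change in service behavior.
interface Stage {
  name: string;
  fixedSeconds: number;        // time always spent in this stage
  pollIntervalSeconds: number; // 0 if the stage does not poll
}

const stages: Stage[] = [
  { name: "GitHub Actions pushes YAML to S3", fixedSeconds: 30, pollIntervalSeconds: 0 },
  { name: "Flipt polls S3", fixedSeconds: 0, pollIntervalSeconds: 15 },
  { name: "Service polls Flipt", fixedSeconds: 0, pollIntervalSeconds: 15 },
];

// Worst case: the change lands just after a poll, so each poller waits a full interval.
function worstCaseSeconds(pipeline: Stage[]): number {
  return pipeline.reduce((sum, s) => sum + s.fixedSeconds + s.pollIntervalSeconds, 0);
}

// Average case (assumed uniform timing): each poller waits half an interval.
function averageSeconds(pipeline: Stage[]): number {
  return pipeline.reduce((sum, s) => sum + s.fixedSeconds + s.pollIntervalSeconds / 2, 0);
}
```

Here `worstCaseSeconds(stages)` is 30 + 15 + 15 = 60, matching the roughly 60 seconds quoted above, and `averageSeconds(stages)` is 30 + 7.5 + 7.5 = 45 under the uniform-timing assumption.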

Deployment

Flipt runs in our ECS cluster. It is stateless and just needs access to read from S3. Consuming services can reach it using ECS Service Connect.