Feature Flags Without the SaaS: How We Killed Our Beta Branch
I usually advocate buying over building. With a two-person team, you can't build every tool you need. You have to pick your battles. Most of the time, paying for a solution means you can focus on the actual product.
Feature flags were the exception.
We needed them to kill our beta branch. The branch had become a mess. Weeks of drift from production, merge conflicts that took half a day to untangle, big-bang deployments where everything shipped at once or nothing did. If a release broke, our only option was rolling back the whole thing.
LaunchDarkly would've solved this, but we didn't need what they were selling. No A/B testing. No conversion metrics. No user segmentation. Just: can we ship code to production that doesn't do anything until we say so?
That's a database table and some conditionals. So we built it.
The Model
A feature flag needs to answer one question: is this feature on for this user? Everything else is optional.
Here's what we landed on:
type FeatureFlag struct {
    Key       string
    Enabled   bool
    Percent   int
    Default   bool
    ExpiresAt time.Time
    UpdatedAt time.Time
    UpdatedBy string
}
Key and Enabled are obvious.
Percent lets us roll out gradually. 10% of users today, 50% tomorrow, 100% when we're confident.
Default is what the flag falls back to after expiry. Almost always false. Kill switches should die in the off position.
ExpiresAt is the interesting one. Expiry dates force cleanup by making expired flags behave differently in staging versus production.
Percent Rollouts Without Infrastructure
When I first thought about percent rollouts, I assumed we'd need sticky sessions or a Redis cache to remember which users got which variant. Otherwise a user might see the feature on one request and off the next.
The solution is simpler: make it deterministic.
Hash the tenant ID and flag key together, get a bucket from 0-99, compare it to the rollout percentage. Same tenant, same flag, same result every time. No storage, no state, no infrastructure.
Including the flag key in the hash means a tenant in bucket 30 for one flag isn't automatically in bucket 30 for every flag. Rollouts stay independent.
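Here's a minimal sketch of that bucketing, using FNV-1a from Go's standard library. The function names and the exact hash are assumptions for illustration, not our production code:

import "hash/fnv"

func bucketFor(tenantID, flagKey string) int {
    h := fnv.New32a()
    h.Write([]byte(tenantID + ":" + flagKey)) // tenant and flag key hashed together
    return int(h.Sum32() % 100)               // stable bucket in 0-99
}

func (f FeatureFlag) enabledFor(tenantID string) bool {
    if !f.Enabled {
        return false
    }
    return bucketFor(tenantID, f.Key) < f.Percent // Percent of 10 admits buckets 0-9
}

A nice property of this scheme: raising Percent only adds tenants. Anyone already in the rollout stays in it.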
The identifier comes from request context. For authenticated requests, we pull the tenant ID from the JWT claims. Consistent tenant, consistent experience.
But not every request has a user. Background jobs, system events, webhook handlers. We had a choice: build infrastructure to track identifiers for these cases, or accept that they'd get random bucket assignment on each run.
We went with random. It's an acceptable trade-off. The percentage still holds in aggregate, and we weren't going to delay shipping over edge cases we might never care about. If consistency for background jobs becomes a real problem, we'll revisit it.
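As a sketch of that identifier resolution, assuming the auth middleware stashes the tenant ID in the request context (the context key and the random fallback here are illustrative):

import (
    "context"
    "math/rand"
    "strconv"
)

type ctxKey string

const tenantIDKey ctxKey = "tenantID" // assumed: set by auth middleware from JWT claims

func rolloutID(ctx context.Context) string {
    if id, ok := ctx.Value(tenantIDKey).(string); ok && id != "" {
        return id // authenticated request: same tenant, same bucket, every time
    }
    // Background job, webhook, system event: no tenant in context, so this run
    // gets a random identifier and therefore a random bucket.
    return strconv.Itoa(rand.Int())
}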
Expiry Dates With Teeth
Feature flags are tech debt. Every flag you add is a branch in your code that you'll eventually need to remove. The "temporary" flag from six months ago is still in the codebase, still checked on every request, still confusing anyone who doesn't know if it's safe to remove.
We solved this by making flags expire.
Every flag gets an expiry date. After that date, the flag stops respecting its enabled state and falls back to its default value (almost always false). The feature turns off automatically.
But the interesting part is what happens in different environments.
In production, an expired flag logs an error and returns the default. Users aren't affected, but we see it in our monitoring. It's a nudge.
In staging, an expired flag returns a hard error instead of a value. Your tests fail. Your local development fails. You can't ignore it.
This creates pressure in the right direction. You feel the pain of an expired flag during development, when you can fix it. Production stays safe. Cleanup becomes part of the normal workflow instead of a task that never gets prioritized.
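Here's roughly how that split can look in code. Evaluate, ErrFlagExpired, and the isProduction switch are names I'm using for illustration, not the exact API:

import (
    "errors"
    "fmt"
    "log"
    "time"
)

var ErrFlagExpired = errors.New("feature flag expired")

func (f FeatureFlag) Evaluate(now time.Time, isProduction bool) (bool, error) {
    if !f.ExpiresAt.IsZero() && now.After(f.ExpiresAt) {
        if isProduction {
            // Production: log loudly, fall back to the default, keep users safe.
            log.Printf("flag %q expired on %s, returning default", f.Key, f.ExpiresAt.Format(time.DateOnly))
            return f.Default, nil
        }
        // Staging and local dev: fail hard so the flag gets cleaned up.
        return f.Default, fmt.Errorf("flag %q: %w", f.Key, ErrFlagExpired)
    }
    return f.Enabled, nil
}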
Where Flags Live
It's tempting to sprinkle flag checks everywhere. A quick if (flagEnabled) in whatever file you're working on. This gets messy fast. When a flag expires and you need to clean it up, you're grepping through the entire codebase hoping you found every instance.
We settled on three deliberate access points:
Frontend: Does this feature appear? Hide a button, swap a UI variant, show a different label. This is a Vue composable that checks flag state on render.
Middleware: Can you get here at all? Route guarding. Someone shares a direct link to /dashboard/new-thing and the UI would half-render even if the button was hidden elsewhere. The middleware check blocks the route entirely.
Service layer: Should this happen? Business logic changes. A new calculation, a different API behavior, an alternate workflow. This is an injected service, which makes it testable and mockable.
Three layers, three questions. When it's time to remove a flag, we know where to look.
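For the service layer, the injected check can be as small as an interface. The Flags interface and the report example below are illustrative, not the real services:

import "context"

type Flags interface {
    IsEnabled(ctx context.Context, key string) bool
}

type ReportService struct {
    flags Flags
}

func (s *ReportService) Generate(ctx context.Context) string {
    if s.flags.IsEnabled(ctx, "new-report-engine") {
        return "rendered by the new engine" // alternate workflow behind the flag
    }
    return "rendered by the legacy engine"
}

// In tests, a stub stands in for real flag state:
type stubFlags struct{ on bool }

func (s stubFlags) IsEnabled(context.Context, string) bool { return s.on }

Because the dependency is an interface, tests can force either path without touching flag storage.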
Was It Worth It?
We killed the beta branch. That was the goal.
Now we merge to main and deploy continuously. Code ships to production behind flags, gets validated in staging, then rolls out gradually. 10% of production traffic, then 50%, then 100% when we're confident. If something breaks, we flip a switch instead of coordinating a rollback.
The whole system is around 1000 lines across Go services, middleware, and Vue composables. We built it in a couple of days and haven't thought much about it since, which is exactly what you want from internal tooling.
I still think buying beats building most of the time. But sometimes the thing you can buy is designed for problems you don't have. We didn't need experimentation infrastructure. We needed to stop maintaining two branches.
If that's your situation too, you don't need much to fix it.