A/B Testing at Scale
Building an experiment "Portal"
The Problem
A/B testing is supposed to be fast. Spin up a variant, measure the impact, ship the winner. But it often doesn't feel that way.
In a recent role, I led a team that was staring down the barrel of running many A/B tests using Optimizely Edge, a mechanism where arbitrary JavaScript is injected directly into the browser.
This worked for simple experiments, but the cracks showed quickly: primitive versioning, no tests, no build system, no type safety... the list goes on. Each snippet lived in a dashboard somewhere, detached from the codebase.
Introducing the experiments into our various codebases via Optimizely Fullstack was worse. Our site was made up of multiple “fragments”: full-stack micro-apps owned by different teams. An experiment that touched more than one fragment meant coordinating lock-step releases across those teams, and that overhead killed momentum. And when the experiment concluded? Clean-up could be a manual, error-prone trawl across codebases, leaving dead code behind.
The solution? Enter… the “Portal”
The Portal was our custom experiment delivery mechanism. The concept was simple:
A <script> tag was included on the page with the context it needed as query params — things like isMobile, optimizelyVisitorId, someCustomValue, etc:
<script defer src="/experiments.js?isMobile=true&someCustomValue=abc123&optimizelyVisitorId=ABCDE"></script>

That request hit our own endpoint on a Node server. The endpoint called Optimizely Fullstack with the visitor ID to determine which experiments (taking into account audiences and exclusion groups) should be active for that user. It then built a JavaScript string, scoped to those experiments, and returned it to the client.
Browser → script src request → Node endpoint → Optimizely Fullstack → tailored JS response
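To make that flow concrete, here is a minimal sketch of what such an endpoint could look like, assuming an Express server. The decideExperiments helper and experimentBundles map are placeholders standing in for the Optimizely Fullstack decision call and the built experiment code; none of these names come from the real implementation.

import express from "express";

const app = express();

// Placeholder map: experiment key -> the built JavaScript for that experiment.
// In practice this would come from the experiments codebase at build time.
const experimentBundles: Record<string, string> = {
  "example-experiment": "/* compiled experiment code would go here */",
};

// Placeholder for the Optimizely Fullstack call: given a visitor ID and the
// attributes from the query string, return the experiment keys this visitor
// should see (audiences and exclusion groups applied by Optimizely).
async function decideExperiments(
  visitorId: string,
  _attributes: Record<string, string>
): Promise<string[]> {
  return visitorId ? Object.keys(experimentBundles) : [];
}

app.get("/experiments.js", async (req, res) => {
  // Context arrives as query params on the script src: isMobile, someCustomValue, etc.
  const { optimizelyVisitorId = "", ...attributes } =
    req.query as Record<string, string>;

  const activeKeys = await decideExperiments(optimizelyVisitorId, attributes);

  // Build a JavaScript string scoped to only this visitor's experiments.
  const script = activeKeys
    .map((key) => experimentBundles[key])
    .filter(Boolean)
    .join("\n");

  res.type("application/javascript").send(script);
});

app.listen(3000);

The important property is that the browser receives only the JavaScript for experiments the visitor is actually in, rather than one blob containing everything.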
The key shift: instead of arbitrary code floating in a third-party dashboard, experiment logic lived in a proper repository, built, reviewed, tested, and deployed like any other service.
The Developer Experience
All experiments lived in a single codebase, and scaffolding a new experiment was a single command:
npm run experiment:create

This scaffolded the full structure (components, experiment entrypoint, server endpoint, translations, constants, and so on) so every experiment started from the same consistent shape.
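For illustration, a scaffolded experiment entrypoint might look roughly like the sketch below. The experiment key, variant names, and DOM selector are invented; the point is simply that every experiment started from the same shape.

// Hypothetical entrypoint for one scaffolded experiment; the key, variant
// names, and DOM selector are invented for illustration only.

interface ExperimentContext {
  isMobile: boolean;
  visitorId: string;
}

export const key = "checkout-cta-copy";

// One function per variant; the Portal script runs whichever variant
// Optimizely assigned to the visitor.
export const variants: Record<string, (ctx: ExperimentContext) => void> = {
  control: () => {
    // Control deliberately leaves the page untouched.
  },
  variant_a: (ctx) => {
    const cta = document.querySelector<HTMLButtonElement>("[data-test='checkout-cta']");
    if (cta) {
      cta.textContent = ctx.isMobile ? "Buy now" : "Proceed to secure checkout";
    }
  },
};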
Lessons and Takeaways
Even trivial experiments are production code
The biggest mindset shift. An A/B test isn’t a throwaway snippet — it runs on your live site, seen by real users. Treating it with the same rigour as production code (TypeScript, linting, Storybook, code review) catches bugs before they affect the experiment results, and makes the codebase comprehensible to anyone who touches it later.
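As a small, hypothetical illustration of what that rigour buys, even a one-line piece of variant logic can carry a unit test. The helper below is invented, but a bug in it would otherwise quietly skew the experiment's results.

import { describe, expect, it } from "vitest";

// Invented helper: chooses the CTA copy for a hypothetical variant.
function ctaCopyFor(ctx: { isMobile: boolean }): string {
  return ctx.isMobile ? "Buy now" : "Proceed to secure checkout";
}

describe("checkout CTA copy variant", () => {
  it("shortens the call to action on mobile", () => {
    expect(ctaCopyFor({ isMobile: true })).toBe("Buy now");
  });

  it("keeps the full call to action on desktop", () => {
    expect(ctaCopyFor({ isMobile: false })).toBe("Proceed to secure checkout");
  });
});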
The start of the product lifecycle
Because experiments were built as real, reviewed code, they became the first part of the product lifecycle and could be easily migrated into other codebases once an experiment was complete.
Tradeoffs worth knowing
Our approach meant the script was the last thing to load on the page, which introduced cumulative layout shift — a real concern for pages where the experiment modifies visible content above the fold. It’s not ideal for wholesale page redesigns, where the jump from original to test variant would be too jarring for the user.
Ultimately, the tradeoff was worth it for us. We released experiments independently, cleaned up confidently, and held our experiment code to the same high standards as the rest of our production code.


