Articles
By Ada Wang and Hao Ai, Microsoft Experimentation Platform
For years, Microsoft’s experimentation platform (ExP) has been the backbone of running A/B experiments at a global scale, analyzing results and enabling data-driven decisions for Microsoft products…
“It is difficult to make predictions, especially about the future” – Yogi Berra (perhaps apocryphal)
How well can experiments be used to predict the future? At Microsoft’s Experimentation Platform (ExP), we pride ourselves on ensuring the trustworthiness of our experiments.…
By Sinem Akinci, Microsoft Developer Division, and Cindy Chiu, Microsoft Experimentation Platform
Generative AI [1] leverages deep learning models to identify underlying patterns and generate original content, such as text, images, and videos. This technology has been applied to various…

The Experimentation Platform at Microsoft (ExP) has evolved over the past sixteen-plus years and now runs thousands of online A/B tests across most major Microsoft products every month. Throughout this time, we have seen impactful A/B tests on a huge…
Over the past year, excitement around Large Language Models (LLMs) skyrocketed. With ChatGPT and Bing Chat, we saw LLMs approach human-level performance in everything from standardized exams to generative art. However, many of these LLM-based features are new and…
A/B Interactions: A Call to Relax
If you’re a regular reader of the Experimentation Platform blog, you know that we’re always warning our customers to be vigilant when running A/B tests. We warn them about the pitfalls of even tiny SRMs (sample ratio mismatches), small bits…

Deep Dive Into Variance Reduction
Variance Reduction (VR) is a popular topic that is frequently discussed in the context of A/B testing. However, it requires a deeper understanding to maximize its value in an A/B test. In this blog post, we will answer questions including:…
An “event-based” A/B test is a method used to test two or more variables during a limited duration. We can use what we learn to increase user engagement, satisfaction, or retention of a product, while also applying our insights to…
STEDII Properties of a Good Metric
Good metrics enable good decisions. What makes a metric good? In this blog post we introduce the STEDII (Sensitivity, Trustworthiness, Efficiency, Debuggability, Interpretability, and Inclusivity) framework to define and evaluate the good properties of a metric and of an A/B…