Metrics Platform/Conduct an experiment

An experiment is a test of a hypothesis designed to provide trustworthy and generalizable data. It imposes an intervention on subjects in order to observe the outcome that the intervention leads to.^[1]

This guide describes the process for creating an experiment using the Metrics Platform Experimentation Lab (also known as MPIC). For the manual process for creating an instrument, see Create an instrument. If you're unsure which process to use, contact Data Platform.

Plan

Experimentation scorecard

The experimentation scorecard provides a template for creating an experiment. The scorecard is in beta, and access is restricted while it remains in beta.

Measurement plan

Instruments collect data about user interactions so that we can answer questions about product experiences. Before you can start collecting data, create a measurement plan (template) that documents what data you plan to collect, why, and how you plan to analyze the data.

You can write your measurement plan in a document or on a Phabricator task, depending on the scale of the project. For examples of measurement plans, see the folder on Google Drive.

Instrumentation spec

Once you have a measurement plan, the next step is to create an instrumentation specification (template). The instrumentation spec defines all the data you'll collect for your instrument. The spec is also a useful tool for engineers to ensure that all events are being produced and received correctly. For a template and examples of instrumentation specs, see the folder on Google Drive. For more information about designing an instrumentation spec, see the instrument guide.

Data collection guidelines

All data collection activities must follow the data collection guidelines. Once you've identified the applicable risk tier, you can use your measurement plan and instrumentation spec to complete the steps in the guidelines under "What should WMF teams do next?".

Code

You can write your instrument code in the WikimediaEvents extension or in your product codebase. See the API docs to learn how to code an instrument.

Experiment enrolment sampling

Experiment enrolment sampling is the act of enrolling users into experiments and consistently assigning an enrolled user a variant of the feature that is being experimented on.^[2]

An experiment enrolment sampling algorithm, therefore, is a function, method, or process that accepts some inputs and returns a variant. In pseudocode:

module Experiments {
    enroll( user: User, experiment: Experiment ): Variant;
}

Where:

user is a token that represents the user for at least the duration of the experiment; and
experiment is one or more constants that define the experiment, e.g. name, sample rate, and variants.

Properties

Such an algorithm must:

Ensure consistency of assignment within an experiment: the same user must always be assigned the same variant for the same experiment. Assignment must also be independent across experiments. For example, if two experiments are running, each with two variants, then the following combinations of assignments should be equally likely for a user:


Experiment 1	Experiment 2
Variant 1	Variant 1
Variant 1	Variant 2
Variant 2	Variant 1
Variant 2	Variant 2

and not only the correlated combinations:

Experiment 1	Experiment 2
Variant 1	Variant 1
Variant 2	Variant 2

Be able to sample on a variety of levels, e.g. page, session, user, or application install.
Sample when needed, e.g. sample when a user visits a specific page, so that users who never visit that page are not assigned a group.
Not require a backing store.
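
One way to meet these properties is a stateless, hash-based assignment. The TypeScript sketch below is an illustration under stated assumptions, not the Metrics Platform implementation: the Experiment shape, the hashToUnit helper, and the choice of hash (an FNV-1a-style loop followed by the MurmurHash3 32-bit finalizer) are all hypothetical. Salting the hash with the experiment name is what keeps a user's assignments in different experiments from being correlated. A production system would more likely rely on a well-studied hash function.

```typescript
// Hypothetical types for illustration; not the Metrics Platform API.
interface Experiment {
  name: string;       // unique experiment name, used to salt the hash
  sampleRate: number; // fraction of tokens to enroll, in [0, 1]
  variants: string[]; // e.g. ["control", "treatment"]
}

// Maps a string to a number in [0, 1). Assumption: an FNV-1a-style loop over
// UTF-16 code units, then the MurmurHash3 finalizer to mix all 32 bits.
function hashToUnit(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return (h >>> 0) / 0x100000000;
}

// Returns the assigned variant, or null if the token is not sampled.
// No state is stored: the result depends only on the two inputs.
function enroll(userToken: string, experiment: Experiment): string | null {
  if (experiment.variants.length === 0) {
    return null;
  }
  const u = hashToUnit(`${experiment.name}:${userToken}`);
  if (u >= experiment.sampleRate) {
    return null; // outside the sampled fraction
  }
  // Split the sampled range [0, sampleRate) into equal slices, one per variant.
  const slice = Math.floor((u / experiment.sampleRate) * experiment.variants.length);
  // Guard against floating-point rounding pushing the index out of range.
  return experiment.variants[Math.min(slice, experiment.variants.length - 1)];
}
```

Because enroll is a pure function, it can be called only at the moment a user reaches the feature under test, and the token can identify a page, session, user, or application install depending on the level being sampled.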

Caveats

For the first and last properties above (consistent assignment and no backing store) to hold, any system using such an algorithm must lock or freeze the inputs to the algorithm for the duration of the experiment. Extending the end date of an in-progress experiment should still be OK.
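
As a minimal sketch of this caveat, the enrolment inputs can be frozen when the experiment launches while the end date is kept outside the frozen object. All names here (ExperimentInputs, ExperimentRecord, launch, extendEndDate) are hypothetical, and rejecting an earlier end date is an assumption of this example rather than a documented rule.

```typescript
// Hypothetical shapes for illustration; not the Metrics Platform schema.
interface ExperimentInputs {
  readonly name: string;
  readonly sampleRate: number;
  readonly variants: readonly string[];
}

interface ExperimentRecord {
  readonly inputs: ExperimentInputs; // frozen at launch
  endDate: Date;                     // not an enrolment input, so it may change
}

function launch(inputs: ExperimentInputs, endDate: Date): ExperimentRecord {
  // Copy and freeze both the object and its variants array so that later code
  // cannot alter the enrolment inputs at runtime.
  const frozen: ExperimentInputs = Object.freeze({
    ...inputs,
    variants: Object.freeze([...inputs.variants]),
  });
  return { inputs: frozen, endDate };
}

function extendEndDate(record: ExperimentRecord, newEnd: Date): void {
  // Assumption: only extensions are allowed; an earlier date is rejected.
  if (newEnd.getTime() < record.endDate.getTime()) {
    throw new Error("The end date can only be extended");
  }
  record.endDate = newEnd;
}
```

Keeping the end date out of the frozen inputs reflects that it does not feed into variant assignment, so changing it cannot reassign users.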

Launch

Once your instrument code has been deployed to production, complete the launch-your-experiment form to configure your experiment and start collecting data.

Monitor

You can use the experimentation directory to monitor the progress of your experiment. Your event data will be available in a Hive table with the same name as the event stream, with dots and dashes replaced with underscores.
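
The naming rule above can be expressed as a small helper. This is an illustrative sketch: hiveTableName is a hypothetical function, and the stream names in the usage note are example values only.

```typescript
// Derives the Hive table name from an event stream name by replacing
// every dot and dash with an underscore.
function hiveTableName(streamName: string): string {
  return streamName.replace(/[.-]/g, "_");
}
```

For example, hiveTableName("product_metrics.web_base") returns "product_metrics_web_base", and hiveTableName("my-team.some-stream") returns "my_team_some_stream".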

Manage

The experimentation directory provides options to edit and turn off your experiment using the Actions menu.

Decommissioning

Instruments that capture metrics related to an experiment should be disabled once the experiment is complete. To decommission an experiment:

Turn off the experiment using the Actions menu.
Remove the instrument code that calls a Metrics Platform client API.

References

[1] mw:Product_Analytics/Glossary#Experiment
[2] This section is based on the discussion in T372108, "Document desired properties of an enrolment sampling algorithm".