Uber Open Sources a New Framework for Designing Optimal Statistical Experiments

Original article was published on Deep Learning on Medium

OED enables the scoring and optimization of experiments using Pyro’s probabilistic programming model.

Source: https://eng.uber.com/oed-pyro-release/

Rapid experimentation is a key element of modern software development. The rise in popularity of machine learning has brought robust experiment design to the forefront of software development methodologies. From data-noise reduction to hyperparameter optimization, experimentation runs through the entire lifecycle of machine learning programs. However, designing optimal experiments for a given problem remains a challenge. Recently, Uber open sourced an optimal experimental design (OED) framework built on top of its Pyro probabilistic programming language.

Designing optimal experiments is nothing new in the machine learning space, but the relevance of this field has certainly increased in the last few years. The difficulty of experiment design is that it covers seemingly unrelated tasks, such as designing an online survey or finding the optimal point in a learning cycle. Modeling such a diverse set of problems under a consistent structure requires a flexible methodology. Bayesian optimal experimental design (BOED) provides probability-based mathematical abstractions for interpreting the data and observations gathered during an experiment. At a high level, BOED establishes a Bayesian model for a given experiment and then looks for the design that maximizes the expected information gain (EIG) from running it.
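To make the EIG objective concrete, here is the standard textbook definition written out. The symbols are notation chosen for this sketch rather than taken from the article: d is the design, θ the unknown quantity of interest, y the experimental outcome, and H denotes entropy.

```latex
\mathrm{EIG}(d)
  = \mathbb{E}_{p(y \mid d)}\Big[ H\big[p(\theta)\big] - H\big[p(\theta \mid y, d)\big] \Big]
  = \mathbb{E}_{p(\theta)\, p(y \mid \theta, d)}\big[ \log p(y \mid \theta, d) - \log p(y \mid d) \big]
```

In words, a design scores highly when, on average, observing its outcome shrinks our uncertainty about θ the most.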

Uber’s new OED framework was inspired by some recent advancements in BOED research.

The Research

The research around BOED has been very active in the last few years. Uber itself has been immersed in a series of research efforts to advance BOED methods. Specifically, two research papers served as the core inspiration for the new OED framework.

In Variational Bayesian Optimal Experimental Design, researchers proposed four variational methods to estimate the EIG in experiments. Each method is based on different modeling assumptions, and together they cover the majority of real-world experiments. The paper explores the advantages and disadvantages of each estimator as well as its practical applicability to different types of experiments.
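As an illustration of what such an estimator looks like, one of the bounds discussed in that line of work is the variational posterior lower bound (often credited to Barber and Agakov). The notation below is an assumption of this sketch: q is an approximate posterior that is fitted to simulated data, d is the design, θ the latent variable and y the outcome.

```latex
\mathrm{EIG}(d) \;\ge\; \mathcal{L}_{\text{post}}(d)
  = \mathbb{E}_{p(\theta)\, p(y \mid \theta, d)}\big[ \log q(\theta \mid y, d) - \log p(\theta) \big]
```

The bound is tight when q matches the true posterior, so training q to maximize the right-hand side yields an EIG estimate without ever computing the intractable marginal p(y | d).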

In A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments, researchers proposed a stochastic gradient ascent approach to BOED. The proposed method tries to address the challenge of scoring all possible experiments in a design space and selecting the best one. The use of gradient ascent allows the selection of the optimal experiment without having to visit the entire design space.
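The idea of climbing the EIG surface with noisy gradients can be sketched on a toy problem. This is not the estimator from the paper: it is a simplified plug-in Monte Carlo gradient for a Bernoulli outcome with logit θ − d and a standard Normal prior on θ, and all function names and hyperparameters here are made up for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def eig_gradient(design, thetas):
    """Plug-in Monte Carlo gradient of EIG(d) = H[p(y|d)] - E[H[p(y|theta,d)]]."""
    probs = [sigmoid(t - design) for t in thetas]
    n = len(probs)
    p_bar = sum(probs) / n  # estimate of the marginal p(y=1 | d)
    # Derivative of each p_i with respect to the design is -p_i * (1 - p_i)
    dprobs = [-p * (1.0 - p) for p in probs]
    dp_bar = sum(dprobs) / n
    # The derivative of the binary entropy h(p) is log((1 - p) / p)
    grad_marginal = math.log((1.0 - p_bar) / p_bar) * dp_bar
    # For a logistic likelihood, log((1 - p_i) / p_i) equals design - theta_i
    grad_conditional = sum((design - t) * dp for t, dp in zip(thetas, dprobs)) / n
    return grad_marginal - grad_conditional


rng = random.Random(0)
design = 3.0         # deliberately poor starting design (assumed value)
learning_rate = 2.0  # assumed step size for this toy problem
for _ in range(500):
    # A fresh mini-batch of prior samples makes every gradient step stochastic
    thetas = [rng.gauss(0.0, 1.0) for _ in range(256)]
    design += learning_rate * eig_gradient(design, thetas)

print(round(design, 2))
```

By symmetry, the most informative design in this toy model sits at the prior mean of θ, which is 0, so the iterate should drift from 3.0 toward 0 without ever scoring a full grid of designs.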

Built on Top of Pyro

In order to implement the new BOED techniques, Uber required a probabilistic programming stack. Luckily, they had already built and open sourced one of the best available. Pyro is a deep probabilistic programming language (PPL) released by Uber AI Labs. Pyro is built on top of PyTorch and is based on four fundamental principles:

  • Universal: Pyro is a universal PPL — it can represent any computable probability distribution. How? By starting from a universal language with iteration and recursion (arbitrary Python code), and then adding random sampling, observation, and inference.
  • Scalable: Pyro scales to large data sets with little overhead above hand-written code. How? By building modern black box optimization techniques, which use mini-batches of data, to approximate inference.
  • Minimal: Pyro is agile and maintainable. How? Pyro is implemented with a small core of powerful, composable abstractions. Wherever possible, the heavy lifting is delegated to PyTorch and other libraries.
  • Flexible: Pyro aims for automation when you want it and control when you need it. How? Pyro uses high-level abstractions to express generative and inference models, while allowing experts to easily customize inference.

These principles often pull Pyro’s implementation in opposite directions. Being universal, for instance, requires allowing arbitrary control structure within Pyro programs, but this generality makes it difficult to scale. In general, however, Pyro strikes a good balance between these goals, making it one of the best PPLs for real-world applications.

OED in Pyro

OED leverages Pyro’s programming model for experiment design and optimization. Essentially, Pyro OED structures any experiment in three key stages:

1) Design: Model the controllable aspects of the experiment.

2) Observation: Run the experiment and collect the relevant data points.

3) Inference: Analyze the collected data and update the underlying model.

To illustrate how Pyro OED implements these concepts, let’s take the example of a famous psychology experiment that evaluates the memory capacity of a participant based on the longest string of random digits that they are able to remember. In that setting, the three stages of the experiment can be seen as follows:

1) Design: Select the length of the list that we would like the participants to remember.

2) Observation: Present the participants with lists of random digits and record whether or not they remember each list correctly.

3) Inference: Use a logistic regression model to estimate the probability that a person with a certain memory capacity remembers a list of digits of a given length.
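The logistic model in stage 3 can be sketched in a few lines of plain Python. The function name is hypothetical, and the logit is assumed to be the participant's capacity minus the list length, which mirrors the `theta - design` term used in the Pyro model later in this article.

```python
import math


def recall_probability(capacity, list_length):
    """Probability of remembering the list under logit = capacity - list_length."""
    return 1.0 / (1.0 + math.exp(-(capacity - list_length)))


# A hypothetical participant with capacity 5 facing lists of increasing length
for length in (3, 5, 7, 9):
    print(length, round(recall_probability(5.0, length), 3))
```

When the list length equals the capacity the probability is exactly one half, and it falls off as the list gets longer. That region of uncertainty is where a single trial tells us the most about the participant.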

The original post illustrates the design of that experiment in Pyro OED with a code listing (source: https://eng.uber.com/oed-pyro-release/).

The experiment starts by scoring all possible designs and selecting the best one. This is done using the EIG estimators explained in the previous section. After that, the code records the participant's response using the human_response function and updates the model's beliefs about the participant using the run_inference method.
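The score, observe, update loop described above can be sketched without Pyro by keeping the posterior on a grid. Everything here is an assumption made for illustration: the Normal(7, 2) prior, the candidate list lengths, the simulated participant that stands in for human_response, and the grid update that stands in for run_inference.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def binary_entropy(p):
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def expected_information_gain(design, grid, weights):
    """EIG for a Bernoulli outcome: H[p(y|d)] - E_theta[H[p(y|theta,d)]]."""
    probs = [sigmoid(theta - design) for theta in grid]
    marginal = sum(w * p for w, p in zip(weights, probs))
    conditional = sum(w * binary_entropy(p) for w, p in zip(weights, probs))
    return binary_entropy(marginal) - conditional


def update_posterior(design, remembered, grid, weights):
    """Bayes rule on the grid (stand-in for run_inference)."""
    unnorm = []
    for theta, w in zip(grid, weights):
        p = sigmoid(theta - design)
        unnorm.append(w * (p if remembered else 1.0 - p))
    total = sum(unnorm)
    return [u / total for u in unnorm]


rng = random.Random(0)
grid = [-1.0 + 0.05 * i for i in range(321)]  # capacity values from -1 to 15
prior = [math.exp(-0.5 * ((t - 7.0) / 2.0) ** 2) for t in grid]
prior_total = sum(prior)
weights = [w / prior_total for w in prior]    # assumed Normal(7, 2) prior
candidate_designs = list(range(1, 15))        # candidate list lengths
true_capacity = 5.5                           # hidden value of the simulated participant

chosen = []
for _ in range(10):
    # Design: score every candidate and keep the most informative one
    best = max(candidate_designs,
               key=lambda d: expected_information_gain(d, grid, weights))
    # Observation: simulated response (stand-in for human_response)
    remembered = rng.random() < sigmoid(true_capacity - best)
    # Inference: update beliefs about the participant's capacity
    weights = update_posterior(best, remembered, grid, weights)
    chosen.append(best)

posterior_mean = sum(w * t for w, t in zip(weights, grid))
print(chosen, round(posterior_mean, 2))
```

The first design chosen is the prior mean, and later designs follow the posterior as responses come in. A grid works here only because the model has a single latent variable. The estimators in Pyro OED target models where exact enumeration like this is not feasible.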

The key element of Pyro OED is the use of the EIG estimators to select the optimal design. Let’s look at the following code:

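import torch
import pyro
import pyro.distributions as dist
from pyro.contrib.oed.eig import nmc_eig

# Step 1: define the model as a function of the design
def model(design):
    # This line allows batching of designs, treating all batch dimensions as independent
    with pyro.plate_stack("plate_stack", design.shape):
        # We use a Normal prior for theta
        theta = pyro.sample("theta", dist.Normal(torch.tensor(0.0), torch.tensor(1.0)))
        # We use a simple logistic regression model for the likelihood
        logit_p = theta - design
        y = pyro.sample("y", dist.Bernoulli(logits=logit_p))
        return y

# Step 2: estimate the EIG of a design with the nested Monte Carlo estimator
# (design is a tensor of list lengths that must be defined by the caller)
eig = nmc_eig(model, design, observation_labels=["y"], target_labels=["theta"], N=2500, M=50)

# Step 3: stack candidate designs so they can be scored together as a batch
# (design1 and design2 are tensors of the same shape defined by the caller)
designs = torch.stack([design1, design2], dim=0)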

Once the model is set up, we can estimate the EIG of a given design with a specific estimator (step 2) or stack several candidate designs so that the estimate is computed across a grid of designs (step 3).
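To see what the N and M arguments of a nested Monte Carlo estimator control, here is a plain-Python sketch of the same idea for the Bernoulli model above. It is not Pyro's implementation: the function names are hypothetical, and n_outer and m_inner play the roles of the outer and inner sample counts.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def likelihood(y, theta, design):
    """p(y | theta, design) for a Bernoulli outcome with logit theta - design."""
    p = sigmoid(theta - design)
    return p if y == 1 else 1.0 - p


def nmc_eig_sketch(design, rng, n_outer=2000, m_inner=50):
    """Nested Monte Carlo estimate of E[log p(y|theta,d) - log p(y|d)]."""
    total = 0.0
    for _ in range(n_outer):
        # Outer sample: draw theta from the Normal(0, 1) prior, then simulate y
        theta = rng.gauss(0.0, 1.0)
        y = 1 if rng.random() < sigmoid(theta - design) else 0
        # Inner samples: approximate the marginal p(y | d) with fresh prior draws
        marginal = sum(likelihood(y, rng.gauss(0.0, 1.0), design)
                       for _ in range(m_inner)) / m_inner
        total += math.log(likelihood(y, theta, design)) - math.log(marginal)
    return total / n_outer


rng = random.Random(0)
candidate_designs = [0.0, 2.0, 4.0]  # a small, assumed grid of designs
scores = {d: nmc_eig_sketch(d, rng) for d in candidate_designs}
best_design = max(scores, key=scores.get)
print(scores, best_design)
```

More outer samples reduce the variance of the estimate, while more inner samples reduce the bias that comes from approximating the marginal likelihood inside the logarithm. The price is N × M likelihood evaluations per design, which is the cost the variational estimators aim to avoid.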

The open source release of Pyro OED will allow data scientists to rapidly model and optimize experiments using a consistent programming model. By building on top of Pyro, OED experiments can take advantage of an entire probabilistic programming framework and integrate easily with the PyTorch deep learning stack that Pyro itself is built on.