The term Lean startup was coined about ten years ago. Since then it has grown into one of the most influential methodologies for building startups, especially those in the category of web-based software companies. Lean came of age during the internet revolution. We now sit on the cusp of a different revolution, one ushered in by machine learning algorithms. It is safe to assume that most or all software in the near future will contain some element of machine learning. But how compatible is Lean with machine learning, in principle and in practice?
Validated learning and machine learning
According to the Lean startup philosophy, most startups fail not because of the quality of the product but because of incorrect assumptions about customers, i.e. startups end up building the perfect product for a non-existent customer. In order to avoid this fate, startups should first validate and refine their assumptions by running experiments to learn about their customers. This process is called validated learning. In Lean, validated learning is accomplished by performing repeated hypothesis testing.
The hypotheses are always assertions about the response of the customers given the functional definition of the product. In order to conduct an experiment, one would first need to define the product functionally (basically a list of features) — this defines a viable product. In addition, one also needs to define the quantitative measures of the customer’s response to the product. The functional product definition still leaves many (primarily technical) choices open. These choices should be made in a way that minimizes the expenditure of the most valuable commodity in a startup (usually time); such choices define the minimum viable product (MVP). One would then proceed to build the MVP, bring it to the customers, record their response, and determine the validity of the hypothesis. The experiments are usually set up as split tests with the current baseline product serving as the control and the proposed change serving as the treatment.
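To make this concrete, here is a minimal sketch of how the result of such a split test might be evaluated. The conversion counts are invented for illustration, and the test shown is a standard two-proportion z-test, not anything specific to Lean:

```python
import math

def split_test(conversions_a, n_a, conversions_b, n_b):
    """Two-proportion z-test: did the treatment (B) shift the
    conversion rate relative to the control (A)?"""
    p_a, p_b = conversions_a / n_a, conversions_b / n_b
    pooled = (conversions_a + conversions_b) / (n_a + n_b)  # rate under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p_value

# Hypothetical numbers: control converts 120/1000, treatment 150/1000.
z, p = split_test(120, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")  # adopt the treatment if p is small enough
```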
Notice that there is no scope for product-side uncertainty in Lean: given a functional definition, one knows with certainty whether a product that meets that definition can be built. This is true for traditional software. The behavior of a traditional software product is completely determined by the instructions you provide (i.e. the code). Therefore, given a full specification of its behavior, you can work “backwards” and figure out what instructions you need to provide to build a piece of software with the desired behavior. An MVP, in some sense, is the minimal set of instructions you need for this purpose. The behavior of a piece of software can vary with context, but the number of possible variations is usually rather small, and it makes sense to give a functional definition as the sum total of all such variations.
However, machine learning is fundamentally different. In machine learning, the product is a model generated by combining a set of instructions (an algorithm) with some data (the training data). The behavior of the model is thus determined not just by the algorithm but also by the training data. It is therefore not possible to make precise assertions about the behavior of a machine learning product in advance, before having access to the training data. Moreover, machine learning algorithms learn complex rules from massive amounts of data, so the set of possible variations in their behavior across contexts is very large. Thus, even with access to the training data, it is still not feasible or sensible to functionally define a machine learning product by manually listing all possible variations in its behavior: if you already knew exactly what an algorithm would learn from the data, you would not need to train the algorithm in the first place.
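To see this concretely, here is a small sketch (using scikit-learn, with toy data of my own invention) showing that the same algorithm, with the same settings, produces models with different behavior when trained on different data:

```python
from sklearn.tree import DecisionTreeClassifier

# Same algorithm, same hyperparameters, two different training sets.
X = [[0], [1], [2], [3]]
model_a = DecisionTreeClassifier(random_state=0).fit(X, [0, 0, 1, 1])
model_b = DecisionTreeClassifier(random_state=0).fit(X, [1, 1, 0, 0])

# Identical instructions, opposite behavior on the same input:
print(model_a.predict([[3]]), model_b.predict([[3]]))  # [1] [0]
```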
This presents a basic problem for validated learning with machine learning products. Simply stated, an experiment cannot test the validity of a hypothesis if the preconditions of the hypothesis cannot be guaranteed to hold: one cannot tell whether the test failed because the hypothesis is invalid or because the preconditions were not met. In statistical hypothesis testing, this is known as confounding.
In their unvarnished form, Lean experiments are not suitable for validated learning in the domain of machine learning, or in any other domain with significant product-side uncertainty. As discussed above, the root cause of this conundrum is that Lean prescribes hypothesis validation via repeated split testing as the means of validated learning.
Split testing was the MVP of validated learning
The goal of validated learning is really to understand the customer from empirical evidence. Split testing is an incredibly restrictive tool for that purpose. So why was it chosen in the first place? I believe this is a case of the Lean startup methodology finding its own MVP.
When Lean startup was created as a concrete methodology, its primary “customers” were web-based software companies. In this domain, split testing is the ubiquitous tool for product optimization (making small changes to the product given that you know your customer very well). This familiar tool was simply repurposed for validated learning.
At that time, the most visible domains with significant product-side uncertainty were mostly moonshots, such as finding a cure for cancer. It was unclear then, as it is now, whether Lean principles are useful for such moonshots. Thus, it made very little sense to complicate the formalism with less well-known tools. In other words, split testing is the MVP of validated learning.
It is time for a pivot
Machine learning changes everything: the primary goal of software is shifting from executing instructions to learning from data. For the Lean startup methodology, this represents a dramatic shift in its most important customer segment. The incompatibility of split testing with machine learning products means that it will soon no longer be a “viable product” for the purpose of validated learning. It is time to look for other options; it is time for a pivot.
For guidance, let us turn to the original source of inspiration for validated learning: scientific research. Lean borrows much of its methodology of learning through experiments from the scientific method. In science, however, experiments are expensive. Using experiments as a means of statistical hypothesis testing is done only in a few cases, such as when one needs approval from a supervisory body (e.g. clinical trials for drugs) or when there is no real underlying theoretical foundation.
In more mature fields of science, one usually does not move from experiment to experiment trying to validate hypotheses; instead, one tries to build theories. I will give a simplified description of this process in the following paragraphs. What follows is somewhat of a caricature, not entirely accurate, but it serves our purpose.
A theory is basically an explanation of a wide range of experimental facts. But usually the simplest explanations are not obtained in terms of objects that are directly observable but rather in terms of certain conceptual objects. For example, atomic theory can explain a variety of experimental facts. However, what you actually observe in experiments are not the atoms themselves but some blinks on a detector. Now we need a way, a map, to connect the atoms to the blinks. The map is the bridge between the theory world (consisting of atoms) and the observable world (consisting of blinks).
The map would be used to convert the results from the theory into predictions about observable quantities for that specific experimental setup. If the predictions from the theory are consistent with all existing observations and there are some new predictions, then new experiments might be commissioned depending on the importance of the new predictions. If any of the predictions from the theory is found to be inconsistent with existing observations then the theory needs to be modified.
Maps such as the one above are not simple lookup tables but sophisticated entities: they are accurate models of the experimental setup, such as the detector in the example above. In fact, detector physics is a subfield in its own right.
An important question to ask is: why do we need such maps? Why don’t we simply go straight from theory to experiment? There are many reasons, but the most important one, and the one most relevant for our discussion, is that these maps provide a separation of concerns. The goal of a theory is to provide the simplest possible explanations; one should be able to test the validity of a theory with experiments conducted in Shanghai, San Francisco, or São Paulo, without having to worry about the details of each experimental setup. The role of experiments, on the other hand, is to make it possible to compare multiple competing theories that try to explain the same phenomenon. If the burden of including the details of each experimental setup were placed on the theory itself, none of the above would be possible. Placing a map between theory and experiment provides the necessary decoupling between the theory world and the real world.
Now, if in the above discussion we replace the theory world with the product world and the real world with the customer world, we immediately notice the similarities with validated learning. The goal of validated learning is to find a map connecting product configurations with customer responses. For lack of a better term, I will simply call this the customer map. Finding product/market fit is equivalent to finding a good first approximation to the customer map. During the product optimization phase, one uses this customer map to build the right products for the right customers, and the map itself is further refined in the process.
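To fix ideas, one can picture the customer map as an object with a predict/update interface. The sketch below is my own illustration of the concept, not an established API:

```python
from typing import Mapping, Protocol

class CustomerMap(Protocol):
    """A learned bridge between the product world and the customer world."""

    def predict_response(self, config: Mapping, customer: Mapping) -> float:
        """Predicted response (e.g. conversion probability) of a customer
        segment to a product configuration."""
        ...

    def update(self, config: Mapping, customer: Mapping,
               response: float) -> None:
        """Refine the map with an observed (configuration, response) pair."""
        ...
```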
A new MVP of validated learning
How does one go about building a customer map? In general, it is terribly inefficient to conduct an experiment for every product configuration whose customer response one wants to know. This is, more or less, the approach that repeated split testing takes. What is needed instead is a tool that can generalize from data. As it happens, in the decade since Lean startup was introduced, we have perfected a tool that does just that. It is called machine learning.
Machine learning is the perfect tool for validated learning in a Lean startup. To make this claim more concrete, let me sketch how this might work conceptually. As we already discussed, a customer map is supposed to connect product configurations to customer responses. A recommender system is a machine learning algorithm that does exactly that. The quality of the mapping will obviously depend on the quality of the data used to train it. Thus, in the first phase the product configurations should be chosen to reduce the uncertainty in the mapping, while in the later phase we use the map to generate the best product configurations for a given customer. This is precisely the problem that contextual bandits solve. A contextual bandit is a reinforcement learning algorithm (reinforcement learning being a subfield of machine learning), and in it the two phases are known as explore and exploit. Thus, recommender systems in combination with contextual bandits can serve as a tool for validated learning, and this setup is a far superior replacement for repeated split testing.
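As a sketch of the explore/exploit loop, here is a deliberately simple epsilon-greedy contextual bandit over a discrete set of candidate product configurations. All names and numbers are illustrative; a production system would use a richer model (e.g. LinUCB or Thompson sampling):

```python
import random
from collections import defaultdict

class ContextualEpsilonGreedy:
    """Epsilon-greedy contextual bandit over discrete contexts and arms."""

    def __init__(self, arms, epsilon=0.1):
        self.arms = arms            # candidate product configurations
        self.epsilon = epsilon      # fraction of traffic spent exploring
        self.counts = defaultdict(int)    # (context, arm) -> trials
        self.values = defaultdict(float)  # (context, arm) -> mean reward

    def select(self, context):
        if random.random() < self.epsilon:
            # Explore: try a random configuration to reduce map uncertainty.
            return random.choice(self.arms)
        # Exploit: use the current estimates to pick the best configuration.
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def update(self, context, arm, reward):
        # Incrementally refine the map with the observed customer response.
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]

# Hypothetical usage: two page variants, customers keyed by segment.
bandit = ContextualEpsilonGreedy(arms=["variant_a", "variant_b"])
arm = bandit.select(context="new_visitor")
bandit.update(context="new_visitor", arm=arm, reward=1.0)  # e.g. a signup
```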
The approach to validated learning described above is fully compatible with machine learning products because it does not require one to set up hypotheses that are free of product-side uncertainty. As such, it does not suffer from the confounding problem inherent to repeated split testing.
The approach also does not require functional definitions of products in order to work. The product configurations in this approach can simply be technical specifications; for machine learning algorithms, this amounts to choices of algorithm type, parameter settings, training datasets, and so on. Nevertheless, it is useful to have a functional definition of the product to understand its relationship with the customer. And indeed, it is possible to provide a functional product definition operationally within the approach: the functionality of a product is fully defined by the responses it generates when used with the current customer map. Essentially, this is our best guess of how customers will react to the product. This is not a list of features, as in a traditional product definition, but it is potentially far more useful, because it can be analyzed with the tools of behavioral analytics to gain insight into how customers perceive the product.
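In terms of the hypothetical CustomerMap interface sketched earlier, such an operational definition is just the response profile the current map predicts for the product:

```python
def operational_definition(customer_map, config, segments):
    """The product 'defined' by the responses it is predicted to elicit:
    one predicted response per customer segment."""
    return {name: customer_map.predict_response(config, segment)
            for name, segment in segments.items()}
```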
To summarize, machine learning products are fully compatible with the principles of Lean. To realize this compatibility in practice, however, we need to abandon some of the antiquated tooling used for validated learning and replace it with modern machine learning tools. The central entity that emerges in this migration is the customer map, whose determination becomes the primary goal of validated learning, replacing hypothesis validation via repeated split testing.