Flywheel Metrics: The Economics of Machine Learning

I write some things in my newsletter that I don't always publish here on the blog. My last newsletter contained a brief explanation of why we are about to see machine learning bubble 1.0, and why it is entirely rational. My framework for thinking about this comes from a book I read this year (which I highly, highly, highly recommend) called Technological Revolutions and Financial Capital. The gist of why there will be a bubble is that it's a relatively safe bet that some massive companies will come out of the machine intelligence space. My guess is that you get a few dozen billion-dollar companies, 5-10 decabillion-dollar companies, and one centabillion-dollar company. That distribution seems to match what has happened in other tech revolutions. But it is very, very hard for people immersed in this space to pick which companies will end up that way, and it's even harder for investors who don't understand the technology or applications very well. In that scenario, it's entirely rational for a venture fund to throw money at machine learning startups hoping they hit one of the decabillion-dollar winners. After all, VC funds aren't mutual funds. The specific purpose of the money they raise is to take massive, risky bets that have huge payoffs if they are right. Of course, that type of behavior causes a bubble, but the bubble is, from each individual VC's perspective, entirely rational behavior.

As the bubble rises, pops, and the real companies survive, an economics of machine learning companies will be developed.  I've thought about this quite a bit and decided to take a stab at what the important metrics for a machine learning company might be.

I want to qualify this post by stating that many machine learning companies will just be Application X with machine learning added.  In those cases, the economics of those companies will be dictated more by the Application X space than by the machine learning space.  I want to focus on new companies - companies building new applications that weren't possible before machine learning exploded.

This post will be long enough without considering revenue, so let's look at the cost side today, and what is different from traditional companies.

New industries like this sometimes bear slightly higher infrastructure costs and personnel costs, as infrastructure isn't optimized for the new use cases, and qualified personnel are hard to find which drives up compensation.  But both of those trends typically right themselves over time, and the deltas from normal companies are bearable.  So what is different that we have to be worried about?

I think there are two key issues: data sets and human intervention. Initially, machine learning companies will take the low-hanging fruit of existing data sets, but over time, entrepreneurs will realize that if they could just buy or build a data set about X, they could apply machine learning and have a really valuable product. That could be really expensive. One of the key skills of the best machine learning entrepreneurs will be the ability to get creative about how to get data. Look for companies to track a metric that is something like Cost Per New Data Point. That metric will be compared to Value Per New Data Point. In other words, does the product improve enough from a new piece of data that it is worth it to acquire that data? This will be most applicable to companies that need to build unique data sets.
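To make the comparison concrete, here's a back-of-the-envelope sketch in Python. All of the numbers and variable names are invented for illustration; the point is just that the two metrics share a denominator, so the comparison is a simple per-unit break-even test.

```python
# Hypothetical data acquisition metrics for one quarter. Every figure here is made up.

acquisition_spend = 50_000       # dollars spent buying/building new data this quarter
new_data_points = 200_000        # labeled examples acquired with that spend
incremental_revenue = 80_000     # revenue attributed to the resulting product improvement

cost_per_new_data_point = acquisition_spend / new_data_points
value_per_new_data_point = incremental_revenue / new_data_points

print(f"Cost per new data point:  ${cost_per_new_data_point:.3f}")
print(f"Value per new data point: ${value_per_new_data_point:.3f}")
print("Worth acquiring more data:", value_per_new_data_point > cost_per_new_data_point)
```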

Entrepreneurs will try to find ways to lower the marginal cost of data acquisition. The trap you will have to watch for is companies where new data matters significantly more than older data. For example, if you apply machine learning to some kind of fashion-driven market, where your models from a year ago, and the data they were trained on, are useless, then that kind of company could be a sinkhole. The best companies will show some kind of network effect in data, similar to what social networking companies showed with users. The more data they have, the better they perform. The best performance gets them the most users, which in turn gives them even more data, faster than their competitors.

The other key set of metrics will be around how much human intervention is needed.  Depending on how ambitious the product is, there may be a little human touch, or a lot of human touch, required to make things work correctly.  Most companies will probably be able to get to a mix of 70% machine and 30% human pretty quickly, unless it is a really really ambitious project that takes longer to get the machines up to speed.  But companies will vary tremendously in how expensive and difficult it is to improve that last 30%.  It will depend a lot on the type of data and the underlying use case, but in general, using humans should help build data sets and at least keep improving the machines.  I think of this as a flywheel, and it may be the most economically important part of companies of this type.

The flywheel is the thing that, once it gets going, keeps going. Once the machines get good enough, they just do their thing, with less and less human intervention. The key question for this class of startups will be: what does it cost to jumpstart the flywheel? Just as many SaaS companies were told not to scale aggressively until $200K MRR or so, I think many machine learning companies will be told not to scale aggressively until the flywheel metrics look good. But what is "good"?

For most flywheels, we will want to measure the percent of tasks solved by machines versus the percent solved by humans, maybe as a machine/man performance ratio. We will also want to know the absolute numbers, and how they are changing. For example, if machines solve 85% of tasks and humans solve 15%, how is the total pie growing and what does that mean for our human labor costs? (The machine part will scale much more cheaply and is unlikely to be a major cost factor.) Depending on the complexity of the task, 15% could mean you still need a lot of humans, or it could mean you need very few.
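Here's a rough sketch of what that dashboard math might look like, again with invented numbers. The assumed human throughput figure is the hypothetical part; the takeaway is that if the machine share stays flat while volume grows, the human side still scales linearly with the business.

```python
# Hypothetical flywheel split for one month. All figures are illustrative assumptions.

total_tasks_per_month = 1_000_000
machine_solved = 850_000
human_solved = total_tasks_per_month - machine_solved

machine_share = machine_solved / total_tasks_per_month   # 85%
tasks_per_human_per_month = 5_000                          # assumed human throughput

humans_needed = human_solved / tasks_per_human_per_month
print(f"Machine share: {machine_share:.0%}, humans needed: {humans_needed:.0f}")

# If total volume doubles but the machine share stays flat, human headcount doubles too.
print(f"At 2x volume, humans needed: {2 * human_solved / tasks_per_human_per_month:.0f}")
```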

Another metric I expect to see is Human Performance Equivalent. This would be the cost to have the machine do task X rather than have a human do it. If the task is to classify something in a picture, and humans do that task for $.10/picture, then how many pictures can the machine classify for $.10? If it can do 5, then the Human Performance Equivalent is 5 people. This is different from the previous metric, which looked at what the machine couldn't solve. This metric looks at the cost of the machine solution versus a human solution.
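Worked through in code, the picture example above looks like this. The $0.02 machine cost per picture is my own assumption, chosen only so the ratio comes out to the 5-person equivalent in the example.

```python
# The picture-classification example, worked through with assumed costs.

human_cost_per_picture = 0.10     # what humans charge per classified picture
machine_cost_per_picture = 0.02   # assumed fully loaded machine cost per picture

# How many pictures can the machine classify for the price of one human classification?
human_performance_equivalent = human_cost_per_picture / machine_cost_per_picture
print(f"Human Performance Equivalent: {human_performance_equivalent:.0f} people")  # 5
```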

The last flywheel metric I expect to see is something like Human Contribution to Machine Improvement.  So, as humans intervene for certain tasks, the machine should be able to watch what the humans do, and improve.  How fast does this happen?  Does a human have to solve something once for the machine to learn it, or more like a few dozen times?  It will heavily depend on the tasks, models, and data for each company.
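One possible way to express that, sketched below with made-up numbers: how much the machine's solve rate moves per thousand human interventions (each of which presumably produces a new labeled example). This is just one plausible formulation of the metric, not a standard definition.

```python
# Hypothetical Human Contribution to Machine Improvement for one quarter.

human_handled_last_quarter = 150_000   # tasks humans solved (and thereby labeled)
machine_share_start = 0.80              # machine solve rate at quarter start
machine_share_end = 0.85                # machine solve rate at quarter end

improvement_per_1k_interventions = (
    (machine_share_end - machine_share_start) / (human_handled_last_quarter / 1_000)
)
print(f"Solve-rate gain per 1,000 human interventions: {improvement_per_1k_interventions:.4%}")
```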

So, the metrics of a company's flywheel will ultimately dictate how much capital is required to reach the point where humans no longer scale 1:1 with the customer base, and then the point where no additional humans are needed because the machine is improving fast enough that the existing humans can continually handle all the outliers. That's when the flywheel is in full force, and that is when machine learning companies will realize their true value.

Over the next five years, expect machine learning startups to talk about data acquisition costs and the man/machine economics of the flywheel, just the way SaaS entrepreneurs standardized on a vocabulary several years ago.

My thoughts on this aren't yet fully formed, but I wanted to get these ideas out of my head for feedback. Plus, writing forces me to think through them and make sure they are coherent. If you have any thoughts or comments, please leave them here, or email me. As I finalize my thoughts on the unique characteristics of machine learning revenue models, I'll share those as well.