Lean Throughput and Efficiency KPIs

tl;dr | Throughput and efficiency are qualities you must keep an eye on in a small tech startup. Through KPIs we can understand these qualities and drive them forward. I propose we use the product roadmap as the basis for these KPIs, and I show how we're doing this at engageSPARK. Oh, and there may be cookies.


How do you get everyone on the engineering team focused on the company's success? As engageSPARK grows, gets investment, and zooms in on the sweet spot of market fit, I think a lot about this question. With investment come investors, and I want to help them, too, understand how we're doing. There are different ways to approach this problem, but one tool sticks out in my mind: KPIs. In this article, I explain why and how we at engageSPARK measure throughput and efficiency, how we're not measuring them, and why.

About KPIs

Key Performance Indicators (KPIs) help a startup succeed. They do that by informing us how well we're doing. And when we know where to improve—and if we're measured by it, too—then we tend to do just that. Everyone wants to succeed.

Let's look at grandma's cookies, as an example. My grandma made fantastic cookies, but, of course, some were better than others. With some measure of pride, she'd fake indignation, exclaiming: Meine Güte, alle weg! Goodness, all gone! The speed at which her cookies disappeared was her measure of success. And so, my grandma made fantastic cookies. (Honestly, I will code for Nussecken.)

measure the wrong thing | Unfortunately, we often end up measuring the wrong thing, something that's unrelated to our success. And then we “improve” in that wrong direction, too, wasting time and energy along the way.

Imagine my grandma had measured and optimized for copious amounts of sugar. There probably is some correlation between the sweetness of sugar and children's happiness, so by and large it might have worked. But of course, she'd have missed out on all the cookies with little or no sugar, and poisoned us unnecessarily. Getting right what to measure is half the job.

Why would we even measure the wrong thing? I believe the answer lies mostly within human nature: we go the easy way. We prefer to measure what's easy to measure and easy to understand. This is especially true if the right answer is not obvious. Another fallacy, in my mind, stems from our industry's current focus on data. More on that below.

lean | A last thought on KPIs in general: it's easy to measure too much. My grandma could have kept track of the average weight of the cookies, the median height, the time in the oven. For a big corp it may be okay to measure more rather than less, and of course it also helps the economy by keeping entire departments employed, but for my grandma it would have been unnecessary.

For a startup, waste is dangerous, as it quickly becomes a competitive disadvantage. The time spent measuring could be used to move forward. More KPIs also mean more definitions of success, and that's distracting. So, restrict yourself to the minimum number of KPIs that will nudge you in the right direction.

Alright, KPIs are great, it's good to measure the right ones, it's bad to measure too much. Enough about them and my grandma: What can we measure in the engineering part of a tech startup?

Tech KPIs

A tech startup is all about learning its way to success. That means that its KPIs should answer the question: Does engineering help the company learn about its business fast?

Fast comes in two flavors: we want a lot, and we want it in a short amount of time. In other words: throughput and efficiency. Throughput helps us because we can try out many ideas and look at a problem from various angles. Efficiency gives us quick feedback on whether any of our ideas were worth anything. Together, they help us find market fit fast, if there is any to be found.

So, what do we measure throughput and efficiency with?

easy to measure | We often look at metrics that are easy to measure. How many story points accomplished this week? How many PRs opened, commits pushed? Those are great metrics to have (and they gain in importance as the company grows, see below), but they're not key, as in Key Performance Indicator.

Why not? Because they're too far removed from the value engineering is supposed to provide: learning. Commits, PRs, story points focus on individual developer contributions. Those are not key to learning.

control | Of course, there's a logic behind measuring individual contributions: efficiency from the ground up. If we make every individual fast, then naturally the team delivers many things in a short time. Such metrics also have the nice side effect that the company can notice when the performance of the team or of individuals degrades.

There are two problems with this thinking: First, we need to consider what we want to optimize for. Do you really want your team to focus on producing as many commits as possible? Does a spike in story points make you learn anything? For learning, commits and story points don't capture the granularity that counts. It's too easy to maximize story points and still deliver and learn little.

Second, if you need these metrics at an early stage to tell whether someone is slacking, you're too far removed from your team. Yes, it's more objective to have numbers, but really, they should tell you nothing you don't already know. Don't waste time.

when do startups learn? | To understand how to measure throughput and efficiency, it's important to first understand when we learn. I think we learn something when a user has the opportunity to notice a value that we offer.

For example, say we have a potential new feature, and to gauge interest, we put up a notification form for interested users to be contacted when the feature is ready. When a person decides to ignore the form or to fill it out, we learn. Note that this is different from ordinary product development: we also learn if the user fills the form with a name instead of an email address, namely about our UX skills. But that's irrelevant to the overall goal of learning about our business.

This is an MVP-level example, but it goes for big features, too: If we think replacing our reporting module with something way faster will make customers happier, how far do we need to go to learn about that assumption? Is it enough to implement the data gathering in the more efficient data model? To collect a story point for implementing an animated button? Nope. We need to get it usable by customers.

In the end, all this is about granularity. Somewhere, people discuss how to find market fit and how to evolve the product to do that. Those people build a product roadmap to capture this evolution. That is the level at which we should measure our throughput and efficiency.

Throughput is then measured as the number of roadmap items that got done. And efficiency is the time it took to deliver them to customers.

Note that roadmap items can be big. They're the “epics” of some product owners' manuals. They may easily span multiple sprints and even take months to complete. By that standard, we cannot ensure that we have any measurable throughput in any single sprint. And that's okay, because whatever we could break that roadmap item down into is not worth being measured by. It's not key.

A thought on efficiency: When does an item begin and, more importantly, when is it done? I think by and large in our industry we've got the right idea by now: “done” means deployed. (“When a user has the chance…”)

what engageSPARK measures | All this being said and done, what does this look like for engageSPARK? What do we measure in terms of throughput and efficiency? Here are our two KPIs, captured weekly:

  • Throughput: Number of completed roadmap items over the past four weeks (CRI): At the moment, our roadmap goes by quarter. So, we think we can expect a fair amount to be done within four weeks. Because roadmap items can take a long time, we use a sliding window of four weeks to smooth out the spikes in either direction that naturally occur. That means that on September 15, we look at how many roadmap items have been done between about August 15 and September 15. The size of this window, the number of weeks, is arbitrary. My thinking went like this: It needs to be more than a week, probably starts to make sense at three, and four resembles a month, and people understand a month: four it is.
  • Efficiency: Median Time to Customer (TTC): For those completed roadmap items, how many days did it take from the moment a member of the engineering team started working on an item until it was in production? I chose the median because of the variety in sizes of roadmap items. A few items on the roadmap are larger than life, and I'm not so interested in how long those take. The median gives me enough information about “average” items, both in absolute terms and in terms of changes over time. However, discussing the outliers will prove educational and help us optimize our processes. (Both calculations are sketched right after this list.)
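
To make these two definitions concrete, here is a minimal sketch in Python of how both numbers could be computed. It assumes roadmap items are tracked with the day work started and the day they reached production; the RoadmapItem structure, its field names, and the example items are illustrative, not a description of our actual tooling.

    from dataclasses import dataclass
    from datetime import date, timedelta
    from statistics import median

    @dataclass
    class RoadmapItem:
        name: str
        started: date           # day an engineer first worked on the item
        deployed: date | None   # day it reached production; None if still open

    def kpis(items: list[RoadmapItem], today: date) -> tuple[int, float | None]:
        """Return (CRI, median TTC in days) for the trailing four-week window."""
        window_start = today - timedelta(weeks=4)
        done = [i for i in items
                if i.deployed is not None and window_start <= i.deployed <= today]
        cri = len(done)
        ttc = median((i.deployed - i.started).days for i in done) if done else None
        return cri, ttc

    # Made-up example: two items completed in the window, taking 12 and 30 days.
    items = [
        RoadmapItem("faster reporting", date(2018, 8, 20), date(2018, 9, 1)),
        RoadmapItem("sms surveys", date(2018, 8, 10), date(2018, 9, 9)),
        RoadmapItem("ivr designer", date(2018, 9, 3), None),  # still in progress
    ]
    print(kpis(items, date(2018, 9, 15)))  # -> (2, 21.0)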

red/yellow/green | KPIs indicate performance. We try to make our performance clear by attaching traffic-light colors to them, clearly signaling whether we're doing well. I'm still not sure about the thresholds at which red becomes yellow and yellow becomes green; I think they'll change for a while before stabilizing. For throughput, we currently go by percentage: less than 10% of the total number of roadmap items completed is marked as red, less than 30% as yellow, and anything above is green. We'll iterate on this.
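
For completeness, that threshold rule as code; again just a sketch in Python, assuming nothing beyond the percentages above:

    def throughput_color(completed: int, total_roadmap_items: int) -> str:
        """Traffic-light color for throughput, based on the share of roadmap
        items completed. Thresholds are our current guesses and will likely
        change: below 10% red, below 30% yellow, anything above green."""
        share = completed / total_roadmap_items
        if share < 0.10:
            return "red"
        if share < 0.30:
            return "yellow"
        return "green"

    print(throughput_color(2, 12))  # ~17% of the quarter's roadmap -> "yellow"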

scaling up | As we scale engageSPARK up, the number of metrics we track and measure ourselves by will expand. In particular, I expect one more KPI to be added for efficiency, to track the outliers mentioned above: the standard deviation of the time to customer, or a similar tool. At the moment, completed roadmap items are relatively few and far between, rendering this metric unstable and “flaky”. But as new colleagues join and become productive, I expect this metric to become a useful KPI.
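
If and when we add it, the computation itself would be trivial; another small sketch, with made-up, hard-coded day counts:

    from statistics import pstdev

    ttc_days = [12, 30]        # days to customer for items completed in a window
    spread = pstdev(ttc_days)  # population standard deviation: 9.0 days here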

Also, we'll start measuring commits, PRs and story points. But, but, but, Murat, that's the opposite of what you just said!? Worry not, dear reader, I'm sticking by my opinion (for now). The key indicators will stay, but supporting metrics will be added. What for?

Part of building a dev machine is establishing a habit of constantly optimizing its processes. It's like oiling a physical engine. Where do we lose energy to friction? What is the bottleneck we should address next? The answers are staring me in the face right now; intuition suffices. As we grow, this will be less true, and I expect we'll increasingly need data.

Also, as we build out teams and add colleagues, it will be harder for me to keep my ear to the ground. I'll lose touch eventually. Data will then enable me to compare teams' performance over time and relative to each other, and to push for improvements by asking the right questions.

So, these supporting metrics will help us become faster, but what we measure our overall contribution by, the KPIs, will remain stable.

yey! and meh | All this is not The Truth. These are my thoughts at the moment. In six months I will tell you the opposite. So take it with a grain of Fleur de Sel. Here are some ups and downs to ponder:

Yey!

  • Two KPIs only. Less is more.
  • Little overhead to track and measure. (Items on the product roadmap are coarse.)
  • Focuses everyone in engineering on delivering important things.
  • Encourages ship-it mentality, in contrast to commit-it or push-it-to-QA mentality.

Meh

  • Coarseness makes trend-analysis harder. Small changes are meaningless to look at. (Well, are they ever worth it?)
  • Doing all the little stamina things that keep a tech project running gets little appreciation from those KPIs. Need to think about how to counter that.
  • I'll have another opinion in six months. ;)

See you in six months. Eating cookies in the meantime. Not eating them fast, though: They're not as good as grandma's. Maybe wrong KPIs.