Schrödinger’s cat and software productivity

A CEO wants to understand the value she is getting from the team whose salaries she is paying. A VP wishes to reward individuals in his team according to how they are performing.

In some groups such as Sales, this seems so straightforward. Yet like Schrödinger’s cat, software development teams can seem to be both productive and non-productive, according to how they are observed.

Attempts to measure developer performance seem inevitably to lead to conflict. W. Edwards Deming summed this up well:

The “merit rating” nourishes short-term performance, annihilates long-term planning, builds fear, demolishes teamwork, [and] nourishes rivalry and politics. It leaves people bitter and crushed.

W.E. Deming – “The Essential Deming” (2013)

Why is productivity such a challenging concept? And why specifically in software development? In this article I will cover three problems which make this so difficult, and suggest how we as Agile leaders might approach them.

  1. Local optimisation – productivity is a function of a value stream – a series of activities – not of an individual. Attempting to measure productivity at an individual (or even team) level is flawed.
  2. The observer effect – like Schrödinger’s cat we have a paradox of observation. We cannot understand productivity without measurement. But measurement risks changing the state of the system.
  3. Organisational language – terms like “productivity” and “performance” shift meaning as they are applied to different groups. We cannot “increase productivity” without agreeing what these mean.

Let’s look at each of these in turn.

Local optimisation

We can represent an organisational view of software product development using the Shewhart Cycle, popularised by Deming.

  • Plan – we create hypotheses for what will deliver value to customers. We could call these items “Features”.
  • Do – we create and deploy a set of deliverable outputs based on our hypotheses. This would include all the outputs the customer needs and values (code, documentation etc). It would also include all the outputs we need for our process (tests, CI systems etc).
  • Study (or Check) – we look at how the customer uses the new feature, at the market and customer needs, and at how well the product satisfies those needs.
  • Act – update our prioritisation and strategy to allow us to identify new areas where we could deliver customer value.

If this cycle represents our product development flow, then “productivity” is how effectively we execute this cycle. Moving around the cycle clearly incurs some cost to the organisation. It also (we intend) delivers some value to customers (and hence to the organisation). The ratio of the two is productivity – conventionally represented as value divided by cost.

Productivity = Value / Cost
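
To make the arithmetic concrete, here is a toy illustration in Python. All numbers are invented for the example; attributing real value is, of course, far harder than this.

```python
# Toy illustration of productivity = value / cost for one full
# Plan-Do-Study-Act cycle. All figures are invented for the example.

cycle_cost = 120_000       # hypothetical cost of one cycle: salaries, tooling, infrastructure
value_delivered = 180_000  # hypothetical value attributed to the delivered features

productivity = value_delivered / cycle_cost
print(f"Productivity: {productivity:.2f}")  # 1.50 - each unit of cost returned 1.5 units of value
```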

Viewed as a value stream, value generation crosses departmental boundaries and involves the whole organisation. Indeed it must involve the whole organisation: if we could create value without part of the organisation, why would that function be needed?

  • Plan is primarily led by Product Management, building and prioritising a roadmap and backlog.
  • Do is largely an Engineering activity. In many organisations it may even fragment further into Platform, UI, Documentation and Deployment.
  • Study is the responsibility of Product Management, Sales and Customer Relationship Managers, understanding the customer need.
  • Act is across Product Management, Sales and Business Development, agreeing the business opportunities we choose to follow.

We can measure productivity around the whole cycle of incurring cost and generating value. Any step performed well increases productivity; any step performed badly decreases it.

Let us take an example.

  • Imagine we are poor at listening to our customers (the “Study” stage).
  • As a result of our lack of knowledge, we make poor strategic decisions (the “Act” stage).
  • The features we plan (the “Plan” stage) will not satisfy customer needs, because they are driven by that flawed strategy.
  • The outputs we create (at the “Do” stage) are low value because we build the wrong features.

Our product development cycle now has low productivity (as the value of what we create is low). To solve the problem we must look at the end-to-end system and the flow of value. Local optimisation, as Lean emphasises, will not solve the problem.

The capacity of the plant is equal to the capacity of its bottlenecks

Eliyahu M. Goldratt, “The Goal”

The problem with most productivity measures is that they are applied at a team or group level. Engineering metrics capture only Engineering activities, which sit largely in the “Do” category. As Goldratt notes above, if these are not the bottleneck, improving them will give minimal benefit. However much we optimise the “Do”-ing in the above scenario – perhaps improving DORA metrics such as deployment frequency or fewer change failures – we will not substantially increase productivity. We are solving the wrong problem.

Speed is irrelevant if you are going in the wrong direction.

Attributed to Mahatma Gandhi

The Observer Effect

We are trained to believe that we can observe without affecting what we observe, but this is not guaranteed. As leaders, we must always consider the effect of our well-intentioned observation.

The observer effect occurs when the act of observation disturbs the observed system, so that we cannot discover the system’s undisturbed state. It is most notable in quantum mechanics, where the state of a system may be not just unknown but unknowable (or undefined) until it is observed.

“Schrödinger’s cat” is the most famous example, where a cat in a box is visualised as both alive and dead until the act of opening the box forces the resolution into one specific state.

What we observe is not nature itself but nature exposed to our method of questioning.

Werner Heisenberg

Organisations often want to pursue “performance management”. This traditional management approach measures “performance” and builds reward (and penalty) structures based on those measurements. Like much of Scientific Management, it is based on two principles.

  • The first is a “Theory X” belief that motivation is extrinsic – individuals will underperform unless managers create rewards and punishments.

Hardly a competent workman can be found who does not devote a considerable amount of time to studying just how slowly he can work

F.W. Taylor

  • The second principle is a reductionist one: that the best way to perform any work can be determined analytically, and that an individual can be measured against that standard.

It is only through enforced standardization of methods … that this faster work can be assured.

F.W. Taylor

The idea of “performance management” is that we can measure against an “ideal performance”, and that by controlling this measure we will improve productivity. It is often suggested that “performance” and “productivity” are the same thing, and easy to measure and control: if our metric increases, our productivity must increase.

Assessing contributions by individuals to a team’s backlog (starting with data from backlog management tools such as Jira) … can enable team leaders to manage clear expectations for output.

“Yes, you can measure software developer productivity” – McKinsey

However, by the observer effect, the metrics we choose will modify the behaviour of individuals and teams. Whatever the underlying state of the system, measuring “performance” and building reward structures on those measurements will necessarily change the activities of the team or individual.

The “performance management” approach assumes robust measures of “performance” which correlate tightly and predictably with productivity, so that individuals who score highly on the measure necessarily deliver the most value.

I believe this idea is deeply flawed in software development. The domain is complex, not reductionist. Self-managing teams outperform micromanaged ones. And most developers have a high level of intrinsic motivation.

Organisational Language

A central theme that I return to across organisations is organisational language. And I don’t just mean multi-national organisations, which add an extra layer of linguistic and cultural complexity. Different people with different backgrounds use the same words in different ways.

  • Your “test” may mean functional correctness against specification, while mine may mean usability by the customer.
  • Your “done” may be running without errors on a local machine and mine may be returning data from a live system.
  • Your “minor bug” may need an immediate patch release, while mine might relate to button alignment.

To understand productivity, we need to agree across the organisation what it means in the specific domain of software development. This is part of the wider challenge of integrating software development into the rest of the organisation. Concepts specific to software development (“branches”, “technical debt”, “refactoring”, to name but a few) are not widely understood (or explained) outside the domain.

As an example, an organisation where I recently worked moved to Semantic Versioning. This introduces a strict technical meaning to compatibility between releases and API versions. This made a lot of sense from the technical side and the resulting control over API changes was a real asset. However, from a marketing side it was a headache, rapidly revealing that terms like “major release” had very different meanings in different parts of the organisation. To the marketing team, “major” was a marketing term, not a technical term.
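
For readers outside engineering, here is a minimal sketch of the Semantic Versioning bump rules, using hypothetical version numbers:

```python
# Minimal sketch of the Semantic Versioning (MAJOR.MINOR.PATCH) bump rules.
# The version numbers and change categories are hypothetical examples.

def next_version(current: str, change: str) -> str:
    """Return the next version number for a given kind of change."""
    major, minor, patch = (int(part) for part in current.split("."))
    if change == "breaking":  # incompatible API change -> major bump
        return f"{major + 1}.0.0"
    if change == "feature":   # backwards-compatible functionality -> minor bump
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"  # backwards-compatible fix -> patch bump

print(next_version("2.3.1", "breaking"))  # 3.0.0 - "major" in the strict technical sense
```

Under this rule, “major” says nothing about marketing significance – only that the API has changed incompatibly – which is exactly where the two vocabularies collided.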

To developers, productivity often focusses on the speed and cost of code generation. This is why the DORA metrics are so popular. DORA – Google’s DevOps Research and Assessment team – identified four key metrics (deployment frequency, lead time for changes, change failure rate and time to restore service), heavily focussed on the develop/deploy interface.

But the DORA metrics require a strong understanding of software workflow, and they focus on a small, if important, part of the value stream. Time from commit to deploy may well matter. However, too little is usually done to explain to the wider business why “the efficiency of your delivery pipeline” should count as “productivity”.
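
As a sketch of what two of these metrics actually measure, here is a toy calculation from a hypothetical deployment log:

```python
# Toy sketch: computing two DORA metrics (deployment frequency and
# change failure rate) from a hypothetical log of deployments.
from datetime import date

# Hypothetical records: (deployment date, did this change cause a failure?)
deployments = [
    (date(2024, 5, 1), False),
    (date(2024, 5, 2), True),
    (date(2024, 5, 6), False),
    (date(2024, 5, 8), False),
]

days_covered = (deployments[-1][0] - deployments[0][0]).days or 1
deployment_frequency = len(deployments) / days_covered
change_failure_rate = sum(failed for _, failed in deployments) / len(deployments)

print(f"Deployment frequency: {deployment_frequency:.2f} per day")  # 0.57 per day
print(f"Change failure rate: {change_failure_rate:.0%}")            # 25%
```

Note how local these numbers are: they say nothing about whether any of the deployed changes delivered customer value.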

Performance is not (always) productivity

Senior managers often question why it is hard to measure productivity in software development, when it appears to be measured successfully in other parts of the organisation. In particular, there is confusion over why processes for performance management do not ensure productivity in software development.

The problems with using “performance management” to control productivity in software development are a mixture of all three of the factors above:

  1. Organisational language
  2. Local optimisation
  3. The observer effect

“Performance” is a poorly defined term – clearly an Organisational Language issue. Performance is not the same as productivity. If performance has a definition at all, it is a comparison of what is created against what was expected:

Performance = Output / Target

Consider how “performance” is measured successfully in other parts of the organisation. In Sales, for example, we assign targets for the number and value of sales; at the end of the period we assess the value achieved against the target and call this “performance”.

The issue is the correlation between “performance” and “productivity”. In a sales environment, an individual salesperson is a cost; more revenue from sales is therefore both higher performance (output / target) and higher productivity (value / cost). The two are tightly linked.

Of course, this simple correlation isn’t perfect. A salesperson can reduce value to the business by cannibalising sales from another salesperson. Or they can increase cost by requiring presales support or selling features which haven’t yet been built. But in general the mapping is close enough for most organisations to be comfortable.

It is not hard to measure “performance” in the same way in software development. Source lines of code (SLOC) was once a popular performance measure. But such measures capture only the volume of output. In a development group the correlation with productivity is far weaker: value comes from end-to-end flow through the value stream, not from local optimisation and individual behaviour.
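
To make the divergence concrete, here is a toy comparison with invented numbers:

```python
# Toy comparison (all numbers invented): "performance" against an output
# target can rise while "productivity" of the value stream stays poor.

# A developer comfortably beats a SLOC-style output target...
output, target = 12_000, 10_000  # hypothetical lines of code written vs. target
performance = output / target    # 1.2 - "exceeds expectations"

# ...while the end-to-end value stream delivers little.
value, cost = 40_000, 120_000    # hypothetical value delivered vs. cost incurred
productivity = value / cost      # 0.33 - low, despite the "high performer"

print(f"Performance: {performance:.2f}, Productivity: {productivity:.2f}")
```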

Finally, by the observer effect, if our reward structure is based on performance, we will boost measured performance. If performance and productivity are not aligned, we will not boost productivity – and may well reduce it.

Good Practices

If an objective measure of software development productivity is so challenging, what should we as Agile managers focus on?

Let’s look at each of the three issues raised in this article.

Local optimisation

Rather than optimising only our own team, or our own effectiveness as individual contributors, we can take an end-to-end view and optimise across the whole value stream. Collaborate between teams, discuss and assess areas of waste – handovers, delays, rework and so on – and target improvements there.

Observer effect

Creating high-profile metrics and measuring people by them will have a strong effect on behaviour – and this is rarely beneficial. Avoid over-focussing on individual metrics. Instead, take a broad spread of measures and use them to identify bottlenecks and areas for improvement, and to propose experiments.

Organisational language

Development teams suffer when they are not well integrated into the wider organisation. Make an effort to build understanding across the organisation. Avoid using deeply technical, narrowly targeted measures outside the group; instead focus on business language and on ensuring that the value of development teams is understood.


We should abandon the Taylorist idea that developer productivity is a “solvable” problem. Instead we should view it as a “complex” problem to be addressed at an organisational level, not an individual one.

While some parts of the business, such as Sales, can effectively measure individual output as a proxy for productivity, this works poorly in development teams. My own preference is to focus on improvement activity and to report reductions in waste. This probably has greater value and meaning than attempts to report increases in productivity.
