How to Measure Anything: Finding the Value of Intangibles in Business
By Douglas W. Hubbard (2014)
I have heard managers say that since each new product is unique, they cannot extrapolate from historical data… therefore, they have to rely on their experience. Note that this is said with no hint of irony.
Define the decision which measurements aim to influence.
- Don’t measure if it’s not going to affect a decision (e.g., metrics that only feed dashboards)
- What’s the trigger? E.g., if x is > n, then we take action y
- The wrong decision should have negative consequences
Value of information.
- Expected value of information, etc., to determine the worth of further investigation
- The less we know, the more valuable information is
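Hubbard frames this as expected opportunity loss (EOL): the chance of being wrong times the cost of being wrong. For a binary decision, the expected value of perfect information (EVPI) is the EOL of the alternative you would otherwise choose. A minimal sketch; all probabilities and dollar figures below are illustrative assumptions:

```python
# EVPI for a binary go/no-go decision (Hubbard's EOL framing).
# EOL = chance of being wrong * cost of being wrong; EVPI is the EOL
# of the chosen alternative. All numbers here are illustrative.

p_success = 0.7            # current belief that the project succeeds
loss_if_fail = 400_000     # cost incurred if we proceed and it fails
gain_if_success = 250_000  # benefit forgone if we cancel and it would have succeeded

# EOL of proceeding: we are "wrong" if the project fails.
eol_go = (1 - p_success) * loss_if_fail
# EOL of cancelling: we are "wrong" if it would have succeeded.
eol_no_go = p_success * gain_if_success

# We would pick the alternative with the lower expected opportunity loss;
# perfect information is worth exactly that residual expected loss.
evpi = min(eol_go, eol_no_go)
print(round(evpi))  # 120000
```

This is the "less we know, the more valuable information is" point in miniature: moving p_success toward 0.5 increases the EOL of the chosen alternative, and with it the value of measuring.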
When we decompose a decision this way we get new insights. First, you find that there are several other important variables that pertain to the judgment. You might find that there are a lot of other things to measure besides what you first thought you needed to measure, and that one of these new variables is the most important measurement of all… Second, it turns out that merely decomposing highly uncertain estimates provides a huge improvement to estimates.
Examples of low-value measurements:
- Time spent in an activity
- Attendance to sales training
- Near-term costs of a project
- Number of violations found in safety inspections
Examples of high-value measurements:
- Value of an activity
- Effect of sales training on sales
- Long-term benefits of project
- Reduction in risk of catastrophic accidents
Confidence intervals to express uncertainty: a narrow range (e.g., 80–85%) signals low uncertainty; a wide range (e.g., 50–80%) signals higher uncertainty.
Monte Carlo simulations for calculations with confidence intervals.
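One common way to run a Hubbard-style Monte Carlo: model each 90% CI as a normal distribution with mean at the midpoint and standard deviation (upper - lower) / 3.29, where 3.29 is twice the 90% z-score of 1.645. A minimal stdlib sketch; the input ranges are made-up examples:

```python
import random

# Monte Carlo over 90% confidence intervals: model each input as a
# normal whose 90% CI matches the estimator's range, i.e.
# mean = midpoint, sd = (upper - lower) / 3.29. Inputs are illustrative.

def normal_from_ci(lower, upper):
    """Return a sampler for a normal matching a 90% CI."""
    mean = (lower + upper) / 2
    sd = (upper - lower) / 3.29  # 3.29 ≈ 2 * 1.645 (z-score for 90%)
    return lambda: random.gauss(mean, sd)

units_saved = normal_from_ci(10_000, 20_000)  # hours saved per year
value_per_unit = normal_from_ci(15, 35)       # dollars per hour

random.seed(0)
trials = [units_saved() * value_per_unit() for _ in range(100_000)]
trials.sort()

# Summarize the simulated distribution of annual savings.
median = trials[len(trials) // 2]
p05, p95 = trials[5_000], trials[95_000]
print(f"median ≈ {median:,.0f}, 90% CI ≈ ({p05:,.0f}, {p95:,.0f})")
```

The payoff is the output distribution itself: instead of a single point estimate, you can read off the chance the savings fall below any threshold that would trigger a different decision.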
Our intuitions about sampling are way off.
Rule of five. Poll a sample of five, and the median of the complete population will be in the range of the five values with 93.75% probability (no matter the population size).
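The 93.75% follows from a simple argument: each sample lands above the population median with probability 1/2, so the only failure modes are all five above or all five below, i.e. 2 × (1/2)^5 = 1/16. A quick simulation on an arbitrary (hypothetical) skewed population:

```python
import random

# Rule of five: with 5 random samples, the population median lies
# between the sample min and max with probability
# 1 - 2 * 0.5**5 = 93.75%, regardless of the distribution.

random.seed(1)
population = sorted(random.lognormvariate(0, 1) for _ in range(100_001))
true_median = population[50_000]

hits = 0
trials = 20_000
for _ in range(trials):
    sample = [random.choice(population) for _ in range(5)]
    if min(sample) <= true_median <= max(sample):
        hits += 1

print(hits / trials)  # ≈ 0.9375
```

Swapping the lognormal for any other distribution leaves the hit rate unchanged, which is the distribution-free point of the rule.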
Single sample majority rule, assuming all values of the population proportion are equally likely (uniform distribution):
If you randomly select one sample from a large population, even one numbering in the thousands or millions, where you initially believed the population proportion could be anything between 0% and 100%, there is a 75% chance that the characteristic you observe in that sample is shared by the majority.
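The 75% checks out by integration — with the proportion p uniform on (0, 1), the match probability is ∫ max(p, 1 - p) dp = 3/4 — and also by simulation:

```python
import random

# Single sample majority rule: if the population proportion p is
# uniform on (0, 1), one random sample matches the majority with
# probability integral of max(p, 1-p) over (0, 1) = 0.75.

random.seed(2)
matches = 0
trials = 100_000
for _ in range(trials):
    p = random.random()                  # unknown population proportion
    sample_has_trait = random.random() < p
    majority_has_trait = p > 0.5
    if sample_has_trait == majority_has_trait:
        matches += 1

print(matches / trials)  # ≈ 0.75
```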
Types of measurement (Stanley Smith Stevens, psychologist):
- Nominal: something is in a set
- Ordinal: something is more or less than something else, but by how much is unknown (e.g., movie ratings)
- Interval: we know by how much but zero is arbitrary (e.g., Celsius)
- Ratio: zero is not arbitrary, it is nil (e.g., Kelvin, money)
The more homogeneous the population, the fewer samples needed.
Techniques for calibrating your estimates:
- Pretend to bet money: would you rather bet on your 90% confidence interval being right, or take a spin with a 90% chance of winning? If you prefer the spin, your interval is too narrow; widen it until you are indifferent
- Assume your estimate is wrong and explain why
- Look at the upper and lower bounds separately
- Start with an absurdly wide range and eliminate ridiculous values (avoids anchoring on a point estimate)
Examples where statistical prediction was shown to outperform experts (Paul Meehl and Robyn Dawes),
- College freshman GPAs
- Medical student performance
- Navy recruits’ bootcamp performance
From worst to best:
- Unstructured information, subjective estimation process
- Subjective weighted scores on arbitrary scales with no standardization: may add new errors rather than removing any (no improvement)
- Structured, consistently represented information, informal assessment—at least removes some error due to inconsistently presented information (some improvement)
- Simple linear model with standardized z-scores—slightly better at aggregating multiple factors than unaided judges (some improvement again)
- Lens Model or Rasch Model (both big improvements)
- Lens: removes inconsistency for a judge and bias due to unrelated factors
- Rasch: standardizes results of different judges, different tests, and different situations
- Objective model, if you can get the historical data (big improvement)
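The simple-linear-model step above amounts to standardizing each factor to a z-score and combining with fixed weights. A minimal sketch with made-up candidate data and equal weights (both are illustrative assumptions):

```python
import statistics

# Simple linear model over standardized z-scores: convert each factor
# to z = (x - mean) / sd, then combine with equal weights.
# Candidate data and equal weighting are illustrative assumptions.

candidates = {
    "A": {"test_score": 82, "experience_yrs": 4, "interview": 7},
    "B": {"test_score": 74, "experience_yrs": 9, "interview": 8},
    "C": {"test_score": 90, "experience_yrs": 2, "interview": 6},
}

factors = ["test_score", "experience_yrs", "interview"]
stats = {
    f: (statistics.mean(c[f] for c in candidates.values()),
        statistics.stdev(c[f] for c in candidates.values()))
    for f in factors
}

def score(c):
    # Equal-weight sum of z-scores; a Lens-model version would instead
    # fit the weights by regressing a judge's ratings on these factors.
    return sum((c[f] - stats[f][0]) / stats[f][1] for f in factors)

ranked = sorted(candidates, key=lambda name: score(candidates[name]),
                reverse=True)
print(ranked)  # ['B', 'A', 'C']
```

Standardizing first matters: without z-scores, whichever factor happens to have the largest raw numbers would dominate the sum regardless of its actual importance.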