National Well-Being Indexes and Goodhart’s Law
Dan Benjamin, Kristen Cooper, Ori Heffetz and I were invited to write a response to “A happy choice: wellbeing as the goal of government,” by Paul Frijters, Andrew E. Clark, Christian Krekel and Richard Layard. Our title is “Self-Reported Wellbeing Indicators Are a Valuable Complement to Traditional Economic Indicators but Aren’t Yet Ready to Compete With Them.” Our abstract gives our basic reaction:
We join the call for governments to routinely collect survey-based measures of self-reported wellbeing and for researchers to study them. We list a number of challenges that have to be overcome in order for these measures to eventually achieve a status competitive with traditional economic indicators. We discuss in more detail one of the challenges, comprehensiveness: single-question wellbeing measures do not seem to fully capture what people care about. We briefly review the existing evidence, suggesting that survey respondents, when asked to make real or hypothetical tradeoffs, would not always choose to maximize their predicted response to single-question wellbeing measures. The deviations appear systematic, and persist under conditions where alternative explanations are less plausible. We also review an approach for combining single-question measures into a more comprehensive wellbeing index — an approach that itself is not free of ongoing theoretical and implementational challenges, but that we view as a promising direction.
Of things that didn’t make it into the final version of our paper, my favorite was the bit about Goodhart’s Law as applied to measures of national well-being. The current incarnation of the Wikipedia article on “Goodhart’s Law” introduces Goodhart’s Law as follows:
Goodhart's law is an adage named after economist Charles Goodhart, which has been phrased by Marilyn Strathern as "When a measure becomes a target, it ceases to be a good measure."
There is more than one mechanism through which a target can cease to be a good measure, and it is useful to understand the different mechanisms behind Goodhart’s Law. The second paragraph in the quotation below is the bit about Goodhart’s Law. To give context, I also include the section heading and the first paragraph. (The first paragraph survived in the final paper mostly unchanged.)
Some risks of relying on single-question wellbeing indicators
One might conjecture that since responses to happiness or life satisfaction questions plausibly capture more of what people care about (despite not being fully comprehensive), they at least improve on traditional economic indicators. Whether this conjecture is true depends on whether people’s responses to these survey questions accurately weight the dimensions of wellbeing relative to each other. This remains an open question, but some of the evidence is negative. For example, in our hypothetical-choice paper mentioned above (Benjamin, Heffetz, Kimball, and Rees-Jones, 2012), we find that people are more likely to choose options involving higher income than to think that these options will increase their life satisfaction or happiness, which suggests that the survey measures underweight consumption of market goods relative to other dimensions of wellbeing. In other words, while traditional economic indicators likely overvalue income and consumption, single-question wellbeing indicators seem to swing too far in the opposite direction, undervaluing them.
Even if a single-question wellbeing indicator currently serves as a good proxy for overall wellbeing, if it misweights the dimensions of wellbeing relative to people’s own preferences then it will cease to be a good proxy once adopted as a policy target. The general principle has been expressed as “Goodhart’s Law” (Goodhart, 1975): when a measure that historically has been a good proxy for the underlying objective becomes a target, its properties change so much that it ceases to be a good proxy for the underlying objective (see also Holmstrom and Milgrom, 1991). For example, if people’s life-satisfaction ratings underweight (relative to people’s true preferences) the future, the welfare of people from other countries, or the welfare from market activity, then the government may spend too little on education, on foreign aid, or on things like basic income support—making life satisfaction, over time, a worse measure of preferences. In such scenarios, even if a government is well-intentioned, budget constraints mean that a policy goal of maximizing life satisfaction will, in effect, be searching for things that are likely to improve life satisfaction relative to other dimensions of well-being. To guarantee that there is no way for a single-question indicator to backfire as a policy target, one must be sure there is no way for that measure to go up while overall wellbeing goes down.
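A toy numerical sketch may make the misweighting mechanism concrete. It is not from the paper: the two dimensions (market consumption and everything else), the log functional form, the fixed budget, and the specific weights are all assumptions chosen purely for illustration. The sketch starts from the allocation that is best under people's assumed true weights, then moves to the allocation that maximizes a proxy that underweights market consumption, and checks how each index changes.

```python
import math

# Hypothetical weights on (market consumption, other dimensions of wellbeing).
# These numbers are illustrative assumptions, not estimates from any study.
TRUE_WEIGHTS = (0.5, 0.5)   # assumed weights in people's own preferences
PROXY_WEIGHTS = (0.2, 0.8)  # assumed survey measure that underweights consumption

def index(weights, allocation):
    """Weighted sum of log spending on each dimension (an assumed functional form)."""
    return sum(w * math.log(x) for w, x in zip(weights, allocation))

def best_allocation(weights, budget=1.0, steps=1000):
    """Grid-search the split of a fixed budget across two dimensions that maximizes the index."""
    candidates = [
        (budget * i / steps, budget * (steps - i) / steps)
        for i in range(1, steps)  # skip the endpoints, where log(0) is undefined
    ]
    return max(candidates, key=lambda a: index(weights, a))

# Allocation that is best for true preferences, and the one a proxy-maximizing policy picks.
start = best_allocation(TRUE_WEIGHTS)
target = best_allocation(PROXY_WEIGHTS)

# Change in each index when policy moves from `start` to `target`.
proxy_change = index(PROXY_WEIGHTS, target) - index(PROXY_WEIGHTS, start)
true_change = index(TRUE_WEIGHTS, target) - index(TRUE_WEIGHTS, start)

print(f"proxy-maximizing split: {target}")
print(f"change in proxy index:  {proxy_change:+.3f}")
print(f"change in true index:   {true_change:+.3f}")
```

Under these assumptions, the proxy index rises while the true index falls, which is exactly the case the last sentence of the quoted paragraph says must be ruled out before a single-question indicator is safe to use as a target.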