Statistically Controlling for Confounding Constructs is Harder than You Think—Jacob Westfall and Tal Yarkoni

Last week, I posted “Adding a Variable Measured with Error to a Regression Only Partially Controls for that Variable.” Today, to reinforce that message, I’ll discuss the PLOS ONE paper “Statistically Controlling for Confounding Constructs is Harder than You Think” (ungated), by Jacob Westfall and Tal Yarkoni. All the quotations in this post come from that article. (Both Paige Harden and Tal Yarkoni himself pointed me to this article.)

A key bit of background is that social scientists are often interested not just in prediction, but in understanding. Jacob and Tal write:

To most social scientists, observed variables are essentially just stand-ins for theoretical constructs of interest. The former are only useful to the extent that they accurately measure the latter. Accordingly, it may seem natural to assume that any statistical inferences one can draw at the observed variable level automatically generalize to the latent construct level as well. The present results demonstrate that, for a very common class of incremental validity arguments, such a strategy runs a high risk of failure.

What is “incremental validity”? Jacob and Tal explain:

When a predictor variable in a multiple regression has a coefficient that differs significantly from zero, researchers typically conclude that the variable makes a “unique” contribution to the outcome.

“Latent variables” are the underlying concepts or “constructs” that social scientists are really interested in. This passage distinguishes latent variables from the “proxies” actually in a data set:

And because measured variables are typically viewed as proxies for latent constructs of substantive interest … it is natural to generalize the operational conclusion to the latent variable level; that is, to conclude that the latent construct measured by a given predictor variable itself has incremental validity in predicting the outcome, over and above other latent constructs that were examined.

However, this is wrong, for the reason stated in the title of my post: “Adding a Variable Measured with Error to a Regression Only Partially Controls for that Variable.” Here, it is crucial to realize that any difference between the variable actually available in a data set and the underlying concept it is meant to proxy for counts as “measurement error.”

How bad is the problem?

The scope of the problem is considerable: literally hundreds of thousands of studies spanning numerous fields of science have historically relied on measurement-level incremental validity arguments to support strong conclusions about the relationships between theoretical constructs. The present findings inform and contribute to this literature—and to the general practice of “controlling for” potential confounds using multiple regression—in a number of ways.

Unless a measurement error model is used, or a concept is measured exactly, the words “controlling for” and “adjusting for” are red flags for problems:

… commonly … incremental validity claims are implicit—as when researchers claim that they have statistically “controlled” or “adjusted” for putative confounds—a practice that is exceedingly common in fields ranging from epidemiology to econometrics to behavioral neuroscience (a Google Scholar search for “after controlling for” and “after adjusting for” produces over 300,000 hits in each case). The sheer ubiquity of such appeals might well give one the impression that such claims are unobjectionable, and if anything, represent a foundational tool for drawing meaningful scientific inferences.

Unfortunately, incremental validity claims can be deeply problematic. As we demonstrate below, even small amounts of error in measured predictor variables can result in extremely poorly calibrated Type 1 error probabilities.

… many, and perhaps most, incremental validity claims put forward in the social sciences to date have not been adequately supported by empirical evidence, and run a high risk of spuriousness.

The bigger the sample size, the more confidently researchers will assert things that are wrong:

We demonstrate that the likelihood of spurious inference is surprisingly high under real-world conditions, and often varies in counterintuitive ways across the parameter space. For example, we show that, because measurement error interacts in an insidious way with sample size, the probability of incorrectly rejecting the null and concluding that a particular construct contributes incrementally to an outcome quickly approaches 100% as the size of a study grows.
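The sample-size effect can be illustrated with a small simulation. This is a minimal sketch, not the authors' own simulation design: the data-generating process, the function names, and the default `noise_sd=0.5` are all assumptions chosen for illustration. The outcome depends only on a latent construct; `x1` is a noisy measure of that construct used as the "control," and `x2` is correlated with the construct but has no effect of its own. The share of samples in which `x2` looks "significant" at the 5% level should rise toward 1 as the sample size grows, and stay near 5% when the control is measured without error.

```python
import math
import random


def t_stat_second_regressor(y, x1, x2):
    """t-statistic on x2 in an OLS regression of y on a constant, x1 and x2."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = s22 = s12 = s1y = s2y = syy = 0.0
    for yi, a, b in zip(y, x1, x2):
        dy, d1, d2 = yi - my, a - m1, b - m2
        s11 += d1 * d1
        s22 += d2 * d2
        s12 += d1 * d2
        s1y += d1 * dy
        s2y += d2 * dy
        syy += dy * dy
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    rss = syy - b1 * s1y - b2 * s2y        # residual sum of squares
    s2 = rss / (n - 3)                     # three estimated parameters
    return b2 / math.sqrt(s2 * s11 / det)


def false_positive_rate(n, sims=100, noise_sd=0.5, seed=0):
    """Share of simulated samples in which x2 is 'significant' at roughly 5%."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(sims):
        y, x1, x2 = [], [], []
        for _ in range(n):
            t = rng.gauss(0, 1)                     # latent construct
            y.append(t + rng.gauss(0, 1))           # outcome depends on t only
            x1.append(t + rng.gauss(0, noise_sd))   # noisy proxy used as control
            x2.append(0.5 * t + rng.gauss(0, 1))    # correlated, no own effect
        if abs(t_stat_second_regressor(y, x1, x2)) > 1.96:
            rejections += 1
    return rejections / sims


if __name__ == "__main__":
    for n in (50, 400, 3000):
        print(n, false_positive_rate(n))
```

Setting `noise_sd=0.0` in this sketch corresponds to a perfectly measured control, which is the benchmark case in which the usual 5% error rate applies.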

The fundamental problem is that the imperfection of the variables actually in data sets, as proxies for the concepts of interest, doesn’t just make it harder to know what is going on; it biases results. If researchers treat proxies as if they were the real thing, there is trouble:

In all of these cases—and thousands of others—the claims in question may seem unobjectionable at face value. After all, in any given analysis, there is a simple fact of the matter as to whether or not the unique contribution of one or more variables in a regression is statistically significant when controlling for other variables; what room is there for inferential error? Trouble arises, however, when researchers behave as if statistical conclusions obtained at the level of observed measures can be automatically generalized to the level of latent constructs [9,21]—a near-ubiquitous move, given that most scientists are not interested in prediction purely for prediction’s sake, and typically choose their measures precisely so as to stand in for latent constructs of interest.

Jacob and Tal have a useful section in their paper on statistical approaches that can deal with measurement error under assumptions that, while perhaps not always holding, are a whole lot better than the assumption that a concept is measured precisely by the proxy in the data for that concept. They also make the point that, after correctly accounting for measurement error (including any differences between what is in the data and the underlying concept of interest), often there is not enough statistical power in the data to say much of anything. That is life. Researchers should be engaging in collaborations to get large data sets that, properly analyzed with measurement error models, can really tell us what is going on in the world, rather than using data sets they can put together on their own that are too small to reliably tell what is going on. (See “Let's Set Half a Percent as the Standard for Statistical Significance.” Note also that preregistration is one way to make results at a less strict level of statistical significance worth taking seriously.) On that, I like the image at the top of Chris Chambers’s Twitter feed:

[Image: “What’s best for science”]

Dan Benjamin, my coauthor on many papers, and a strong advocate for rigorous statistical practices that can really help us figure out how the world works, suggested the following two articles as also relevant in this context:

How Weight Loss Happens: Mass In/Mass Out Revisited

Figure from the article above.


My personal knowledge of Weight Watchers is limited to a few scenes in “Mad Men,” when Betty Draper goes to Weight Watchers around 1970. But I am struck by the folly of participants feeling overjoyed that they had a good week if their weight went down half a pound since the previous weigh-in and feeling crushed if their weight went up half a pound since the previous weigh-in. In important measure out of morbid scientific curiosity, I weigh myself every day when I take a shower after work or after my daily walk on weekends or days I am working at home. The vagaries of my schedule—plus often having a short eating window in a day of as little as four hours—mean that I am sometimes weighing myself before eating, sometimes after, sometimes early in the day and sometimes late in the day. I can tell you that my weight can easily swing six pounds based on these variations. And trying to standardize things wouldn’t eliminate all of the variance.

The bottom line is that there is a huge mass-in/mass-out measurement error in any one weighing. (See “Mass In/Mass Out: A Satire of Calories In/Calories Out.”) In my experience, if I am trying to keep my weight even, it takes at least six weeks of weighing before I am confident whether I am staying even, losing weight or gaining weight in the long-run sense of burning fat or putting on fat on net.
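A rough calculation shows why a few weigh-ins say so little. The sketch below assumes one weighing per day, independent day-to-day noise, and a noise standard deviation of 1.5 pounds; that figure is a hypothetical value chosen for illustration, not a measurement, and real noise is probably correlated from day to day. Under those assumptions, the standard error of a fitted linear trend shrinks quickly as the number of days grows.

```python
import math


def slope_se_per_week(days, daily_sd):
    """Standard error (pounds per week) of an OLS trend line fitted to
    `days` consecutive daily weighings with independent noise of
    standard deviation `daily_sd` (pounds)."""
    # Sum of squared deviations of t = 0..days-1 from its mean.
    sum_sq = days * (days ** 2 - 1) / 12.0
    return 7.0 * daily_sd / math.sqrt(sum_sq)


if __name__ == "__main__":
    for weeks in (1, 2, 4, 6, 8):
        se = slope_se_per_week(7 * weeks, daily_sd=1.5)
        print(f"{weeks} weeks: trend known to about +/- {2 * se:.2f} lb/week")
```

With these assumed numbers, a trend of half a pound per week is lost in the noise after two weeks but stands out clearly after about six.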

Of course, if I am eating nothing, or almost nothing, in an extended fast, I am confident that I will be burning fat. (See “My Annual Anti-Cancer Fast” and “Increasing Returns to Duration in Fasting.”) During those extended fasts, I have been mystified by where the roughly .6 pounds per day of fat that I anticipate I will be losing actually goes. I am mystified no longer: Ruben Meerman and Andrew Brown, in the British Medical Journal article “When somebody loses weight, where does the fat go?” calculate that 10 kilograms of fat, combined with 29 kilograms of oxygen, departs the body as 28 kilograms of carbon dioxide exhaled and 11 kilograms of water. So losing weight by burning fat doesn’t require much in the way of excretion of solids; gases and liquids excreted pretty much do the job.
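The mass balance can be checked with a few lines of arithmetic. This sketch assumes the average triglyceride formula C55H104O6 and the complete-oxidation reaction C55H104O6 + 78 O2 -> 55 CO2 + 52 H2O; the function and variable names are illustrative only.

```python
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}  # g/mol

FAT = {"C": 55, "H": 104, "O": 6}   # assumed average triglyceride
O2 = {"O": 2}
CO2 = {"C": 1, "O": 2}
H2O = {"H": 2, "O": 1}


def molar_mass(formula):
    """Molar mass in g/mol for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[el] * count for el, count in formula.items())


def fat_oxidation_masses(fat_kg):
    """Masses (kg) of oxygen consumed and CO2 and water produced when
    `fat_kg` of fat is fully oxidized: fat + 78 O2 -> 55 CO2 + 52 H2O."""
    kmol_fat = fat_kg / molar_mass(FAT)   # g/mol equals kg/kmol
    return {
        "oxygen_in": 78 * kmol_fat * molar_mass(O2),
        "co2_out": 55 * kmol_fat * molar_mass(CO2),
        "water_out": 52 * kmol_fat * molar_mass(H2O),
    }


if __name__ == "__main__":
    for name, kg in fat_oxidation_masses(10.0).items():
        print(f"{name}: {kg:.1f} kg")
```

Mass is conserved in the calculation: the fat plus the oxygen going in weighs exactly as much as the carbon dioxide plus the water coming out.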

In “Mass In/Mass Out: A Satire of Calories In/Calories Out,” I make the point that mass in/mass out is every bit as much of a valid identity as calories in/calories out. But an identity does not make a complete theory of weight loss. One must also know how each term in an identity is regulated or otherwise determined. I maintain that in your natural environment, what you eat, and the schedule on which you eat it, will have a dramatic influence on the number of calories you end up consuming and the number of calories you end up burning. In particular, even a modest amount of sugar in your diet can make it very, very, very difficult to lose weight (or to keep from regaining weight that you have already lost), and eating all the time from when you wake to when you go to sleep can make it very difficult to lose weight or keep weight off, even if you have tiny meals and tiny snacks. You can easily experiment with this on yourself. For most people, the long-run expenditure of willpower needed to go off sugar and stay off sugar is a lot less than trying to cut back on calories in the long run while still eating substantial amounts of sugar. (See “Letting Go of Sugar” and “Live Your Life So You Don't Need Much Self-Control.”)

On the overall logic behind my views on weight loss, see “4 Propositions on Weight Loss” and for good ways to get started with serious weight loss, see “3 Achievable Resolutions for Weight Loss.”

I am eager to hear reports both from people who try the approach I recommend and find that it works and from people who try the approach I recommend and find that it doesn’t work. You can always tweet to me at

For annotated links to other posts on diet and health, see:

Exoplanets and Faith

I am pleased to see half of the Nobel Prize in Physics this year go to the first confirmed discovery of a planet orbiting a star like our sun. Since then, evidence for thousands of planets circling other stars has been gathered, including a kind of census conducted by the Kepler orbiting telescope, from which scientists drew this estimate:

There is at least one planet on average per star.[See abstract below.] About 1 in 5 Sun-like stars have an "Earth-sized" planet in the habitable zone.

I have had a longstanding interest in discoveries of planets around other stars. What I remember is how many false starts there were and the period when some scientists said that the lack of confirmed discoveries of planets around other stars meant that there might not be any. In hindsight, excessive optimism about the accuracy of detection methods led to a period of excessive pessimism about the existence of exoplanets.

To me, then, the eventual confirmed discoveries of exoplanets were a triumph of faith over doubt. By faith I simply mean a belief that influences action that, at the time, is based on inadequate evidence. In this sense, we all have to make decisions based on faith very frequently. I emphasize this point in my post “The Unavoidability of Faith.”

I’ll save any discussion of other intelligent life in the universe for another post, but I want to point out something very interesting about exoplanets from the standpoint of popular culture: because exoplanets are literally light-years away, sending probes to them is dauntingly difficult and might require not only key technological advances, but also enormous patience. But imaging exoplanets, while quite difficult, is something we can hope to do even in my lifetime, let alone in the lifetime of those who are now young graduate students. There is now a growing list of exoplanets that have officially agreed-upon proper names; there is hope that some exoplanets will become familiar to even elementary school students, as the list of their known properties grows.

It is hard to keep up with the onrushing discoveries about exoplanets, but I hope someone will put together a high-quality children’s book on exoplanets that reflects at least everything we know today. Both exoplanets themselves and their discovery are inspiring to me, and I think would be inspiring to many youngsters.

Adding a Variable Measured with Error to a Regression Only Partially Controls for that Variable

The Partitioned-Matrix Inversion Formula. This image first appeared in the post “The Partitioned Matrix Inversion Formula.” Image created by Miles Spencer Kimball. I hereby give permission to use this image for anything whatsoever, as long as that use includes a link to this blog. For example, t-shirts with this picture on them (among other things) would be great! :) Here is a link to the Wikipedia article “Block Matrix,” which talks about the partitioned matrix inversion formula.


In “Eating Highly Processed Food is Correlated with Death” I observe:

In observational studies in epidemiology and the social sciences, variables that authors say have been “controlled for” are typically only partially controlled for. The reason is that almost all variables in epidemiological and social science data are measured with substantial error.

In the comments, someDude asks:

"If the coefficient of interest is knocked down substantially by partial controlling for a variable Z, it would be knocked down a lot more by fully controlling for a variable Z. "

Does this assume that the error is randomly distributed? If the error is biased (i.e. by a third underlying factor), I would think it could be the case that a "fully controlled Z" could either increase or decrease the change in the coefficient of interest.

This post is meant to give a clear mathematical answer to that question. The answer, which I will back up in the rest of the post, is this:

Compare the coefficient estimates in a large-sample, ordinary-least-squares multiple regression (a) with an accurately measured statistical control variable, (b) with only an error-ridden measure of that statistical control variable, and (c) without the statistical control variable at all. Then every coefficient estimate in case (b) will be a weighted average of the corresponding estimate in case (a), with the statistical control variable measured accurately, and the corresponding estimate in case (c), with the statistical control variable excluded. The weight on (a), which shows how far inclusion of the error-ridden statistical control variable moves the results toward what they would be with an accurate measure of that variable, is equal to the fraction signal/(signal + noise), where “signal” is the variance of the accurately measured control variable that is not explained by variables that were already in the regression, and “noise” is the variance of the measurement error.

To show this mathematically, define:

Y: dependent variable

X: vector of right-hand-side variables other than the control variable being added to the regression

Z: scalar control variable, accurately measured

v: scalar noise added to the control variable to get the observed proxy for the control variable. Assumed uncorrelated with X, Y and Z.

Then, as the sample size gets large:

Define the following notation for the parts of the variance of Z and of the variance of Z+v that are orthogonal to X (that is, the parts that are unpredictable by X and so represent additional signal from Z that was not already contained in X, plus the variance of noise in the case of Z+v). One can call this “the unique variance of Z”:

I put a reminder of the partitioned matrix inversion formula at the top of this post. Using that formula, and the fact that the unique variance of Z is a scalar, one finds:

Thus, the OLS estimates are given by:

When only a noisy proxy for the statistical control variable is available (which is the situation 95% of the time), the formula becomes:

I claimed at the beginning of this post that the coefficients when using the noisy proxy for the statistical control variable were a weighted average of what one would get using only X on the right-hand side and what one would get using accurately measured data on Z. Note that what one would get using only X on the right-hand side of the equation is exactly what one would get in the limit as the variance of the noise added to Z (which is Var(v)) goes to infinity. So adding a very noisy proxy for Z is almost like leaving Z out of the equation entirely.

The weight can be interpreted from this equation:
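Written out in LaTeX, the steps described above lead to the following. This is a sketch reconstructed from the definitions of Y, X, Z and v given earlier; the symbols \Sigma_{XX}, \Sigma_{XZ}, \Sigma_{XY}, \sigma_{ZY}, U_Z, U_{Z+v}, \beta and \lambda are notation assumed here for illustration, and all coefficients denote large-sample limits of the OLS estimates.

```latex
\begin{align*}
% Population moments (X is a vector; Z and v are scalars)
\Sigma_{XX} &= \operatorname{Var}(X), \quad
\Sigma_{XZ} = \operatorname{Cov}(X,Z), \quad
\Sigma_{XY} = \operatorname{Cov}(X,Y), \quad
\sigma_{ZY} = \operatorname{Cov}(Z,Y) \\[4pt]
% Unique variances (v is uncorrelated with X and Z)
U_Z &= \operatorname{Var}(Z) - \Sigma_{XZ}'\,\Sigma_{XX}^{-1}\,\Sigma_{XZ},
\qquad
U_{Z+v} = U_Z + \operatorname{Var}(v) \\[4pt]
% Regression on X alone
\beta_X^{\text{excl}} &= \Sigma_{XX}^{-1}\Sigma_{XY} \\[4pt]
% Regression on X and the accurately measured Z (partitioned inverse)
\beta_Z^{\text{acc}} &= \frac{\sigma_{ZY} - \Sigma_{XZ}'\,\Sigma_{XX}^{-1}\,\Sigma_{XY}}{U_Z},
\qquad
\beta_X^{\text{acc}} = \beta_X^{\text{excl}} - \Sigma_{XX}^{-1}\Sigma_{XZ}\,\beta_Z^{\text{acc}} \\[4pt]
% Regression on X and the proxy Z+v: same numerator, larger denominator
\beta_{Z+v}^{\text{proxy}} &= \frac{\sigma_{ZY} - \Sigma_{XZ}'\,\Sigma_{XX}^{-1}\,\Sigma_{XY}}{U_Z + \operatorname{Var}(v)}
 = \lambda\,\beta_Z^{\text{acc}},
\qquad
\beta_X^{\text{proxy}} = \beta_X^{\text{excl}} - \Sigma_{XX}^{-1}\Sigma_{XZ}\,\lambda\,\beta_Z^{\text{acc}} \\[4pt]
% Weighted-average form and the weight
\beta_X^{\text{proxy}} &= \lambda\,\beta_X^{\text{acc}} + (1-\lambda)\,\beta_X^{\text{excl}},
\qquad
\lambda = \frac{U_Z}{U_Z + \operatorname{Var}(v)}
\end{align*}
```

The coefficient on the proxy fits the same pattern, since leaving the control variable out amounts to giving it a coefficient of zero: \lambda \beta_Z^{acc} + (1 - \lambda) \cdot 0.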


As noted at the beginning of the post, the right notion of the signal variance is the unique variance of the accurately measured statistical control variable. The noise variance is exactly what one would expect: the variance of v.

I have established what I claimed at the beginning of the post.
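The claim can also be checked numerically. The sketch below works directly with population moments, assuming a scalar X and a hypothetical model Y = X + 2Z + noise with Var(X) = Var(Z) = 1, Cov(X, Z) = 0.5 and Var(v) = 0.25; all of these numbers and function names are illustrative assumptions.

```python
def ols_two_regressors(var_x, var_w, cov_xw, cov_xy, cov_wy):
    """Large-sample OLS coefficients on (x, w) from population moments."""
    det = var_x * var_w - cov_xw ** 2
    b_x = (var_w * cov_xy - cov_xw * cov_wy) / det
    b_w = (var_x * cov_wy - cov_xw * cov_xy) / det
    return b_x, b_w


def compare(var_x=1.0, var_z=1.0, cov_xz=0.5, beta_x=1.0, beta_z=2.0, var_v=0.25):
    """Coefficients with Z excluded, Z accurate, and Z replaced by Z + v."""
    # Moments implied by Y = beta_x * X + beta_z * Z + noise.
    cov_xy = beta_x * var_x + beta_z * cov_xz
    cov_zy = beta_x * cov_xz + beta_z * var_z
    excluded_x = cov_xy / var_x
    accurate_x, accurate_z = ols_two_regressors(var_x, var_z, cov_xz, cov_xy, cov_zy)
    # v is uncorrelated with X, Y and Z, so only the proxy's variance changes.
    proxy_x, proxy_z = ols_two_regressors(var_x, var_z + var_v, cov_xz, cov_xy, cov_zy)
    unique_var_z = var_z - cov_xz ** 2 / var_x
    weight = unique_var_z / (unique_var_z + var_v)
    return {
        "excluded_x": excluded_x,
        "accurate_x": accurate_x,
        "accurate_z": accurate_z,
        "proxy_x": proxy_x,
        "proxy_z": proxy_z,
        "weight": weight,
    }


if __name__ == "__main__":
    r = compare()
    print(r)
    print("weighted average for X:",
          r["weight"] * r["accurate_x"] + (1 - r["weight"]) * r["excluded_x"])
```

In this example the proxy regression's coefficient on X coincides with the weighted average of the accurate-Z and excluded-Z coefficients, with the weight equal to signal/(signal + noise).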

Some readers may feel that the limitation of Z being a single scalar variable is a big limitation. One can generalize the results to more statistical control variables. First, the results apply when adding many statistical control variables or their proxies, one at a time, sequentially. Second, one can show that if the parts of Z1 and Z2 that are orthogonal to X are themselves orthogonal to each other, then the effects of adding Z1 and Z2 are additive. Third, if one has a set of correlated statistical control variables or their proxies that one wants to add, one can (A) transform units so the noise variance looks the same for each of these additional statistical control variables or their proxies (sphericalizing the noise), (B) orthogonalize relative to X, then (C) find the principal components of the remainder of these statistical control variables or their proxies (which will have the same eigenvectors because of the sphericalization of the noise), then note that the effects of each of the principal components are now additive.

Conclusion: Almost always, one has only a noisy proxy for a statistical control variable. Unless you use a measurement error model with this proxy you will not be controlling for the underlying statistical control variable. You will only be partially controlling for it. Even if you do not have enough information to fully identify the measurement error model you must think about that measurement error model and report a range of possible estimates based on different assumptions about the variance of the noise.
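One way to report such a range is to invert the weighted-average relationship: given the coefficient without the control, the coefficient with the noisy proxy, and the proxy's unique variance (its residual variance after regressing it on X), each assumed noise variance implies a value for the fully controlled coefficient. The sketch below is a minimal illustration of that idea; the function name and the example numbers are hypothetical.

```python
def implied_full_control_estimate(b_excluded, b_proxy, proxy_unique_var, noise_var):
    """Coefficient implied under full control, for an assumed noise variance.

    Inverts b_proxy = w * b_full + (1 - w) * b_excluded, where
    w = (proxy_unique_var - noise_var) / proxy_unique_var.
    """
    signal = proxy_unique_var - noise_var
    if signal <= 0:
        raise ValueError("assumed noise variance leaves no signal in the proxy")
    weight = signal / proxy_unique_var
    return b_excluded + (b_proxy - b_excluded) / weight


if __name__ == "__main__":
    # Hypothetical inputs: coefficient 2.0 without the control,
    # 1.25 with the noisy proxy, proxy unique variance 1.0.
    for noise_share in (0.0, 0.1, 0.25, 0.4):
        b = implied_full_control_estimate(2.0, 1.25, 1.0, noise_share * 1.0)
        print(f"noise share {noise_share:.2f}: implied coefficient {b:.3f}")
```

A noise share of zero reproduces the proxy estimate itself; larger assumed noise shares push the implied estimate further away from the estimate that omits the control.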

Remember that any departure from the absolutely correct theoretical construct can count as noise. For example, one might think one has a totally accurate measure of income, but income is really acting as a proxy for a broader range of household resources. In that case, income is a noisy proxy for the household resources that were the correct theoretical construct.

I strongly encourage everyone reading this to vigorously criticize any researcher who claims to be statistically controlling for something simply by putting a noisy proxy for that thing in a regression. This is wrong. Anyone doing it should be called out, so that we can get better statistical practice and get scientific activities to better serve our quest for the truth about how the world works.

Here are links to other posts that touch on statistical issues:

The Carbohydrate-Insulin Model Wars

Writing on diet and health, I have been bemused to see the scientific heat that has raged over whether a lowcarb diet leads people to burn more calories, other things equal. It is an interesting question, because it speaks to whether in the energy balance equation

weight gain (in calorie equivalents) = calories in - calories out

the calories out are endogenous to what is eaten rather than simply being determined directly by conscious choices about how much to exercise.

My own view is that, in practice, the endogeneity of calories in to what is eaten is likely to be a much more powerful effect than the endogeneity of calories out to what is eaten. Metabolic ward studies are good at analyzing the endogeneity of calories out, but by their construction, abstract from any endogeneity of calories in that would occur in the wild by tightly controlling exactly what the subjects of the metabolic ward study eat.

The paper flagged at the top by David Ludwig, Paul Lakin, William Wong and Cara Ebeling is the latest salvo in an ongoing debate about a metabolic ward study done by folks associated with David Ludwig (including David himself). Much of the discussion is highly technical and difficult for an outsider to fully understand. But here is what I did manage to glean:

  1. Much of the debate is arising because the sample sizes in this and similar experiments are too small. I feel the studies that have been done so far amply justify funding for larger experiments. I would be glad to give input on my opinions about how such experiments could be tweaked to deliver more powerful and more illuminating results.

  2. One of the biggest technical issues beyond lower-than-optimal power involves how to control statistically for weight changes. Again, it is not so easy to fully understand all the issues with the time it is appropriate for me to devote to a single blog post, but I think weight changes need to be treated as an indicator of amount of fat burned with a large measurement error due to what I have called “mass-in/mass-out” effects. (See “Mass In/Mass Out: A Satire of Calories In/Calories Out.”) Whenever a right-hand-side variable is measured with error relative to the exactly appropriate theoretical concept, a measurement error model is needed in order to get a consistent statistical estimate of the parameters of interest. I’ll write more (see “Adding a Variable Measured with Error to a Regression Only Partially Controls for that Variable”) about what happens when you try to control for something by using a variable afflicted with measurement error. (In brief, you will only be partially controlling for what you want to control for.)

  3. David Ludwig, Paul Lakin, William Wong and Cara Ebeling are totally correct in specifying what one should focus on as the hypothesis of interest:

Hall et al. set a high bar for the Carbohydrate-Insulin Model by stating that “[p]roponents of low-carbohydrate diets have claimed that such diets result in a substantial increase in … [TEE] amounting to 400–600 kcal/day”. However, the original source for this assertion, Fein and Feinman [18], characterized this estimate as a “hypothesis that would need to be tested” based on extreme assumptions about gluconeogenesis, with the additional qualification that “we [do not] know the magnitude of the effect.” An estimate derived from experimental data—and one that would still hold major implications for obesity treatment if true—is in the range of 200 kcal/day [3]. At the same time, they set a low bar for themselves, citing a 6-day trial [16] (confounded by transient adaptive responses to macronutrient change [3]) and a nonrandomized pilot study [5] (confounded by weight loss [8]) as a basis for questioning DLW methodology. Elsewhere, Hall interpreted these studies as sufficient to “falsify” the Carbohydrate-Insulin Model [19]—but they do nothing of the kind. Indeed, a recent reanalysis of that pilot study suggests an effect similar to ours (≈250 kcal/day) [20].

Translated, this says that a 200-calorie-a-day difference is enough to be interesting. (Technically, the authors say “kilocalories,” but dieters always call kilocalories somewhat inaccurately by the nickname “calories.”) That should be obvious. For many people, 200 calories would be around 10% of the total calories they would consume and expend in a day. If a 200-calorie-a-day difference isn’t obvious beyond statistical noise, a metabolic ward study is definitely underpowered and needs a bigger sample!

Conclusion. In conclusion, let me emphasize again that the big issue with the worst carbs is that they make people hungry again relatively quickly, so that they eat more. (See “Forget Calorie Counting; It's the Insulin Index, Stupid” for which carbs are the worst.) Endogeneity of calories in might be a bigger deal than endogeneity of calories out. Moreover, because it is difficult for the body to switch back and forth between burning carbs and burning fat, a highcarb diet makes it painful to fast, while a lowcarb highfat diet when eating makes it relatively easy to fast. And fasting (substantial periods of time with no food, and only water or unsweetened coffee and tea as drinks) is powerful both for weight loss and many other good health-enhancing effects.

Update: David Ludwig comments on Twitter:

Perhaps: “endogeneity of calories in to what is eaten is likely to be a much more powerful effect than the endogeneity of calories out to what is eaten.” But the latter is a unique effect predicted by CIM. And if CIM is true, both arise from excess calorie storage in fat cells.

For annotated links to other posts on diet and health, see:

Here are some diet and health posts on authors involved in the Carbohydrate-Insulin Model Wars:

John Locke Against Tyranny

The last five chapters of John Locke’s 2d Treatise on Government: Of Civil Government (XV–XIX) are an extended argument that the rule of tyrants is illegitimate and that the people are justified in overthrowing tyrants. The three chapters right before that (XII–XIV) lay out some of the things a ruler can appropriately do, providing a contrast to tyranny. The titles of my blog posts on these chapters provide a good outline of John Locke’s argument here. Take a look.

Chapter XII: Of the Legislative, Executive, and Federative Power of the Commonwealth

Chapter XIII: Of the Subordination of the Powers of the Commonwealth

Chapter XIV: Of Prerogative

Chapter XV: Of Paternal, Political, and Despotical Power, considered together

Chapter XVI: Of Conquest

Chapter XVII: Of Usurpation

Chapter XVIII: Of Tyranny

Chapter XIX: Of the Dissolution of Government

Links to posts on the earlier chapters of John Locke's 2d Treatise can be found here:

Posts on Chapters I–III: John Locke's State of Nature and State of War

Posts on Chapters IV–V: On the Achilles Heel of John Locke's Second Treatise: Slavery and Land Ownership

Posts on Chapters VI–VII: John Locke Against Natural Hierarchy

Posts on Chapters VIII–XI: John Locke's Argument for Limited Government

How Negative Interest Rates Affect the Economy

Recently, I had an email query from a journalist about negative interest rates—asking in particular about how they would affect the economy. In answering, I was mindful of some of the criticisms that have been made of negative interest rates as a policy tool. I thought my readers might be interested in what I wrote, even though it didn’t make it into the newspaper article. Here it is:

Other countries have cut rates to as low as -.75%. From that experience, we know that going to negative rates as low as -.75% works just like any other rate cut in the Fed's target rate. Potential issues such as strains on bank profits or large-scale paper currency storage may arise at rates below -.75%, but not at mild negative rates.

Rate cuts work in every corner of the economy to encourage investment and consumption spending both by shifting the balance of power in favor of those most apt to spend and by giving an incentive to spend. In the case of negative rates, the carrot for those who spend is coupled with a stick for those sitting on a pile of cash they resist putting to good use.

Other than banks that worry about things that haven't happened yet anywhere and those who have higher-rates-are-good ideologies or simply don't understand negative rates, complaints about negative interest rates are likely to come from those who don't want to spend.

Feel free to quote this.

My sentence

Rate cuts work in every corner of the economy to encourage investment and consumption spending both by shifting the balance of power in favor of those most apt to spend and by giving an incentive to spend.

is shorthand for what I say in these posts about the transmission mechanism for negative interest rates:

Many of the details I give about the experience with negative interest rates so far are taken from my new IMF Working Paper with Ruchir Agarwal: “Breaking Through the Zero Lower Bound” (pdf) (or on IMF website).

I have an annotated bibliography of what I have written on negative interest rate policy at this link.

The Four Food Groups Revisited

Image created by Miles Spencer Kimball. I hereby give permission to use this image for anything whatsoever, as long as that use includes a link to this blog. In this blog post I question my assertion of half a century ago that what is depicted above makes for a good diet.


In elementary school, back in the 1960’s, I drew an illustration of what were then called “The Four Food Groups” as a school assignment. Historically, the formulation of the recommendations to eat a substantial amount from each food group each day may have owed as much to agricultural and broader food business lobbying as to nutrition science. But those recommendations were not as much at variance with reasonably informed nutritional views back then as they are with reasonably informed nutritional views now. Let me give you my view on these four food groups.

Milk Group

I consume quite a bit of milk and cheese, but only because I love dairy. I think of milk and cheese as being somewhat unhealthy. There are two issues. One is the issue that animal protein might be especially good fuel for cancer cells. I wrote about that in these posts:

The other is that the majority of milk sold is from cows with a mutation that makes a structurally weak protein from which a truly nasty 7-amino-acid peptide breaks off. Fortunately, that issue can be largely avoided by eating goat and sheep cheese rather than cow cheese and by drinking A2 milk (which I just saw at Costco yesterday; I have seen it for a while at Whole Foods and, in my area, at Safeway). I wrote about that in these posts:

If you do consume milk, I have some advice here to drink whole milk. (100 calories worth of whole milk will be more satiating than 100 calories of skim milk.)

As for cream and butter, since they have relatively little milk protein, and are quite satiating, I think of them as being some of the healthiest dairy products, though their calories do count in these circumstances:

To preview what I will say again below, in bread and butter, it is the bread that is unhealthy, not the butter. And, in the extreme, eating butter straight is a lot better than the many ways we find to almost eat sugar straight. Eating sugar will make you want more and more and more. At least eating butter straight is self-limiting because butter is relatively satiating.

Meat Group

Meat has the same problem milk does: animal protein typically being abundant in amino acids such as glycine that are especially easy for even metabolically damaged cancer cells to burn as fuel.

Also, because of the protein content, many types of meat ramp up insulin somewhat, as you can see from the tables in “Forget Calorie Counting; It's the Insulin Index, Stupid.” David Ludwig points out that meat often also raises glucagon, which is a little like an anti-insulin hormone, but in my own experience, eating beef, for example, tends to leave me somewhat hungry afterwards, which is consistent with the insulin effect being significantly stronger. (Of course, since almost all the meat I eat is at restaurant meals once—or occasionally twice—a week, it might be something other than just the meat that is stimulating my insulin.)

I do regularly put one egg in “My Giant Salad.” That at least doesn’t ramp up my insulin levels too much. For why that matters, see “Obesity Is Always and Everywhere an Insulin Phenomenon.” However, the reason I don’t put in two eggs is that I am worried about too much animal protein.

Sometimes nuts are included in the meat group. I view true nuts as very healthy—an ideal snack on the go if you are within your eating window. See “Our Delusions about 'Healthy' Snacks—Nuts to That!”

I haven’t made up my mind about beans—which are also sometimes included in the meat group. They are often medium high on the insulin index, just as beef is. And there are worries based on Steven Gundry’s hypotheses about us and our microbiome not being fully adapted to New World foods. See:

As the Wikipedia article “Beans” currently says:

Most of the kinds commonly eaten fresh or dried, those of the genus Phaseolus, come originally from the Americas, being first seen by a European when Christopher Columbus, during his exploration of what may have been the Bahamas, found them growing in fields.

Fruit and Vegetable Group

Nutritionally, the fruit and vegetable group is really at least five very different types of food:

1. Vegetables with easily digested starches: Think potatoes here. Avoid them like the nutritional plague they are. Easily digested starches turn into sugar quite readily. Also think of peas—and corn, if you count it as a vegetable rather than as a quasi-grain.

2. Vegetables with resistant starch: A reasonable amount of these is OK. I am thinking of green bananas and sweet potatoes. (Beans I discussed above.)

3. Nonstarchy vegetables: Very healthy. Here is a list of nonstarchy vegetables from Wikipedia:

4. Botanical fruits: Tomatoes, cucumbers, eggplant, squash, and zucchini are botanically fruits that we call vegetables for culinary purposes. Many of the botanical fruits we eat are New World foods, to which Steven Gundry’s worrisome hypotheses about our inadequate adaptation to New World foods would apply. So I try to eat these only sparingly. However, as I write in “Reexamining Steve Gundry's 'The Plant Paradox,'” the evidence for tomatoes—though, perhaps strangely to you, more positive for cooked tomatoes than raw tomatoes—is so positive it is probably good to continue eating them freely.

On both identifying botanical fruits and identifying good vegetables with resistant starches, Steven Gundry’s lists of good and bad foods according to his lights (which include other particular slants he has on things as well) are quite helpful. You might want to take a more positive attitude toward botanical fruits than Steven Gundry, but it is good to know which vegetables are really botanical fruits to see if you notice any reaction when you eat them. A lot of the clinical experience on which Steven Gundry bases his advice is experience with patients who have autoimmune problems, so I would advise adhering to Steven Gundry’s theories more closely if you have autoimmune problems. It is a worthy experiment, in which you are exactly the relevant guinea pig.

5. True fruits: For true fruits, the problem is that sugar is still sugar, even when it is the fructose in fruit, which would be extremely healthy if only it were sugar-free. Because of their sugar content, true fruits should be eaten only sparingly. I discuss “The Conundrum of Fruit” in a section of “Forget Calorie Counting; It's the Insulin Index, Stupid.”

The bottom line is that even vegetables and fruit—which have gotten a very good reputation—include good, bad, and borderline foods.

Breads and Cereals

Avoid this group. Just look at the tables in “Forget Calorie Counting; It's the Insulin Index, Stupid” and “Using the Glycemic Index as a Supplement to the Insulin Index.” Also, as an additional mark against ready-to-eat breakfast cereal, see what I say in “The Problem with Processed Food.” Cutting out sugar and foods in this category, along with starchy vegetables, is the key first step to weight loss and better health. On that see:

If you avoid all processed foods made with grains and avoid corn and rice (including brown rice), there may be some other whole grains that are OK. Based on the insulin kicks indicated in “Forget Calorie Counting; It's the Insulin Index, Stupid,” I consider plain steel-cut oatmeal one of the best whole foods to risk, and the only one I trust that is a reasonably common food in the US.

There is substantial debate here. Some experts are more positive about whole grains. But, given the current state of the evidence, I think it is much safer to lean toward the nonstarchy vegetables that almost all experts think are quite healthy (if one leaves aside the botanical fruits).

Ideas Missing from the ‘Four Food Groups’ Advice

Some key bits of advice are simply missing from the discussion of the four food groups. For example, there is fairly wide agreement that high quality olive oil is quite healthy. It goes well with nonstarchy vegetables! Many people like their olive oil with a little vinegar in it, which is good too.

The biggest idea missing from the ‘Four Food Groups’ advice is that evidence is rolling in that when you eat is, if anything, even more important than what you eat for good health. If you are an adult in good health and not pregnant, you should try to restrict your eating to no more than an 8-hour eating window each day. (That probably means skipping breakfast, which is just as well, since most of the typical American breakfast foods these days are quite unhealthy.) But you can ease into that by working first at getting things down to a 12-hour eating window. We simply aren’t designed to have food all the time; that was a pretty rare situation for our distant ancestors. Our bodies need substantial breaks from food in order to refurbish everything. Here are just a few of my posts on that:

For annotated links to other posts on diet and health, see: