Exam 1

Tuesday, March 10, in Class.

Since the exam won’t take the whole class period, I’m planning to do 15-20 minutes of additional practice on Arrow Diagrams (“Directed Graphs”) at the beginning of class before beginning the exam. Please make sure to practice FDR calculations as detailed below. We won’t have time to practice those at the beginning of class on Tuesday.

EXAM 1 WILL NOT TEST THE WINNER’S CURSE; PLEASE PRIORITIZE THE OTHER READINGS.

It WILL cover:

Resources for Thinking about Statistical Issues.

PRIORITY: DO THE ARROW DIAGRAM EXERCISE AND THE ARROW BIAS DIAGRAM EXERCISE

  1. Review of Statistics (Fundamentals)

  2. Arrow Diagram Exercise

  3. Answers to Arrow Diagram Exercise

  4. Endogeneity Biases Powerpoint

  5. Arrow Diagram Bias Exercise

  6. Answers to Arrow Diagram Bias Exercise

  7. Review of Statistical Algebra and Calculus Powerpoint

Improved Instructions for accounting for Multiple-Hypothesis Testing using the Benjamini-Hochberg “False-Discovery Rate (FDR) Procedure” by hand (as you will need to do on the exam)

Choose a set of hypotheses for which, a priori, you think a rejection of the null hypothesis is about equally likely. If you mix the hypotheses you really care about with others that are almost sure to reject, you will make it too easy to get FDR rejections. If you mix the hypotheses you really care about with others where you are almost sure not to reject the null, you will make it hard to get FDR rejections. Why? You are declaring a set of things FDR significant and saying that on average only x% of that set would have seemed to reject just by chance.

  1. Column 1: Write down the p-values for rejection of each of your hypotheses from smallest to largest: #1 for smallest, #2 for second smallest etc. Let me call that number the “ordinal rank” of the hypothesis.

  2. Column 2: Multiply each p-value by the number of hypotheses

  3. Column 3: Divide each number in Column 2 by the ordinal rank of the hypothesis

  4. Column 4: Find the smallest number in Column 3. Copy it over to Column 4 on the same row and to all rows in Column 4 above that. Now, find the smallest number in Column 3 among the rows below that. Copy it over to Column 4 on the same row and to all rows in Column 4 above that, but below the rows where you have already written a lower number. Keep going with this procedure: the smallest number in the remaining rows propagates upward into all the blank spots in Column 4.

  5. The numbers in Column 4 are now the FDR significance level for each hypothesis (which is, loosely speaking, a p-value corrected for multiple-hypothesis testing).
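To check your by-hand work, the four columns above can be sketched in Python. This is only a minimal sketch under stated assumptions: the function name `bh_fdr_columns` and the example p-values are made up for illustration, and Column 4 is computed as a running minimum of Column 3 taken from the bottom row upward, which should give the same numbers as the copy-upward rule in step 4.

```python
def bh_fdr_columns(p_values):
    """Return the four by-hand columns, with rows ordered from smallest to largest p-value."""
    m = len(p_values)  # number of hypotheses

    # Column 1: p-values from smallest to largest (row i has ordinal rank i + 1)
    col1 = sorted(p_values)

    # Column 2: each p-value times the number of hypotheses
    col2 = [p * m for p in col1]

    # Column 3: Column 2 divided by the ordinal rank
    col3 = [x / rank for rank, x in enumerate(col2, start=1)]

    # Column 4: smallest Column 3 value on this row or any row below it.
    # Walking from the bottom row upward reproduces the "propagate upward" rule.
    col4 = [0.0] * m
    running_min = float("inf")
    for i in range(m - 1, -1, -1):
        running_min = min(running_min, col3[i])
        col4[i] = running_min

    return col1, col2, col3, col4


# Made-up p-values, purely for illustration.
example_p_values = [0.01, 0.04, 0.03, 0.20]
for rank, row in enumerate(zip(*bh_fdr_columns(example_p_values)), start=1):
    print(rank, ["%.4f" % value for value in row])
```

In this made-up example, the third row of Column 3 (0.16/3) is smaller than the second row (0.06), so it is the third-row value that gets copied up into the second row of Column 4.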

Comments on “The Winner’s Curse” by Richard Thaler and Alex Imas

  1. I don’t like the name given to “The Confused Subjects Hypothesis.” This is really a “People Would Soon Learn Hypothesis.” To my mind, it is possible for people to be confused and stay confused for a long, long time before ever learning, or to never learn at all. “Confused” can be a permanent or semi-permanent condition. When things are cognitively hard, some people may never learn. And when things are cognitively very hard, the majority of people may never learn.

Analysis Task

Due by 1 PM Mountain Time, Tuesday, March 24.

I strongly recommend that you do something using UAS data. There is data on a huge number of different variables when you include all the modules and it has the most detailed well-being data of any survey in the world.

Before starting in on your analysis task, take this survey to get a good sense of our UAS module:

https://uas.usc.edu/survey/playground/uas571/test/index.php

What I expect for the analysis task:

Your Analysis Task needs to report the analysis with tables or figures and also have text that clearly explains the analysis. The idea is that this is a first draft of the “Results” section of your term paper.

How to structure your writeup of the Analysis Task:

You can design a different structure, but a typical writeup could look like this:

  1. Here is an interesting question or questions. The answers matter (people care or should care) because: …

  2. Here is a statistical analysis that seems to have some bearing on this question or questions:

  3. On the surface the statistical results seem to say: …

  4. However, the following confounding factors could be giving rise to an illusion, making it seem like something is there that isn’t or that something is bigger or different than it really is.

Don’t forget to talk about the confounding factors! (4.)

Here is a Q&A about the analysis task:

Q:

What is the level of analysis you are expecting for this assignment?

A:

I don't expect you to have consistent estimates of anything, but rather to be able to discuss any biases there might be in the estimates you do get, relative to something interesting. Please make the attempt to figure out the sign (+ or -) of any bias you discuss, and say what that would mean for the truth of the interesting thing one might care about. If there are multiple biases, try to figure out the sign of each one, even if all the biases put together can't be signed because some biases are likely to be + and others are likely to be -. Also, discuss whether you think a bias is likely to be large or small.

Especially Frequently Needed Advice for the Analysis Task:

  • Always report p-values. This means you’ll want to do at least some regressions, since that is the easiest way to get p-values. Report the raw p-values, then do the Benjamini-Hochberg FDR adjustment for multiple hypothesis testing if appropriate. (It is confusing if you don’t also report the raw p-values.)

  • Always give the full details of the wordings of the questions and the response options. You can always get these by doing the survey again. When I am doing that sort of thing, I don’t give real answers to the questions; I just click anything until I get to the questions I wanted to look at.

  • I said I love scatterplots, but there is an exception: when one of the two variables has only a few possible values, box plots for the other variable given each of those few possible values are a better way to show the relationship. Note that box plots are a lot like bar charts—but they have more total information in them than bar charts.

  • If you have income in the regression, always use log(income). When respondents report income in ranges (bins), you should use log(midpoint of the range) as log(income). Using non-logged income is almost guaranteed to get you weird results. And using bin number gives a coefficient that is hard to interpret.

  • If you have log(income) in a regression (and I think it will be household income; all the adult incomes should be counted), I highly recommend putting log(household size) in as another variable in the regression. That makes sure that you are accounting for a given amount of income being spread over more people while being fairly agnostic about economies of scale in the household.

  • Make sure to discuss causality and to discuss causality in the context of your particular analysis, not just in general terms. What are the likely biases? What is their likely direction? What are some things that are possible but that you don’t think are issues in your particular case?

  • Carefully use non-causal words where you aren’t actually claiming causality. You don’t want to use causal words until you are really ready to discuss causality.
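As a minimal sketch of the two income bullets above, here is one way to build log(midpoint of the range) and log(household size) as regressors in Python. The bin boundaries, the dictionary `INCOME_BINS`, and the function names are hypothetical; the real bins come from the wording of the survey question you use. An open-ended top bin has no midpoint and is not handled here.

```python
import math

# Hypothetical income bins as (lower, upper) in dollars.
# Replace these with the actual response options from the survey question.
INCOME_BINS = {
    1: (5_000, 9_999),
    2: (10_000, 19_999),
    3: (20_000, 39_999),
}


def log_income_from_bin(bin_number, bins=INCOME_BINS):
    """Use log(midpoint of the range) as log(income) for a binned response."""
    lower, upper = bins[bin_number]
    midpoint = (lower + upper) / 2
    return math.log(midpoint)


def income_regressors(bin_number, household_size):
    """Return the two right-hand-side variables suggested in the bullets above."""
    return {
        "log_income": log_income_from_bin(bin_number),
        "log_household_size": math.log(household_size),
    }


# Illustrative respondent: income in hypothetical bin 2, household of 4 people.
print(income_regressors(2, 4))
```

Both values then go into the regression as separate right-hand-side variables, rather than the bin number or the raw dollar amount.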

Other Advice for the Analysis Task:

  1. Use lots of graphs. I love scatterplots, but other types of graphs and figures can be good, too.

  2. It’s fine to do some statistics on individual variables, but make sure you do something that relates pairs of variables to each other.

  3. Do some formal statistical tests.

  4. When you test more than one hypothesis, set it up so you can do the multiple hypothesis test correction using the False Discovery Rate procedure!

  5. Make a distinction between being significant at the 5% level and being significant at the 0.5% level.

  6. If something isn’t statistically significant, you say “I can’t reject the null hypothesis that …” NOT “I reject the alternative hypothesis.” If you want to reject a hypothesis, you have to set it up as a null hypothesis.

  7. Recognize reverse causality and cousin causality, including the consumer-theory-esque model I gave in class of how resources broadly construed help all good things, leading to the general principle (with only a few exceptions) that “All good things are positively correlated.” (This is a statement about the cross section.)

  8. Define variables in full. You need to act like your reader doesn’t know what the abbreviations mean. So write out the full text of the aspects, and describe fully all other variables. (You will see that we do this in our papers.)

  9. Don’t order response categories alphabetically! They need to be ordered logically. For example, political leanings should be ordered from Left to Right and levels of education should be ordered from less to more.

  10. When you have interesting results for several variables that are along the same lines, think of creating a simple index to get more statistical power. That is, take simple averages of similar variables and treat that simple average as an index.

  11. Think about how nearly statistically exogenous your right-hand-side variables are. Other things equal, regressions with more nearly statistically exogenous right-hand-side variables are more interesting. That doesn’t mean you can’t do other things. Just think about this dimension.

  12. Think seriously about scale use. Any statistical analysis you do with aspect-of-well-being data you can probably do both with the raw aspect ratings and with (aspect rating - average of calibration questions). Doing both of those analyses will be much more interesting than just the one analysis.
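Items 10 and 12 above each describe a small calculation, which can be sketched in Python as follows. This is only an illustration: the function names and the ratings are made up, the index assumes the variables being averaged are on the same scale, and missing responses are not handled.

```python
from statistics import mean


def simple_index(values):
    """Item 10: a simple average of similar variables, treated as an index."""
    return mean(values)


def calibrated_rating(aspect_rating, calibration_ratings):
    """Item 12: aspect rating minus the average of the calibration questions."""
    return aspect_rating - mean(calibration_ratings)


# Made-up ratings for one respondent, purely for illustration.
similar_aspect_ratings = [6, 7, 8]
calibration_answers = [5, 6, 7]

print(simple_index(similar_aspect_ratings))
print(calibrated_rating(8, calibration_answers))
```

Running the same analysis once on the raw ratings and once on the calibrated ratings gives the two versions that item 12 suggests comparing.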

Papers to Read

Introduction

Survey Measurement of Preference Parameters Using Hypothetical Choices (“Stated Preference”)

Process Benefits and Costs

Origins of Preferences

Happiness, Life Satisfaction and Standing on the Ladder of Life (Cantril Ladder) are Not Utility

Happiness Dynamics

Well-Being Indexes

Statistical and Interpretive Issues

Scale-Use Heterogeneity

Optional Papers