Analysis Task: Due 11 PM Thursday, April 10, 2025
FOR ACCESS TO THE DATA, LOOK AT THIS README FILE
Do the Survey Links assignment before trying to think about the analysis task. You need to know what kind of data will be available. (You could use other data sets related to well-being or Behavioral Economics, but I don’t recommend it. We’re set up to help you with this data, and all of it is highly related to the course.) Our goal is to make the data available to you by the Wednesday of Spring Break, but that timeline might slip.
Your Analysis Task needs to report the analysis with tables or figures and also have text that clearly explains the analysis. The idea is that this is like one section of a paper.
If you have an idea of what to do for the analysis task, just send me and Colby (colbychambers4@gmail.com) an email and I'll tell you how interesting I think it is, and maybe suggest a tweak.
See the analysis and its explanation as one section of your term paper. (Your term paper is due later, at 11 PM on Wednesday, May 3, the evening after the last class.) The idea is to make this analysis part of a larger discussion.
Including figures and tables, the analysis task should be at least 5 pages. I'll take a risk and not put an upper limit on the length of the analysis task. (The term paper beyond the analysis task should be between 5 and 10 pages, with closer to 5 being preferred.)
How to structure your writeup of the Analysis Task:
You can design a different structure, but a typical writeup could look like this:
1. Here is an interesting question or questions. The answers matter (people care or should care) because: …
2. Here is a statistical analysis that seems to have some bearing on this question or questions:
3. On the surface the statistical results seem to say: …
4. However, the following confounding factors could be giving rise to an illusion, making it seem like something is there that isn’t or that something is bigger or different than it really is.
Don’t forget to talk about the confounding factors! (4.)
Here is a Q&A about the analysis task:
Q:
What is the level of analysis you are expecting for this assignment? I’ve taken some stats classes, so I’m familiar with hypothesis testing and regression, but since this class doesn’t have a stats prerequisite I’m not sure how in depth I should go for this assignment.
Since most aspects of wellbeing are correlated with each other, it seems too difficult to use regression to analyze relationships between these aspects without running into reverse causality, cousin causality, or both. My knowledge of stats isn’t sufficient to avoid these problems in cases where instrumental variable regression isn’t a viable alternative. I’m wondering what you would suggest that I do to avoid this issue.
A:
At the low end, it could be simply some scatter plots or bar charts or other interesting graphs.
I don't expect you to have consistent estimates of anything, but rather to be able to discuss any biases there might be in the estimates you do get, relative to something interesting. Please make the attempt to figure out the sign (+ or -) of any bias you discuss, and say what that would mean for the truth of the interesting thing one might care about. If there are multiple biases, try to figure out the sign of each one, even if all the biases put together can't be signed because some biases are likely to be + and others are likely to be -. Also, discuss whether you think a bias is likely to be large or small.
Advice for the Analysis Task:
Use lots of graphs. I love scatterplots, but other types of graphs and figures can be good, too.
It’s fine to do some statistics on individual variables, but make sure you do something that relates pairs of variables to each other.
Do some formal statistical tests.
When you test more than one hypothesis, set it up so you can do the multiple hypothesis test correction using the False Discovery Rate procedure!
Make a distinction between being significant at the 5% level and being significant at the 0.5% level.
If something isn’t statistically significant, you say “I can’t reject the null hypothesis that …” NOT “I reject the alternative hypothesis.” If you want to reject a hypothesis, you have to set it up as a null hypothesis.
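As a rough illustration of the multiple-hypothesis correction mentioned above, here is a minimal sketch. It assumes the intended False Discovery Rate procedure is the Benjamini-Hochberg step-up procedure (the advice above does not name a specific one). The function name and the p-values are made up for illustration, and running it at both 0.05 and 0.005 is just one way to keep the 5% versus 0.5% distinction in view.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the null is rejected at FDR level q.

    Assumption: Benjamini-Hochberg step-up procedure.
    """
    m = len(p_values)
    # Indices of the hypotheses, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_k = rank
    # Reject the nulls for the largest_k smallest p-values, even if some of
    # them missed their own rank-specific threshold.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from five separate tests (not real course data).
p_values = [0.0004, 0.008, 0.039, 0.035, 0.20]
for q in (0.05, 0.005):
    print(q, benjamini_hochberg(p_values, q))
```

With these made-up numbers, four nulls are rejected at the 0.05 level but only one at the 0.005 level. For any test that is not flagged, the wording is still "I can't reject the null hypothesis that …".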
Recognize reverse causality and cousin causality, including the consumer-theory-esque model I gave in class of how resources broadly construed help all good things, leading to the general principle (with only a few exceptions) that “All good things are positively correlated.” (This is a statement about the cross section.)
Define variables in full. You need to act like your reader doesn’t know what the abbreviations mean. So write out the full text of the aspects, and describe fully all other variables. (You will see that we do this in our papers.)
Don’t order response categories alphabetically! They need to be ordered logically. For example, political leanings should be ordered from Left to Right and levels of education should be ordered from less to more.
When you have interesting results for several variables that are along the same lines, think of creating a simple index to get more statistical power. That is, take simple averages of similar variables and treat that simple average as an index.
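A simple index of this kind can be sketched as follows. The variable names and ratings are hypothetical, not taken from the course data, and the sketch assumes the variables being averaged are all rated on the same scale. Skipping missing answers is also my assumption about how you might handle gaps.

```python
from statistics import mean


def simple_index(row, variables):
    """Average the listed variables for one respondent, skipping missing values."""
    values = [row[v] for v in variables if row.get(v) is not None]
    return mean(values) if values else None


# Hypothetical variable names and ratings, for illustration only.
similar_variables = ["life_satisfaction", "happiness", "sense_of_purpose"]
respondents = [
    {"id": 1, "life_satisfaction": 70, "happiness": 80, "sense_of_purpose": 60},
    {"id": 2, "life_satisfaction": 40, "happiness": None, "sense_of_purpose": 50},
]

for row in respondents:
    row["index"] = simple_index(row, similar_variables)
    print(row["id"], row["index"])
```

You would then use the index in place of the individual variables in your graphs and tests.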
Think about how nearly statistically exogenous your right-hand-side variables are. Other things equal, regressions with more nearly statistically exogenous right-hand-side variables are more interesting. That doesn’t mean you can’t do other things. Just think about this dimension.
Think seriously about scale use. Any statistical analysis you do with aspect-of-well-being data you can probably do both with the raw aspect ratings and with (aspect rating - average of calibration questions). Doing both of those analyses will be much more interesting than just the one analysis.
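Here is a minimal sketch of the scale-use adjustment described above: aspect rating minus the average of the calibration questions. The function name, field names, and numbers are hypothetical. Check the README for how the calibration questions are actually labeled in the data.

```python
from statistics import mean


def scale_adjusted(aspect_rating, calibration_ratings):
    """Aspect rating minus the respondent's average calibration rating."""
    return aspect_rating - mean(calibration_ratings)


# Hypothetical respondents: A tends to use the high end of the scale,
# B tends to use the low end.
respondents = [
    {"id": "A", "aspect": 80, "calibration": [70, 90, 80]},
    {"id": "B", "aspect": 60, "calibration": [30, 50, 40]},
]

for r in respondents:
    r["adjusted"] = scale_adjusted(r["aspect"], r["calibration"])
    print(r["id"], "raw:", r["aspect"], "adjusted:", r["adjusted"])
```

In this made-up example, A has the higher raw rating, but B has the higher rating after adjustment. Running the same graph or test on both the raw and the adjusted column shows how much scale use matters for your result.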