Exam 1

Tuesday, March 10, in Class.

UPDATE MARCH 9: NARROWING OF CONTENT. In writing the exam, I realized that questions on the assigned readings alone already make for a lot of questions, so tomorrow I'm only going to test the Layard-De Neve book and the many articles I assigned. That will take the whole class period. So to the extent you have more study time, please focus on the readings.

  • The Layard-De Neve book

EXAM 1 WILL NOT TEST THE WINNER’S CURSE; PLEASE PRIORITIZE THE OTHER READINGS.

NO LONGER ACCURATE:

Since the exam won’t take the whole class period, I’m planning to do 15-20 minutes of additional practice on Arrow Diagrams (“Directed Graphs”) at the beginning of class before beginning the exam. Please make sure to practice FDR calculations as detailed below. We won’t have time to practice those at the beginning of class on Tuesday.

Resources for Thinking about Statistical Issues.

PRIORITY: DO THE ARROW DIAGRAM EXERCISE AND THE ARROW BIAS DIAGRAM EXERCISE

  1. Review of Statistics (Fundamentals)

  2. Arrow Diagram Exercise

  3. Answers to Arrow Diagram Exercise

  4. Endogeneity Biases Powerpoint

  5. Arrow Diagram Bias Exercise

  6. Answers to Arrow Diagram Bias Exercise

  7. Review of Statistical Algebra and Calculus Powerpoint

Improved Instructions for accounting for Multiple-Hypothesis Testing using the Benjamini-Hochberg “False-Discovery Rate (FDR) Procedure” by hand (as you will need to do on the exam)

Choose a set of hypotheses that, a priori, you think are about equally likely to yield a rejection of the null hypothesis. If you mix in, with the hypotheses you really care about, others that are almost sure to reject, you will make it too easy to get FDR rejections. If you mix in, with the hypotheses you really care about, others where you are almost sure to accept the null, you will make it too hard to get FDR rejections. Why? You are declaring a set of things FDR significant and saying that, on average, only x% of that set would have seemed to reject just by chance.

  1. Column 1: Write down the p-values for rejection of each of your hypotheses from smallest to largest: #1 for the smallest, #2 for the second smallest, etc. Let me call that number the “ordinal rank” of the hypothesis.

  2. Column 2: Multiply each p-value by the number of hypotheses

  3. Column 3: Divide each number in Column 2 by the ordinal rank of the hypothesis

  4. Column 4: Find the smallest number in Column 3. Copy it into Column 4 on the same row and into all rows of Column 4 above that. Now, find the smallest number in the rows below that in Column 3. Copy it into Column 4 on the same row and into all still-blank rows of Column 4 above that (below the rows where you have already written a lower number). Keep going with this procedure: the smallest number in the remaining rows propagates upward into all the blank spots in Column 4.

  5. The numbers in Column 4 are now the FDR significance level for each hypothesis (which is, loosely speaking, a p-value corrected for multiple-hypothesis testing).
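If you want to check your hand calculations, the four-column procedure above can be sketched in code. This is a minimal sketch using a hypothetical set of five p-values (not from the course materials): Column 1 sorts the p-values, Columns 2-3 compute (p-value × number of hypotheses) ÷ ordinal rank, and Column 4 propagates the smallest remaining value upward, exactly as described in step 4.

```python
def fdr_adjust(p_values):
    """Benjamini-Hochberg FDR significance levels, following the
    Column 1-4 hand procedure described above."""
    m = len(p_values)
    # Column 1: p-values sorted from smallest to largest
    # (the 1-based position in this list is the "ordinal rank")
    col1 = sorted(p_values)
    # Columns 2 and 3: multiply each p-value by m, then divide by its rank
    col3 = [p * m / rank for rank, p in enumerate(col1, start=1)]
    # Column 4: the smallest number in the remaining rows of Column 3
    # propagates upward into every blank spot -- equivalently, take a
    # running minimum working from the bottom row up
    col4 = [0.0] * m
    running_min = float("inf")
    for i in range(m - 1, -1, -1):
        running_min = min(running_min, col3[i])
        col4[i] = running_min
    return col1, col4

# Hypothetical example with five p-values
sorted_p, fdr_levels = fdr_adjust([0.04, 0.002, 0.03, 0.20, 0.011])
for p, q in zip(sorted_p, fdr_levels):
    print(f"p = {p:.3f}  ->  FDR significance level = {q:.4f}")
```

Working the same five numbers by hand: the sorted p-values are 0.002, 0.011, 0.03, 0.04, 0.20; Column 3 is 0.01, 0.0275, 0.05, 0.05, 0.20; and since Column 3 happens to be nondecreasing here, Column 4 equals Column 3. In general the bottom-up minimum matters whenever a later Column 3 entry is smaller than an earlier one.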