Prospective and Retrospective Studies

Read pg. 542-545 for more details. Consider the following examples:

Ex 1: The Salk vaccine data, which we have analyzed. In this example, we have an explanatory variable (vaccine) with 2 levels (vaccine or placebo), and a response variable with 2 levels (paralysis or no paralysis). How were the data collected? The children were randomly assigned to the levels of the explanatory variable, then they were tracked to record their value of the response variable, i.e. whether they contracted polio or not.

Ex 2: A group of hospital patients is sampled. They fall into 2 groups: those who have lung cancer and those who do not. The patients are interviewed to find out if they smoke or not. The natural question is whether smoking status helps to predict whether a person has lung cancer. The explanatory variable (smoking status) has 2 levels, and the response variable (cancer status) has 2 levels. How were the data collected? The subjects were sampled according to their value of the response variable (did they have cancer or not), then their value of the explanatory variable was determined.

Example 1 is an example of a prospective study. Example 2 is an example of a retrospective study. Both methods have benefits (pg. 543-544).

KEY POINT: To describe how the response depends on the explanatory variable in a retrospective study, only the odds ratio can be used. For example, in the smoking problem above, we couldn't use the data collected to estimate the proportion of smokers who get lung cancer. Writing X for the explanatory variable and Y for the response, in retrospective studies it is possible to estimate a conditional probability P(X = i|Y = j), but it is usually not possible to estimate the probability P(Y = j|X = i) of an outcome of interest, or the difference in that probability between the levels of X. It is possible to estimate the odds ratio, since it takes the same value whether it is computed from the conditional probabilities of Y given X or from those of X given Y.
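To see why the odds ratio can be computed from conditional probabilities in either direction, here is a short derivation for the 2 x 2 case. The joint-probability notation \pi_{ij} = P(X = i, Y = j), with margins \pi_{i+} and \pi_{+j}, is an assumption introduced for this sketch and is not taken from the text.

```latex
% Odds ratio from the prospective direction, P(Y = j | X = i) = \pi_{ij} / \pi_{i+}:
\theta
  = \frac{P(Y=1 \mid X=1) \,/\, P(Y=2 \mid X=1)}
         {P(Y=1 \mid X=2) \,/\, P(Y=2 \mid X=2)}
  = \frac{\pi_{11} / \pi_{12}}{\pi_{21} / \pi_{22}}
  = \frac{\pi_{11}\,\pi_{22}}{\pi_{12}\,\pi_{21}}

% Odds ratio from the retrospective direction, P(X = i | Y = j) = \pi_{ij} / \pi_{+j}:
\frac{P(X=1 \mid Y=1) \,/\, P(X=2 \mid Y=1)}
     {P(X=1 \mid Y=2) \,/\, P(X=2 \mid Y=2)}
  = \frac{\pi_{11} / \pi_{21}}{\pi_{12} / \pi_{22}}
  = \frac{\pi_{11}\,\pi_{22}}{\pi_{12}\,\pi_{21}}
  = \theta
```

The row margins \pi_{i+} cancel in the first line and the column margins \pi_{+j} cancel in the second, so both directions give the same cross-product ratio.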
In the cancer example above, subjects were sampled because they were already known to have cancer or not, so the data carry no information about the proportion of subjects who get cancer.

For a more detailed (and theoretical) discussion of these types of studies, see: McCullagh, P. and Nelder, J.A. (1989). Generalized Linear Models, 2nd edition. Chapman and Hall, New York. pages 111-113.
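As a numerical illustration of the key point, the following Python sketch builds a hypothetical population and then the expected 2 x 2 table from a retrospective design that samples fixed numbers of cancer and non-cancer patients. All probabilities and sample sizes are made-up values chosen for illustration, not data from the text, and the variable names are hypothetical.

```python
# Hypothetical population values (made-up numbers, for illustration only).
p_smoker = 0.30               # P(X = smoker)
p_cancer_given_smoker = 0.10  # P(Y = cancer | X = smoker)
p_cancer_given_non = 0.02     # P(Y = cancer | X = nonsmoker)


def odds_ratio(table):
    """Cross-product ratio a*d / (b*c) for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


# Joint probabilities: rows = smoker / nonsmoker, columns = cancer / no cancer.
joint = [
    [p_smoker * p_cancer_given_smoker, p_smoker * (1 - p_cancer_given_smoker)],
    [(1 - p_smoker) * p_cancer_given_non, (1 - p_smoker) * (1 - p_cancer_given_non)],
]

# Retrospective design: the column totals are fixed by the investigator.
# Within each column, expected counts follow P(X = i | Y = j).
group_sizes = [500, 500]  # assumed numbers of cancer and non-cancer patients
col_totals = [joint[0][j] + joint[1][j] for j in range(2)]
retro = [
    [group_sizes[j] * joint[i][j] / col_totals[j] for j in range(2)]
    for i in range(2)
]

population_or = odds_ratio(joint)
retro_or = odds_ratio(retro)

# Naive "proportion of smokers with cancer" read off the retrospective table.
naive_p = retro[0][0] / (retro[0][0] + retro[0][1])

print("population odds ratio:   ", round(population_or, 4))
print("retrospective odds ratio:", round(retro_or, 4))
print("true P(cancer | smoker): ", p_cancer_given_smoker)
print("naive estimate from table:", round(naive_p, 4))
```

Under these assumptions the two odds ratios agree, while the naive row proportion from the retrospective table is far from the true P(cancer | smoker), because it depends on how many patients the investigator chose to sample in each cancer group.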