
Statistics 2501 (001)
Assignment #3: Oct. 8, 2003
Due in class: Oct. 20, 2003




All problem numbers are from the textbook Statistics for Business and Economics, 8th Edition.

If a question does not specify whether it should be done by hand or by Minitab, you can use whichever method you prefer.

  1. A university is trying to develop a formal system for deciding which students to admit. The university believes that both grades and extracurricular activities determine how likely students are to succeed. The university randomly sampled 100 fourth-year students and recorded:

    GPA for the first 3 years of university (0-12 range) ($y$)

    GPA from high school (0-12 range) $(x_{1})$

    SAT score (200-800 range) $(x_{2})$

    Number of hours per week (on average) involved in organized extracurricular activities in the last year of high school. $(x_{3})$

    You can access the data in the file admit.mtw by selecting

    Pub on `CS-thebe' and stat2501

    when you open a worksheet in Minitab.

    1. Assuming the multiple regression model

      \begin{displaymath}
y = \beta_{0} + \beta_{1}x_{1} + \beta_{2}x_{2}
+ \beta_{3}x_{3} + e
\end{displaymath}

      is appropriate, find the least squares regression line for this data.

    2. Interpret $\hat{\beta}_{2}$ in the context of this problem.

    3. Is the model useful in predicting university GPA? Base your answer on the p-value of the appropriate test.

    4. Test at $\alpha = 0.01$ whether extracurricular activities should be kept in the model.

    5. Find a 90% prediction interval for the university GPA of a student who had a 550 SAT score, a GPA of 9 in high school, and took part in 8 hours of extracurricular activities per week.

    6. Plot the residuals vs. the $\hat{y}$ values, and construct a QQ-plot of the residuals. Interpret the plots.

    7. For the regression model

      \begin{displaymath}
y = \beta_{0} + \beta_{1}x_{1} + \beta_{2}x_{2}
+ \beta_{3}x_{3} + \beta_{4}x_{1}x_{2} + e
\end{displaymath}

      find the least squares line. Is the $x_{1}x_{2}$ interaction term needed? Base your conclusion on the p-value of the appropriate test.
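For readers who want to see what the least squares fit of the interaction model involves outside Minitab, the following is a minimal sketch using only the Python standard library. It solves the normal equations for the five coefficients. Everything in it is an assumption made for illustration: the data are synthetic (not admit.mtw), the coefficient values in `true_beta` are made up, SAT is expressed in hundreds to keep the arithmetic well scaled, and the helper names `solve`, `least_squares`, and `make_rows` are hypothetical. It does not compute the p-values the questions ask for; those come from the Minitab regression output.

```python
import random

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def least_squares(rows, y):
    """Return [b0, b1, ...] minimising the sum of squared residuals.

    rows holds one list of predictor values per observation; the
    intercept column is added here.
    """
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(xi[j] * xi[k] for xi in X) for k in range(p)] for j in range(p)]
    xty = [sum(xi[j] * yi for xi, yi in zip(X, y)) for j in range(p)]
    return solve(xtx, xty)

def make_rows(n, rng):
    """Synthetic predictors: x1, x2 (SAT in hundreds), x3, and x1*x2."""
    rows = []
    for _ in range(n):
        x1 = rng.uniform(0, 12)   # high school GPA
        x2 = rng.uniform(2, 8)    # SAT score divided by 100
        x3 = rng.uniform(0, 20)   # extracurricular hours per week
        rows.append([x1, x2, x3, x1 * x2])
    return rows

rng = random.Random(2501)
rows = make_rows(100, rng)
true_beta = [1.0, 0.5, 0.4, 0.05, 0.02]  # made-up values, not from admit.mtw
y = [true_beta[0] + sum(b * x for b, x in zip(true_beta[1:], r)) + rng.gauss(0, 1)
     for r in rows]

beta_hat = least_squares(rows, y)
print([round(b, 3) for b in beta_hat])
```

The interaction term is handled simply by adding the product $x_{1}x_{2}$ as an extra column, which is also how it would be entered in a Minitab worksheet.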

  2. Refer to the data on drywall sales that you used in assignment #2, and answer the following, assuming you are starting with a regression equation that is using all of the explanatory variables (i.e. the regression equation you used in assignment #2):

    1. Can we drop the apartment vacancy rate from the model? Base your conclusion on the p-value of the appropriate test.

    2. Construct a 99% CI for the mean monthly sales if 40 building permits were issued, mortgage rates were 8.5%, and vacancy rates for apartments and office buildings were 2% and 10%, respectively.

    3. Plot the residuals vs. the $\hat{y}$ values, and construct a QQ-plot of the residuals. Interpret the plots.
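As an illustration of what a QQ-plot of residuals actually plots, here is a small sketch that pairs sorted residuals with standard normal quantiles. The residual values are hypothetical, the function name `qq_points` is made up for this example, and the plotting positions $(i - 0.5)/n$ are one common convention that is assumed here; Minitab's normal probability plot may use a different one.

```python
from statistics import NormalDist

def qq_points(residuals):
    """Pair each sorted residual with a standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Plotting positions (i - 0.5) / n: an assumed convention, not Minitab's.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Hypothetical residuals, not taken from the drywall data.
residuals = [0.3, -1.2, 0.8, -0.1, 1.5, -0.6, 0.2, -0.9]

for quantile, resid in qq_points(residuals):
    print(f"{quantile:7.3f} {resid:7.3f}")
```

If the residuals are roughly normal, the printed pairs fall close to a straight line when plotted; systematic curvature or a few points far off the line suggest a departure from normality.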

  3. Refer to the data in problem 10.35, p. 484. You can access the data in the file ceopay.mtw by selecting

    Pub on `CS-thebe' and stat2501

    when you open a worksheet in Minitab.

    Use the data to complete the following:

    1. Plot the data, using pay as the response variable. Is there any point that looks ``unusual'' on the graph?

    2. Get Minitab to plot the least squares line relating pay ($y$) and performance ($x$) on top of the plot of the data. Also, interpret the value of $R^{2}$ that is reported.

    3. Plot the residuals vs. the $\hat{y}$ values, and interpret this plot.

    4. Remove the point that looked ``unusual'' in part 1 from your dataset. Using this new dataset, repeat parts 2 and 3. Have any of your conclusions changed now that this point has been removed?
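To show why a single unusual point can matter, the sketch below fits a simple least squares line and computes $R^{2}$ with and without one outlying observation. The six data points are hypothetical (they are not the ceopay.mtw data) and the function name `fit_line` is made up for this example.

```python
def fit_line(x, y):
    """Return (intercept, slope, r_squared) for the least squares line."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    # R^2 = 1 - SSE / SST, the share of variation in y explained by the line.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, 1 - sse / syy

# Hypothetical data: the last point is deliberately far from the others.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 30]

full = fit_line(x, y)                 # all six points
trimmed = fit_line(x[:-1], y[:-1])    # unusual point removed

print("with point:   ", full)
print("without point:", trimmed)
```

In this made-up example the first five points lie exactly on a line, so dropping the sixth changes both the slope and $R^{2}$ noticeably; with real data the change may be smaller or larger, which is what part 4 asks you to check.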

  4. Problem 11.52, p. 590. Construct the plot by hand. The output that is needed is given in the text.




Gary Sneddon 2003-10-17