
Statistics 2501 (001)
Assignment #1: Solutions



  1. Problem on the sampling distribution of $\bar{x}$.

    NOTE: No computer output is required in the solutions. Also, not everyone will have the same results, since the values are randomly generated.

    1. No answer needed.

    2. From the output, we see that $\mu = 16$ and $\sigma = 5$.

    3. No answer needed.

    4. The histogram I obtained did not have much of a discernible shape (results will vary).

    5. From my output, the mean of my $\bar{x}$ values was 17.48, and the standard deviation was 1.08 (again, everyone will have different answers).

      The Central Limit Theorem states that we'd expect the mean to be $\mu = 16$ and the standard deviation to be $\sigma/\sqrt{n} = 5/\sqrt{20} = 1.12$. For my computer results, the mean isn't that close to 16, although the standard deviation isn't that far from what the theory predicts.

    6. No response needed.

    7. My histogram, based on 1000 samples, appeared symmetric (again, results will vary).

    8. From the output, the mean is 16.06 and the standard deviation is 1.16. Although the mean is much closer to what the Central Limit Theorem predicts, the standard deviation hasn't improved that much. This goes to show that the theory tells us what to expect, not what is guaranteed to happen.
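
      For anyone who wants to repeat the experiment outside the course software, the following is a minimal Python sketch of the simulation. It assumes a normal population with $\mu = 16$ and $\sigma = 5$ and samples of size $n = 20$. The population shape, the function name `simulate_sample_means`, the fixed seed, and the choice of 50 samples for the smaller run are assumptions made for illustration; they are not stated in the assignment.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return the list of sample means.

    Assumption: the population is normal with mean mu and std. dev. sigma.
    """
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]


mu, sigma, n = 16, 5, 20
theory_sd = sigma / n ** 0.5  # sigma / sqrt(n), about 1.12

# Compare a small run with a large run (50 is a hypothetical choice).
for num_samples in (50, 1000):
    means = simulate_sample_means(mu, sigma, n, num_samples)
    print(
        num_samples,
        round(statistics.fmean(means), 2),   # observed mean of the x-bar values
        round(statistics.stdev(means), 2),   # observed std. dev. of the x-bar values
        round(theory_sd, 2),                 # value predicted by the theory
    )
```

      Changing the seed gives different results, which mirrors the remark that everyone will have different answers.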

  2. Refer to #8.26 in text, but use $\sigma = 6.5$.

    1. 8.26(a). Is there evidence that the average age of viewers is greater than 50?
      $H_{o}: \mu = 50$
      $H_{a}: \mu > 50$

      \begin{displaymath}
\mbox{Test statistic: } z_{obs} = \frac{\bar{x} - \mu_{o}}
{\sigma/\sqrt{n} }
=\frac{51.3 - 50}{6.5/\sqrt{50}}
= 1.41
\end{displaymath}

      The question was a bit unclear about what was expected at this point. Give full marks if a student reports the hypotheses and test statistic correctly. You can ignore anything else.

    2. Since this is an upper-tailed test:

      \begin{displaymath}
\mbox{p--value }= P(z \geq 1.41) = P(z > 0) - P(0 \leq z \leq 1.41)
= 0.5 - 0.4207 = 0.0793
\end{displaymath}

      using the normal curve table on the inside front cover of the text.

      We conclude that there is weak (or mild) evidence against $H_{o}$.

      Therefore there is mild evidence to suggest that the mean age of viewers is greater than 50 years.
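
      The test statistic and p-value above can also be checked numerically. The sketch below uses only the Python standard library; the function name `upper_tail_z_test` is hypothetical. Because it does not round $z_{obs}$ to 1.41 before looking up the tail area, it gives a p-value close to, but not exactly equal to, the table-based 0.0793.

```python
import math


def upper_tail_z_test(xbar, mu0, sigma, n):
    """Return (z, p-value) for H_a: mu > mu0 when sigma is known."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # Upper-tail area of the standard normal: P(Z >= z) = 0.5 * erfc(z / sqrt(2))
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


z_obs, p_value = upper_tail_z_test(xbar=51.3, mu0=50, sigma=6.5, n=50)
print(round(z_obs, 2), round(p_value, 4))
```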

    3. Find a 90% CI for the mean age.

      90% CI implies $\alpha = 0.1$, so $\alpha/2 = 0.05$. Hence we need $z_{.05}$.

      From the normal curve table, use $z_{.05} \approx 1.64$, $z_{.05} \approx 1.65$ or $z_{.05} \approx 1.645$. (Full credit will be given for any of these 3).

      Then the 90% CI is:

      \begin{displaymath}
\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}
= 51.3 \pm 1.65 \left( \frac{6.5}{\sqrt{50}} \right)
= 51.3 \pm 1.52 = (49.78, 52.82)
\end{displaymath}
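
      As a check on the interval, here is a short Python sketch that takes the critical value from `statistics.NormalDist` rather than from the table. The function name `z_confidence_interval` is hypothetical. Since the computed critical value is about 1.645 rather than a two-decimal table value, the endpoints may differ from (49.78, 52.82) in the second decimal place.

```python
import math
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, level):
    """Two-sided confidence interval for a mean when sigma is known."""
    alpha = 1 - level
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z_crit * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


low, high = z_confidence_interval(xbar=51.3, sigma=6.5, n=50, level=0.90)
print(round(low, 2), round(high, 2))
```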

  3. #10.71(a, b) in text, plus one other part.

    1. Plot data by hand. The plot is attached on a separate page. We see that, as order size increases, the time to fill the order increases. A linear relationship could be a reasonable assumption, based on the plot.

      NOTE: The (a) part of the question is a bit confusing in determining which variable is $x$ and which is $y$. We want $x$ = number of cases in the order and $y$ = time, which is stated explicitly in (b). If a student has them reversed in (a), just take off a maximum of one (1) point.

    2. Rounding will make results slightly different from mine. Don't take any points off for small differences from these solutions.


      \begin{displaymath}
n = 9, \quad \bar{x} = (36 + \ldots + 96)/9 = 127.67, \quad
\bar{y} = (27 + \ldots + 10)/9 = 26.56
\end{displaymath}


      \begin{displaymath}
\sum_{i=1}^{n} x_{i}^{2} = 36^2 + \ldots + 96^2 = 398,979, \quad
\sum_{i=1}^{n} x_{i}y_{i} = 36(27) + \ldots + 96(10) = 58,102
\end{displaymath}


      \begin{displaymath}
SS_{xy} = \sum_{i=1}^{n} x_{i}y_{i} - n\bar{x}\bar{y}
= 58,102 - 9(127.67)(26.56)
= 27583.76
\end{displaymath}


      \begin{displaymath}
SS_{xx} = \sum_{i=1}^{n} x_{i}^{2} - n(\bar{x})^{2}
= 398979 - 9( 127.67)^{2} = 252290
\end{displaymath}

      Therefore

      \begin{displaymath}
\hat{\beta}_{1} = SS_{xy}/SS_{xx} = 27583.76/252290 = 0.109
\end{displaymath}


      \begin{displaymath}
\hat{\beta}_{0} = \bar{y} - \hat{\beta}_{1} \bar{x}
= 26.56 - 0.109(127.67) = 12.64
\end{displaymath}

      So the least squares line is

      \begin{displaymath}
\hat{y} = 12.64 + 0.109x
\end{displaymath}

    3. Predict time to fill an order of 100 cases:

      \begin{displaymath}
\hat{y} = 12.64 + 0.109(100) = 23.54
\end{displaymath}
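
      The hand calculation can be reproduced from the summary quantities alone. The sketch below is one way to do it in Python; the function name `least_squares_from_summary` is hypothetical, and it uses the rounded means 127.67 and 26.56 because the full data set is not listed in these solutions. It carries the slope at full precision instead of rounding it to 0.109, so the intercept comes out nearer 12.60 than 12.64. This is the kind of small rounding difference that should not cost any points.

```python
def least_squares_from_summary(n, xbar, ybar, sum_x2, sum_xy):
    """Slope and intercept of the least squares line from summary statistics."""
    ss_xy = sum_xy - n * xbar * ybar   # SS_xy
    ss_xx = sum_x2 - n * xbar ** 2     # SS_xx
    slope = ss_xy / ss_xx              # beta1-hat
    intercept = ybar - slope * xbar    # beta0-hat
    return slope, intercept


slope, intercept = least_squares_from_summary(
    n=9, xbar=127.67, ybar=26.56, sum_x2=398979, sum_xy=58102
)
print(round(slope, 3), round(intercept, 2))

# Predicted time to fill an order of 100 cases.
predicted_time = intercept + slope * 100
print(round(predicted_time, 2))
```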

  4. Refer to 10.20, p. 471-472.

    1. Plot the data. A plot from Minitab is attached.

      From the plot, we see that there is not a strong relationship between household income and retail sales. In particular, there does not appear to be a linear relationship between income and sales.

    2. Find the least squares line.

      From the Minitab output below, we see that the least squares line is:

      \begin{displaymath}
\hat{y} = 1852 + 0.0109 x
\end{displaymath}

      Regression Analysis: Sales versus Income
      
      The regression equation is
      Sales = 1852 + 0.0109 Income
      
      Predictor        Coef     SE Coef          T        P
      Constant       1852.1       585.4       3.16    0.009
      Income        0.01095     0.01268       0.86    0.406
      
      S = 388.5       R-Sq = 6.4%      R-Sq(adj) = 0.0%
      
      Analysis of Variance
      
      Source            DF          SS          MS         F        P
      Regression         1      112587      112587      0.75    0.406
      Residual Error    11     1660261      150933
      Total             12     1772848
      
      Unusual Observations
      Obs     Income      Sales         Fit      SE Fit    Residual    St Resid
       10      44571       3215        2340         108         875        2.34R 
      
      R denotes an observation with a large standardized residual
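
      The entries in the Minitab output are tied together by a few simple identities, which can be verified with a short Python sketch. All of the numbers are copied from the output above; the variable names are only illustrative.

```python
import math

# Values copied from the Analysis of Variance table and coefficient table.
ss_regression, ss_error, ss_total = 112587, 1660261, 1772848
df_regression, df_error = 1, 11
coef_income, se_income = 0.01095, 0.01268

r_squared = ss_regression / ss_total                 # R-Sq = SSR / SST
ms_error = ss_error / df_error                       # MS for Residual Error
f_stat = (ss_regression / df_regression) / ms_error  # F = MSR / MSE
s = math.sqrt(ms_error)                              # S = sqrt(MSE)
t_income = coef_income / se_income                   # T = Coef / SE Coef

print(f"R-Sq = {100 * r_squared:.1f}%")
print(f"MSE = {ms_error:.0f}, F = {f_stat:.2f}, S = {s:.1f}")
print(f"T (Income) = {t_income:.2f}")
```

      With a single predictor, the square of the T statistic for the slope equals the F statistic up to rounding, which is why both rows report the same p-value of 0.406.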
      

    3. Plot regression line on scatter plot. Probably the easiest way to do this is to choose two $x$ values, then use the regression line to calculate the corresponding $y$ values on the line, and connect them.

      For example, if $x = 40000$, $y = 1852 + 0.0109(40000) =2288 $.

      If $x = 45000$, $y = 1852 + 0.0109(45000) = 2342.5$.

      Then connect (40000, 2288) and (45000, 2342.5).
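
      The two endpoints can be computed with a small helper, sketched here in Python. The function name `fitted_sales` is hypothetical, and the default coefficients are the rounded values 1852 and 0.0109 used in the calculation above.

```python
def fitted_sales(income, intercept=1852, slope=0.0109):
    """Value on the least squares line at the given average household income."""
    return intercept + slope * income


# Two x values spanning the plot; connect the resulting points with a ruler.
endpoints = [(x, round(fitted_sales(x), 1)) for x in (40000, 45000)]
print(endpoints)
```

      Any two distinct $x$ values within the range of the data would do equally well, since two points determine the line.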

    4. Interpret $\hat{\beta}_{0}$ and $\hat{\beta}_1$:

      $\hat{\beta}_1 = 0.0109$. This is the slope estimate. This tells us that as average household income rises by 1 dollar (1 unit), we predict retail sales to increase by 0.0109, or about 1 cent per household.

      $\hat{\beta}_{0} = 1852.1$. This is the y-intercept estimate. This tells us that if the average income is 0, we predict retail sales would be \$1852.10. This clearly doesn't make any sense, since an average income of 0 is far outside the range of the data. We shouldn't be very concerned with interpreting $\hat{\beta}_{0}$.

    5. Predict sales if the income is \$41,400.

      \begin{displaymath}
\hat{y} = 1852 + 0.0109(41400) = 2303.26
\end{displaymath}

      NOTE: This question originally had an error in the assignment question, which Mark corrected in the lab. If the incorrect value of \$2200 was used in this question, one (1) point will be lost.




Gary Sneddon 2003-09-24