Multiple Linear Regression in Minitab

 

Realtors in Albuquerque, NM are interested in predicting the selling price of a home with given
characteristics. To aid in this study, they collected information on 117 randomly selected homes
that were sold in the city. For each house, they attempted to collect information on 8 variables:

 

1.   PRICE: Selling price (hundreds of $)

2.   SQFT: Square feet of living space

3.   AGE: Age of home (years)

4.   FEATS: Number out of 11 features (dishwasher, fridge, microwave, etc.)

5.   NE: Located in northeast sector of city (1) or not (0)

6.   CUST: Custom built (1) or not (0)

7.   COR: Corner lot (1) or not (0)

8.   TAX: Annual taxes ($)

 

Can we come up with a regression model that predicts selling price from the size of the home, its

age and the number of features?

 

NOTE: The data set contains information on 117 homes. However, we don't have information on
all 8 variables for all 117 homes. Minitab can only use the homes for which every variable in the
model is recorded. As the output shows (Total DF = 67), only 68 houses have complete
information on PRICE, SQFT, AGE and FEATS.

 

Regression Analysis: PRICE versus SQFT, AGE, FEATS

 

 

The regression equation is

PRICE = - 28.6 + 0.686 SQFT - 4.01 AGE + 13.8 FEATS

 

 

Predictor        Coef     SE Coef          T        P

Constant       -28.56       99.16      -0.29    0.774

SQFT          0.68578     0.04760      14.41    0.000

AGE            -4.012       1.804      -2.22    0.030

FEATS           13.84       19.00       0.73    0.469
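
Each T value in the table above is the coefficient divided by its standard error. A minimal Python sketch, using only the numbers printed by Minitab (the dictionary and variable names are illustrative and not part of Minitab), reproduces the T column:

```python
# Coefficients and standard errors copied from the Minitab table above.
coef_table = {
    "Constant": (-28.56, 99.16),
    "SQFT": (0.68578, 0.04760),
    "AGE": (-4.012, 1.804),
    "FEATS": (13.84, 19.00),
}

# T = Coef / SE Coef, rounded to two decimals as in the Minitab output.
t_values = {name: round(coef / se, 2) for name, (coef, se) in coef_table.items()}

for name, t in t_values.items():
    print(f"{name:<10}{t:>8.2f}")
```

With 64 error degrees of freedom, a T value of roughly 2 or more in absolute value corresponds to a P value below about 0.05, which is consistent with SQFT and AGE being significant here and FEATS not.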

 

S = 183.6       R-Sq = 80.2%     R-Sq(adj) = 79.2%

 

Analysis of Variance

 

Source            DF          SS          MS         F        P

Regression         3     8723986     2907995     86.23    0.000

Residual Error    64     2158410       33725

Total             67    10882396
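
The summary statistics S, R-Sq, R-Sq(adj) and F can all be recovered from the sums of squares and degrees of freedom in the ANOVA table. The following Python sketch (variable names are illustrative) checks this with the values printed above:

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table above.
ss_regression, df_regression = 8723986, 3
ss_error, df_error = 2158410, 64
ss_total, df_total = 10882396, 67

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

f_stat = ms_regression / ms_error            # overall F test
s = math.sqrt(ms_error)                      # residual standard deviation
r_sq = ss_regression / ss_total              # proportion of variation explained
r_sq_adj = 1 - (ss_error / df_error) / (ss_total / df_total)

print(f"F = {f_stat:.2f}")
print(f"S = {s:.1f}")
print(f"R-Sq = {100 * r_sq:.1f}%")
print(f"R-Sq(adj) = {100 * r_sq_adj:.1f}%")
```

The results agree with the Minitab output: F = 86.23, S = 183.6, R-Sq = 80.2% and R-Sq(adj) = 79.2%.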

 

Source       DF      Seq SS

SQFT          1     8511518

AGE           1      194571

FEATS         1       17897
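
The sequential sums of squares show how much each predictor adds to the regression SS when the variables are entered in the order listed, so they add up to the regression SS. Because FEATS is entered last, dividing its sequential SS by the residual mean square gives a partial F statistic that should match the square of its T value, up to rounding. A short Python check, using only numbers from the output (names are illustrative):

```python
# Sequential sums of squares copied from the Minitab output above.
seq_ss = {"SQFT": 8511518, "AGE": 194571, "FEATS": 17897}
ss_regression = 8723986   # regression SS from the ANOVA table
ms_error = 33725          # residual mean square from the ANOVA table

# The sequential SS add up to the regression SS.
total_seq = sum(seq_ss.values())
print(total_seq == ss_regression)

# Partial F for FEATS, the last variable entered (1 numerator df).
partial_f_feats = seq_ss["FEATS"] / ms_error
t_feats = 13.84 / 19.00   # T for FEATS from the coefficient table
print(round(partial_f_feats, 2), round(t_feats ** 2, 2))
```

Nearly all of the regression SS comes from SQFT; once SQFT and AGE are in the model, FEATS adds very little.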

 

Unusual Observations

Obs       SQFT      PRICE         Fit      SE Fit    Residual    St Resid

 15       1928     1170.0      1332.1        78.3      -162.1       -0.98 X

 50       2743     1299.0      1821.4        53.4      -522.4       -2.97R

 89       2116     2100.0      1363.8        39.6       736.2        4.11R

 94       2250     1844.0      1437.0        67.1       407.0        2.38R

 

R denotes an observation with a large standardized residual

X denotes an observation whose X value gives it large influence.
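
The Residual column is PRICE minus Fit, and the standardized residual divides that residual by its estimated standard deviation. Under the usual least-squares formulas, this standard deviation is the square root of MSE minus (SE Fit) squared. The Python sketch below (names are illustrative) applies that formula to the four flagged observations:

```python
import math

ms_error = 33725  # residual mean square from the ANOVA table (S squared)

# (Obs, PRICE, Fit, SE Fit) copied from the Unusual Observations table.
rows = [
    (15, 1170.0, 1332.1, 78.3),
    (50, 1299.0, 1821.4, 53.4),
    (89, 2100.0, 1363.8, 39.6),
    (94, 1844.0, 1437.0, 67.1),
]

std_resids = {}
for obs, price, fit, se_fit in rows:
    residual = price - fit
    # Var(residual) = MSE * (1 - leverage) = MSE - (SE Fit)^2
    std_resids[obs] = residual / math.sqrt(ms_error - se_fit ** 2)
    print(f"{obs:>3} {residual:>8.1f} {std_resids[obs]:>6.2f}")
```

The computed values match the St Resid column (-0.98, -2.97, 4.11, 2.38). The three observations marked R all exceed 2 in absolute value, while observation 15 is flagged X because of its unusually large SE Fit, not because of its residual.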

 

Predicted Values for New Observations

 

New Obs     Fit     SE Fit         99.0% CI             99.0% PI

1        1234.9       29.4   (  1156.9,  1313.0)  (   741.2,  1728.7)  

 

Values of Predictors for New Observations

 

New Obs      SQFT       AGE     FEATS

1            1800      10.0      5.00
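
The fitted value and both 99% intervals for this new home can be reconstructed by hand from the earlier output. The Python sketch below is an illustration, not Minitab's own computation; the critical value t_crit = 2.655 is an assumed approximation to the 0.995 quantile of the t distribution with 64 degrees of freedom, taken from a t table.

```python
import math

# Coefficients from the coefficient table (more digits than the rounded equation).
b0, b_sqft, b_age, b_feats = -28.56, 0.68578, -4.012, 13.84

# New observation from the output above.
sqft, age, feats = 1800, 10.0, 5.0
fit = b0 + b_sqft * sqft + b_age * age + b_feats * feats

se_fit = 29.4       # SE Fit reported by Minitab
ms_error = 33725    # residual mean square from the ANOVA table
t_crit = 2.655      # assumed: approximate 0.995 quantile of t with 64 df

# The confidence interval for the mean response uses SE Fit alone.
ci_half = t_crit * se_fit
# The prediction interval adds the variance of a single new observation.
pi_half = t_crit * math.sqrt(ms_error + se_fit ** 2)

print(f"Fit = {fit:.1f}")
print(f"99% CI = ({fit - ci_half:.1f}, {fit + ci_half:.1f})")
print(f"99% PI = ({fit - pi_half:.1f}, {fit + pi_half:.1f})")
```

The results agree with Minitab to within rounding of the inputs. The prediction interval is much wider than the confidence interval because it describes the price of one individual home, not the average price of all homes with these characteristics.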

 

 

 

 

 

Plot of (Standardized) Residuals vs. Predicted Values