Below is some SHAZAM output associated with the questions in this problem set. Note that this was the first time you were expected to write your own SHAZAM programs completely from scratch. Prior assignments provided a *.sha program to guide you, but things are now getting more realistic. Some people have discovered that while the order of explanatory variables in an OLS command does not matter, the order of variables in a read statement is crucial, since this must match the order of appearance of variables in the data file being read.
|_sample 1 36
|_read(prodtn.dat) q usk sk
UNIT 88 IS NOW ASSIGNED TO: prodtn.dat
3 VARIABLES AND 36 OBSERVATIONS STARTING AT OBS 1
Always do a STAT to ensure that your data are what you expect them to be.
|_stat / pcor
NAME N MEAN ST. DEV VARIANCE MINIMUM MAXIMUM
Q 36 142.94 61.833 3823.4 14.297 272.70
USK 36 52.417 31.465 990.02 10.000 110.00
SK 36 7.0000 2.2678 5.1429 4.0000 10.000
CORRELATION MATRIX OF VARIABLES - 36 OBSERVATIONS
Q 1.0000
USK 0.80422 1.0000
SK 0.37603 0.34035E-01 1.0000
Q USK SK
|_* a.) linear model
|_ols q usk sk
REQUIRED MEMORY IS PAR= 3 CURRENT PAR= 500
OLS ESTIMATION
36 OBSERVATIONS DEPENDENT VARIABLE = Q
...NOTE..SAMPLE RANGE SET TO: 1, 36
R-SQUARE = 0.7685 R-SQUARE ADJUSTED = 0.7544
VARIANCE OF THE ESTIMATE-SIGMA**2 = 938.88
STANDARD ERROR OF THE ESTIMATE-SIGMA = 30.641
SUM OF SQUARED ERRORS-SSE= 30983.
MEAN OF DEPENDENT VARIABLE = 142.94
LOG OF THE LIKELIHOOD FUNCTION = -172.720
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 0.10283E+06 2. 51417. 54.764
ERROR 30983. 33. 938.88 P-VALUE
TOTAL 0.13382E+06 35. 3823.4 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.83838E+06 3. 0.27946E+06 297.654
ERROR 30983. 33. 938.88 P-VALUE
TOTAL 0.86937E+06 36. 24149. 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 33 DF P-VALUE CORR. COEFFICIENT AT MEANS
USK 1.5571 0.1647 9.454 0.000 0.855 0.7923 0.5710
SK 9.5176 2.285 4.165 0.000 0.587 0.3491 0.4661
CONSTANT -5.2993 18.63 -0.2844 0.778-0.049 0.0000 -0.0371
Individually, the coefficients on both USK and SK are
strongly statistically significantly different from zero.
|_* b.) quadratic models
|_genr usk2=usk*usk
|_genr sk2=sk*sk
|_ols q usk usk2 sk
REQUIRED MEMORY IS PAR= 4 CURRENT PAR= 500
OLS ESTIMATION
36 OBSERVATIONS DEPENDENT VARIABLE = Q
...NOTE..SAMPLE RANGE SET TO: 1, 36
R-SQUARE = 0.9153 R-SQUARE ADJUSTED = 0.9073
VARIANCE OF THE ESTIMATE-SIGMA**2 = 354.37
STANDARD ERROR OF THE ESTIMATE-SIGMA = 18.825
SUM OF SQUARED ERRORS-SSE= 11340.
MEAN OF DEPENDENT VARIABLE = 142.94
LOG OF THE LIKELIHOOD FUNCTION = -154.628
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 0.12248E+06 3. 40826. 115.207
ERROR 11340. 32. 354.37 P-VALUE
TOTAL 0.13382E+06 35. 3823.4 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.85803E+06 4. 0.21451E+06 605.322
ERROR 11340. 32. 354.37 P-VALUE
TOTAL 0.86937E+06 36. 24149. 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 32 DF P-VALUE CORR. COEFFICIENT AT MEANS
USK 4.6751 0.4308 10.85 0.000 0.887 2.3790 1.7144
USK2 -0.26160E-01 0.3514E-02 -7.445 0.000 -0.796 -1.6317 -0.6790
SK 9.0153 1.406 6.414 0.000 0.750 0.3306 0.4415
CONSTANT -68.166 14.22 -4.793 0.000-0.646 0.0000 -0.4769
Quadratic term in USK makes a significant contribution to the model.
|_ols q usk sk sk2
REQUIRED MEMORY IS PAR= 4 CURRENT PAR= 500
OLS ESTIMATION
36 OBSERVATIONS DEPENDENT VARIABLE = Q
...NOTE..SAMPLE RANGE SET TO: 1, 36
R-SQUARE = 0.7689 R-SQUARE ADJUSTED = 0.7472
VARIANCE OF THE ESTIMATE-SIGMA**2 = 966.48
STANDARD ERROR OF THE ESTIMATE-SIGMA = 31.088
SUM OF SQUARED ERRORS-SSE= 30927.
MEAN OF DEPENDENT VARIABLE = 142.94
LOG OF THE LIKELIHOOD FUNCTION = -172.688
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 0.10289E+06 3. 34297. 35.486
ERROR 30927. 32. 966.48 P-VALUE
TOTAL 0.13382E+06 35. 3823.4 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.83844E+06 4. 0.20961E+06 216.881
ERROR 30927. 32. 966.48 P-VALUE
TOTAL 0.86937E+06 36. 24149. 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 32 DF P-VALUE CORR. COEFFICIENT AT MEANS
USK 1.5569 0.1671 9.317 0.000 0.855 0.7922 0.5709
SK 5.1606 18.28 0.2823 0.780 0.050 0.1893 0.2527
SK2 0.31122 1.295 0.2403 0.812 0.042 0.1611 0.1176
CONSTANT 8.4031 60.08 0.1399 0.890 0.025 0.0000 0.0588
When you add a quadratic term in SK, neither the coefficient on SK nor that on SK2 is individually statistically significantly different from zero.
|_test
|_test sk=0
|_test sk2=0
|_end
This test automates the process of doing two regressions (one unrestricted, and one restricted) finding the explained sum of squares from each model's Analysis of Variance from Means table, taking the difference, dividing by the number of restrictions by which the models differ, and then dividing the whole thing by the error variance of the unrestricted model.
F STATISTIC = 8.4544796 WITH 2 AND 32 D.F. P-VALUE= 0.00113
WALD CHI-SQUARE STATISTIC = 16.908959 WITH 2 D.F. P-VALUE= 0.00021
UPPER BOUND ON P-VALUE BY CHEBYCHEV INEQUALITY = 0.11828
Since the p-value is LESS than 0.05, we easily reject the hypothesis that both coefficients (on SK and SK2) could be jointly zero. This is consistent with the finding that the coefficient on SK when it is used alone is strongly different from zero.
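The arithmetic behind this joint test can be checked by hand from numbers already printed above. The sketch below is in Python rather than SHAZAM, and the function name `f_test_from_sse` is hypothetical. The restricted model (Q on USK alone) is never estimated in this output, so its error sum of squares is backed out from the simple correlation between Q and USK in the STAT / PCOR listing; that step is an assumption of this sketch, not something SHAZAM reports. Because both models share the same total sum of squares, the difference in error sums of squares equals the difference in explained sums of squares.

```python
def f_test_from_sse(sse_restricted, sse_unrestricted, n_restrictions, df_unrestricted):
    """F statistic for a set of linear restrictions, from two error sums of squares."""
    numerator = (sse_restricted - sse_unrestricted) / n_restrictions
    denominator = sse_unrestricted / df_unrestricted  # error variance, unrestricted model
    return numerator / denominator


# Values copied from the output above.
total_ss = 0.13382e06        # TOTAL SS, analysis of variance from mean
r_q_usk = 0.80422            # correlation of Q and USK
sse_unrestricted = 30927.0   # model: q usk sk sk2
df_unrestricted = 32

# Assumption: with USK as the only regressor, R-squared is the squared correlation.
sse_restricted = total_ss * (1.0 - r_q_usk ** 2)

f_stat = f_test_from_sse(sse_restricted, sse_unrestricted, 2, df_unrestricted)
wald_chi2 = 2 * f_stat  # Wald statistic = (number of restrictions) x F
print(round(f_stat, 3), round(wald_chi2, 3))
```

Up to rounding in the printed inputs, the two numbers should land close to the F STATISTIC and WALD CHI-SQUARE values that SHAZAM reports.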
|_ols q usk usk2 sk sk2 / coef=b
Saving the fitted coefficients allows us to refer to them later on...
REQUIRED MEMORY IS PAR= 4 CURRENT PAR= 500
OLS ESTIMATION
36 OBSERVATIONS DEPENDENT VARIABLE = Q
...NOTE..SAMPLE RANGE SET TO: 1, 36
R-SQUARE = 0.9155 R-SQUARE ADJUSTED = 0.9046
VARIANCE OF THE ESTIMATE-SIGMA**2 = 364.61
STANDARD ERROR OF THE ESTIMATE-SIGMA = 19.095
SUM OF SQUARED ERRORS-SSE= 11303.
MEAN OF DEPENDENT VARIABLE = 142.94
LOG OF THE LIKELIHOOD FUNCTION = -154.569
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 0.12251E+06 4. 30629. 84.004
ERROR 11303. 31. 364.61 P-VALUE
TOTAL 0.13382E+06 35. 3823.4 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.85806E+06 5. 0.17161E+06 470.674
ERROR 11303. 31. 364.61 P-VALUE
TOTAL 0.86937E+06 36. 24149. 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 31 DF P-VALUE CORR. COEFFICIENT AT MEANS
USK 4.6736 0.4371 10.69 0.000 0.887 2.3782 1.7138
USK2 -0.26148E-01 0.3564E-02 -7.336 0.000-0.797 -1.6310 -0.6787
SK 5.4745 11.23 0.4875 0.629 0.087 0.2008 0.2681
SK2 0.25294 0.7957 0.3179 0.753 0.057 0.1309 0.0956
CONSTANT -57.003 37.97 -1.501 0.143-0.260 0.0000 -0.3988
|_* calculate values of usk and sk where derivative goes to zero
The marginal productivity of USK changes over the range of the data. The
quantity of USK that produces the most output is about 89. While there is no statistical evidence of curvature with respect to SK, we have included a quadratic term anyway. A minimum occurs at -10.8 units, so the relevant part of the curve is rising in SK, but the whole curvature story is not really warranted in the SK direction.
|_gen1 uskstar=-(b:1)/(2*b:2)
|_gen1 skstar=-(b:3)/(2*b:4)
|_print uskstar skstar
USKSTAR
89.36683
SKSTAR
-10.82188
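The two GEN1 lines above set the derivative of a quadratic to zero. As a cross-check, here is a minimal Python sketch of the same calculation using the rounded coefficients from the table; the function name `turning_point` is hypothetical, and small differences from the SHAZAM values are just rounding in the printed coefficients.

```python
def turning_point(linear_coef, quadratic_coef):
    """Value of x where d/dx (a + b*x + c*x**2) = b + 2*c*x equals zero."""
    return -linear_coef / (2.0 * quadratic_coef)


# Coefficients copied from the regression of q on usk, usk2, sk, sk2.
usk_star = turning_point(4.6736, -0.26148e-01)
sk_star = turning_point(5.4745, 0.25294)

# A negative quadratic coefficient gives a maximum, a positive one a minimum.
print(round(usk_star, 2), round(sk_star, 2))
```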
|_* c.)
|_genr usksk=usk*sk
The interaction term means each derivative of q with respect to usk and sk depends on the level of the "other" input.
|_ols q usk usk2 sk usksk
REQUIRED MEMORY IS PAR= 5 CURRENT PAR= 500
OLS ESTIMATION
36 OBSERVATIONS DEPENDENT VARIABLE = Q
...NOTE..SAMPLE RANGE SET TO: 1, 36
R-SQUARE = 0.9294 R-SQUARE ADJUSTED = 0.9203
VARIANCE OF THE ESTIMATE-SIGMA**2 = 304.81
STANDARD ERROR OF THE ESTIMATE-SIGMA = 17.459
SUM OF SQUARED ERRORS-SSE= 9449.0
MEAN OF DEPENDENT VARIABLE = 142.94
LOG OF THE LIKELIHOOD FUNCTION = -151.344
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 0.12437E+06 4. 31092. 102.006
ERROR 9449.0 31. 304.81 P-VALUE
TOTAL 0.13382E+06 35. 3823.4 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.85992E+06 5. 0.17198E+06 564.238
ERROR 9449.0 31. 304.81 P-VALUE
TOTAL 0.86937E+06 36. 24149. 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 31 DF P-VALUE CORR. COEFFICIENT AT MEANS
USK 3.9500 0.4944 7.989 0.000 0.820 2.0100 1.4485
USK2 -0.26178E-01 0.3259E-02 -8.033 0.000-0.822 -1.6329 -0.6795
SK 3.5029 2.569 1.364 0.182 0.238 0.1285 0.1715
USKSK 0.10487 0.4211E-01 2.491 0.018 0.408 0.4439 0.2709
CONSTANT -30.230 20.15 -1.500 0.144-0.260 0.0000 -0.2115
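To make the role of the interaction term concrete, the following Python sketch evaluates the two marginal products implied by the table just above. The function names are hypothetical, and evaluating at the sample means of USK and SK (taken from the STAT output) is simply a convenient choice for illustration.

```python
# Coefficients copied from the regression of q on usk, usk2, sk, usksk.
B_USK, B_USK2, B_SK, B_USKSK = 3.9500, -0.26178e-01, 3.5029, 0.10487


def mp_usk(usk, sk):
    """dq/dusk = b_usk + 2*b_usk2*usk + b_usksk*sk."""
    return B_USK + 2.0 * B_USK2 * usk + B_USKSK * sk


def mp_sk(usk):
    """dq/dsk = b_sk + b_usksk*usk (no squared SK term in this model)."""
    return B_SK + B_USKSK * usk


USK_MEAN, SK_MEAN = 52.417, 7.0000  # sample means from STAT

mp_usk_at_means = mp_usk(USK_MEAN, SK_MEAN)
mp_sk_at_means = mp_sk(USK_MEAN)
print(round(mp_usk_at_means, 4), round(mp_sk_at_means, 4))
```

Calling `mp_usk` with the same USK but a larger SK returns a larger value, which is the sense in which a positive interaction coefficient makes each input more productive when there is more of the other.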
The interaction term's coefficient is statistically significantly different from zero, so each derivative (marginal productivity) increases in the amount of the other input. Note that with the interaction term, the linear effect of SK drops into insignificance.
|_* d.)
|_genr lq=log(q)
|_genr lusk=log(usk)
|_genr lsk=log(sk)
|_genr lusk2=lusk*lusk
|_genr lsk2=lsk*lsk
|_genr lusklsk=lusk*lsk
Including the LOGLOG option on the ols command allows comparison of the maximized log-likelihood values for models that use q and lq as dependent variables. If you don't tell SHAZAM that the dependent variable is a logged quantity, it has no way of knowing. With the option in place, the results of the particular regression are not affected, but the reported maximized log-likelihood (and elasticities) are computed differently.
|_ols lq lusk lsk / loglog
REQUIRED MEMORY IS PAR= 6 CURRENT PAR= 500
OLS ESTIMATION
36 OBSERVATIONS DEPENDENT VARIABLE = LQ
...NOTE..SAMPLE RANGE SET TO: 1, 36
R-SQUARE = 0.8318 R-SQUARE ADJUSTED = 0.8216
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.71946E-01
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.26823
SUM OF SQUARED ERRORS-SSE= 2.3742
MEAN OF DEPENDENT VARIABLE = 4.8187
LOG OF THE LIKELIHOOD FUNCTION(IF DEPVAR LOG) = -175.616
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 11.742 2. 5.8710 81.603
ERROR 2.3742 33. 0.71946E-01 P-VALUE
TOTAL 14.116 35. 0.40332 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 847.66 3. 282.55 3927.292
ERROR 2.3742 33. 0.71946E-01 P-VALUE
TOTAL 850.04 36. 23.612 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 33 DF P-VALUE CORR. COEFFICIENT AT MEANS
LUSK 0.76361 0.6317E-01 12.09 0.000 0.903 0.8636 0.7636
LSK 0.48040 0.1306 3.679 0.001 0.539 0.2628 0.4804
CONSTANT 1.0518 0.3384 3.108 0.004 0.476 0.0000 1.0518
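The log-likelihood reported with the LOGLOG option can be reproduced from the printed SSE and the mean of LQ. The Python sketch below uses the standard concentrated normal log-likelihood and subtracts the sum of ln(q) as a Jacobian term. That SHAZAM computes the figure exactly this way is an assumption here, although the formula matches the printed values for both the linear model in part a.) and this log-log model. Function names are hypothetical.

```python
import math


def ols_log_likelihood(sse, n):
    """Concentrated normal log-likelihood of an OLS fit with error SS `sse`."""
    return -0.5 * n * (1.0 + math.log(2.0 * math.pi) + math.log(sse / n))


def loglog_log_likelihood(sse_logs, n, mean_log_y):
    """Log-likelihood expressed in terms of y itself when ln(y) was regressed.

    The Jacobian of the log transformation contributes -sum(ln y),
    which equals -n times the mean of ln(y).
    """
    return ols_log_likelihood(sse_logs, n) - n * mean_log_y


ll_linear = ols_log_likelihood(30983.0, 36)            # q on usk, sk
ll_loglog = loglog_log_likelihood(2.3742, 36, 4.8187)  # lq on lusk, lsk
print(round(ll_linear, 2), round(ll_loglog, 2))
```

Once both figures are on the same footing, the linear and log-log specifications can be compared directly by their log-likelihood values.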
|_ols lq lusk lusk2 lsk / loglog
REQUIRED MEMORY IS PAR= 6 CURRENT PAR= 500
OLS ESTIMATION
36 OBSERVATIONS DEPENDENT VARIABLE = LQ
...NOTE..SAMPLE RANGE SET TO: 1, 36
R-SQUARE = 0.9134 R-SQUARE ADJUSTED = 0.9053
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.38197E-01
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.19544
SUM OF SQUARED ERRORS-SSE= 1.2223
MEAN OF DEPENDENT VARIABLE = 4.8187
LOG OF THE LIKELIHOOD FUNCTION(IF DEPVAR LOG) = -163.665
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 12.894 3. 4.2980 112.521
ERROR 1.2223 32. 0.38197E-01 P-VALUE
TOTAL 14.116 35. 0.40332 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 848.81 4. 212.20 5555.508
ERROR 1.2223 32. 0.38197E-01 P-VALUE
TOTAL 850.04 36. 23.612 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 32 DF P-VALUE CORR. COEFFICIENT AT MEANS
LUSK 3.0776 0.4239 7.261 0.000 0.789 3.4806 3.0776
LUSK2 -0.32650 0.5946E-01 -5.492 0.000-0.697 -2.6326 -0.3265
LSK 0.48512 0.9515E-01 5.099 0.000 0.670 0.2654 0.4851
CONSTANT -2.8802 0.7573 -3.803 0.001-0.558 0.0000 -2.8802
|_ols lq lusk lsk lsk2 / loglog
REQUIRED MEMORY IS PAR= 6 CURRENT PAR= 500
OLS ESTIMATION
36 OBSERVATIONS DEPENDENT VARIABLE = LQ
...NOTE..SAMPLE RANGE SET TO: 1, 36
R-SQUARE = 0.8344 R-SQUARE ADJUSTED = 0.8189
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.73056E-01
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.27029
SUM OF SQUARED ERRORS-SSE= 2.3378
MEAN OF DEPENDENT VARIABLE = 4.8187
LOG OF THE LIKELIHOOD FUNCTION(IF DEPVAR LOG) = -175.338
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 11.778 3. 3.9261 53.742
ERROR 2.3378 32. 0.73056E-01 P-VALUE
TOTAL 14.116 35. 0.40332 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 847.70 4. 211.92 2900.855
ERROR 2.3378 32. 0.73056E-01 P-VALUE
TOTAL 850.04 36. 23.612 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 32 DF P-VALUE CORR. COEFFICIENT AT MEANS
LUSK 0.76283 0.6366E-01 11.98 0.000 0.904 0.8627 0.7628
LSK -0.78744 1.800 -0.4375 0.665-0.077 -0.4308 -0.7874
LSK2 0.34549 0.4892 0.7062 0.485 0.124 0.6955 0.3455
CONSTANT 2.1762 1.628 1.337 0.191 0.230 0.0000 2.1762
As before, curvature in the direction of SK does not seem to be present.
|_ols lq lusk lusk2 lsk lsk2 / loglog
REQUIRED MEMORY IS PAR= 7 CURRENT PAR= 500
OLS ESTIMATION
36 OBSERVATIONS DEPENDENT VARIABLE = LQ
...NOTE..SAMPLE RANGE SET TO: 1, 36
R-SQUARE = 0.9159 R-SQUARE ADJUSTED = 0.9051
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.38295E-01
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.19569
SUM OF SQUARED ERRORS-SSE= 1.1871
MEAN OF DEPENDENT VARIABLE = 4.8187
LOG OF THE LIKELIHOOD FUNCTION(IF DEPVAR LOG) = -163.140
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 12.929 4. 3.2323 84.404
ERROR 1.1871 31. 0.38295E-01 P-VALUE
TOTAL 14.116 35. 0.40332 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 848.85 5. 169.77 4433.199
ERROR 1.1871 31. 0.38295E-01 P-VALUE
TOTAL 850.04 36. 23.612 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 31 DF P-VALUE CORR. COEFFICIENT AT MEANS
LUSK 3.0756 0.4244 7.246 0.000 0.793 3.4783 3.0756
LUSK2 -0.32632 0.5953E-01 -5.481 0.000-0.702 -2.6312 -0.3263
LSK -0.76019 1.303 -0.5833 0.564-0.104 -0.4159 -0.7602
LSK2 0.33935 0.3542 0.9581 0.345 0.170 0.6832 0.3394
CONSTANT -1.7736 1.382 -1.284 0.209-0.225 0.0000 -1.7736
Curvature in the USK dimension is present, but apparently not in the SK direction. No diminishing MP of sk within the range of the data.
|_ols lq lusk lusk2 lsk lsk2 lusklsk / loglog
REQUIRED MEMORY IS PAR= 7 CURRENT PAR= 500
OLS ESTIMATION
36 OBSERVATIONS DEPENDENT VARIABLE = LQ
...NOTE..SAMPLE RANGE SET TO: 1, 36
R-SQUARE = 0.9248 R-SQUARE ADJUSTED = 0.9123
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.35384E-01
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.18811
SUM OF SQUARED ERRORS-SSE= 1.0615
MEAN OF DEPENDENT VARIABLE = 4.8187
LOG OF THE LIKELIHOOD FUNCTION(IF DEPVAR LOG) = -161.127
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 13.055 5. 2.6109 73.788
ERROR 1.0615 30. 0.35384E-01 P-VALUE
TOTAL 14.116 35. 0.40332 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 848.97 6. 141.50 3998.840
ERROR 1.0615 30. 0.35384E-01 P-VALUE
TOTAL 850.04 36. 23.612 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 30 DF P-VALUE CORR. COEFFICIENT AT MEANS
LUSK 3.5616 0.4827 7.379 0.000 0.803 4.0280 3.5616
LUSK2 -0.32980 0.5725E-01 -5.760 0.000-0.725 -2.6592 -0.3298
LSK 0.11856 1.337 0.8869E-01 0.930 0.016 0.0649 0.1186
LSK2 0.34962 0.3405 1.027 0.313 0.184 0.7038 0.3496
LUSKLSK -0.24465 0.1298 -1.884 0.069-0.325 -0.7430 -0.2447
CONSTANT -3.5082 1.616 -2.171 0.038-0.368 0.0000 -3.5082
In the log-log model with interaction term, the coefficient on the interaction term is not statistically significantly different from zero at the 5% level.
|_* e.) we will cover this part in the lab
2. This data set was used in a lab session in some previous years. Here are the relevant bits of output:
|_* Suppose you have a sample of mid-level managers who have been surveyed
|_* concerning the number of hours per week they spend on work-related activities,
|_* either in the office or at home. (These data are fictional.) The dependent
|_* variable is HOURS (per week, averaged over a three-month period) and the
|_* explanatory variables you are considering are:
|_* FEMALE=1 if female; 0 if male
|_* SPOUSE=1 if married or equivalent; 0 otherwise
|_* SWORK=1 if spouse full-time employed; 0 otherwise
|_sample 1 60
|_read(mgr.dat) hours female spouse swork
|_stat
NAME N MEAN ST. DEV VARIANCE MINIMUM MAXIMUM
HOURS 60 40.924 5.8742 34.506 26.630 52.760
FEMALE 60 0.33333 0.47538 0.22599 0.00000 1.0000
SPOUSE 60 0.76667 0.42652 0.18192 0.00000 1.0000
SWORK 60 0.35000 0.48099 0.23136 0.00000 1.0000
|_* Q# 2a) what is marginal mean number of hours worked by all managers?
|_ols hours
R-SQUARE = 0.0000 R-SQUARE ADJUSTED = 0.0000
VARIANCE OF THE ESTIMATE-SIGMA**2 = 34.506
STANDARD ERROR OF THE ESTIMATE-SIGMA = 5.8742
SUM OF SQUARED ERRORS-SSE= 2035.9
MEAN OF DEPENDENT VARIABLE = 40.924
LOG OF THE LIKELIHOOD FUNCTION = -190.866
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION -0.11823E-10 0. 0.00000 0.000
ERROR 2035.9 59. 34.506 P-VALUE
TOTAL 2035.9 59. 34.506 1.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 59 DF P-VALUE CORR. COEFFICIENT AT MEANS
CONSTANT 40.924 0.7584 53.96 0.000 0.990 0.0000 1.0000
Y-bar is about 41 hours if we lump all types of managers together. Note that the R-squared value is zero here, because no regressors are used.
|_* Q# 2b) next see how manager hours depends on gender
|_ols hours female
R-SQUARE = 0.1068 R-SQUARE ADJUSTED = 0.0914
VARIANCE OF THE ESTIMATE-SIGMA**2 = 31.354
STANDARD ERROR OF THE ESTIMATE-SIGMA = 5.5995
SUM OF SQUARED ERRORS-SSE= 1818.5
MEAN OF DEPENDENT VARIABLE = 40.924
LOG OF THE LIKELIHOOD FUNCTION = -187.479
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 217.35 1. 217.35 6.932
ERROR 1818.5 58. 31.354 P-VALUE
TOTAL 2035.9 59. 34.506 0.011
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 58 DF P-VALUE CORR. COEFFICIENT AT MEANS
FEMALE 4.0375 1.533 2.633 0.011 0.327 0.3267 0.0329
CONSTANT 39.578 0.8854 44.70 0.000 0.986 0.0000 0.9671
For the male manager group, FEMALE takes on a value of zero, so mean hours for male managers are just 39.578, with a standard error of 0.8854 hours. For
female managers, the FEMALE variable is always equal to one, so female manager mean hours is the sum of the "intercept" and the "slope" in this model. Specifically, (39.578 + 4.0375). Female managers work, on average, 4.0375 hours more per week than male managers. Is this difference statistically significant? Yes. The P-value on the differences is less than 0.05, so we can reject the hypothesis that the difference (the slope on FEMALE) is zero.
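A quick way to see that a regression on a single dummy just recovers group means is to rebuild the overall mean from the two fitted values. This Python sketch uses the printed coefficients and the sample share of women from the STAT output; the function name `expected_hours` is hypothetical.

```python
INTERCEPT, B_FEMALE = 39.578, 4.0375  # from the regression of hours on female


def expected_hours(female):
    """Fitted mean hours: female = 0 for men, 1 for women."""
    return INTERCEPT + B_FEMALE * female


male_mean = expected_hours(0)
female_mean = expected_hours(1)

share_female = 0.33333  # mean of FEMALE from STAT
overall_mean = (1 - share_female) * male_mean + share_female * female_mean
print(round(male_mean, 3), round(female_mean, 3), round(overall_mean, 3))
```

The weighted average of the two group means should come back to the overall mean of HOURS from part 2a, up to rounding.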
|_* Q# 2c) does marital status affect expected work hours?
|_ols hours female spouse
R-SQUARE = 0.2289 R-SQUARE ADJUSTED = 0.2019
VARIANCE OF THE ESTIMATE-SIGMA**2 = 27.540
STANDARD ERROR OF THE ESTIMATE-SIGMA = 5.2478
SUM OF SQUARED ERRORS-SSE= 1569.8
MEAN OF DEPENDENT VARIABLE = 40.924
LOG OF THE LIKELIHOOD FUNCTION = -183.066
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 466.11 2. 233.06 8.463
ERROR 1569.8 57. 27.540 P-VALUE
TOTAL 2035.9 59. 34.506 0.001
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 57 DF P-VALUE CORR. COEFFICIENT AT MEANS
FEMALE 1.8862 1.606 1.175 0.245 0.154 0.1526 0.0154
SPOUSE -5.3783 1.789 -3.005 0.004-0.370 -0.3905 -0.1008
CONSTANT 44.418 1.812 24.52 0.000 0.956 0.0000 1.0854
If we control for whether a manager is male or female and then look at the average difference in work hours across managers without spouses and managers with spouses, the point estimate of the difference between these two groups is -5.3783, and this
difference is statistically significant. It appears that having a spouse, on average across the sample, lowers your weekly hours by more than 5. When interpreting this regression, the intercept gives mean work hours for the group with zero values for both FEMALE and SPOUSE--namely, single male managers. For single female managers, expected work hours is (44.418+1.8862). For married male managers, expected work hours is (44.418-5.3783). For married female managers, expected work hours is (44.418+1.8862-5.3783). Since there are no interaction terms, the effect of FEMALE on expected work hours is the same regardless of whether a manager is married or not. Likewise, absent any interaction terms, the effect of SPOUSE on expected work hours is the same for males and females. However, notice that when you control for marital status the difference between male and female manager hours becomes statistically
insignificant at the 5% level. This means that FEMALE and SPOUSE must be somewhat correlated.
|_* Q# 2d) difference in effect of having a spouse according to whether
|_* the spouse works or not
|_ols hours female spouse swork
R-SQUARE = 0.2562 R-SQUARE ADJUSTED = 0.2163
VARIANCE OF THE ESTIMATE-SIGMA**2 = 27.042
STANDARD ERROR OF THE ESTIMATE-SIGMA = 5.2002
SUM OF SQUARED ERRORS-SSE= 1514.4
MEAN OF DEPENDENT VARIABLE = 40.924
LOG OF THE LIKELIHOOD FUNCTION = -181.988
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 521.52 3. 173.84 6.429
ERROR 1514.4 56. 27.042 P-VALUE
TOTAL 2035.9 59. 34.506 0.001
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 56 DF P-VALUE CORR. COEFFICIENT AT MEANS
FEMALE 0.88742 1.737 0.5108 0.611 0.068 0.0718 0.0072
SPOUSE -6.9729 2.094 -3.330 0.002-0.407 -0.5063 -0.1306
SWORK 2.4060 1.681 1.431 0.158 0.188 0.1970 0.0206
CONSTANT 45.132 1.863 24.22 0.000 0.955 0.0000 1.1028
We only observe whether a spouse works or not if there IS a spouse. If there is no spouse, you should find that the SWORK variable takes on a value of zero,
rather than being undefined. In a sense, the undefined values are set to zero because we have implicitly "multiplied" the SWORK data by the SPOUSE variable. If SPOUSE=1, SWORK equals 1 or zero according to whether or not the spouse is employed. If SPOUSE=0, (0*undefined) is set equal to zero, avoiding the problem of undefined variable values. In this model, the effect of having a spouse on manager hours is the derivative of HOURS with respect to spouse, which equals the coefficient on SPOUSE plus the coefficient on SWORK times SWORK. Conceptually, you want to know what happens to E[HOURS] when SPOUSE goes from zero to one. The answer depends on whether the spouse works or not. If not, SWORK is zero and the answer is "hours fall by 6.9729." If the spouse works, you get not only the -6.9729, but also the
+2.4060 term. Thus, the answer to the question posed is just "+2.4060 hours," although this number is not statistically significantly different from zero. Statistically, there is no difference in the effect of having a spouse according to whether that spouse works or not.
|_* Q# 2e) generate some interesting interaction terms:
|_genr spousef=spouse*female
|_genr sworkf=swork*female
|_ols hours female spouse spousef swork sworkf
R-SQUARE = 0.3501 R-SQUARE ADJUSTED = 0.2900
VARIANCE OF THE ESTIMATE-SIGMA**2 = 24.501
STANDARD ERROR OF THE ESTIMATE-SIGMA = 4.9498
SUM OF SQUARED ERRORS-SSE= 1323.0
MEAN OF DEPENDENT VARIABLE = 40.924
LOG OF THE LIKELIHOOD FUNCTION = -177.936
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 712.84 5. 142.57 5.819
ERROR 1323.0 54. 24.501 P-VALUE
TOTAL 2035.9 59. 34.506 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 54 DF P-VALUE CORR. COEFFICIENT AT MEANS
FEMALE 6.9590 2.928 2.376 0.021 0.308 0.5632 0.0567
SPOUSE -2.2954 2.673 -0.8587 0.394-0.116 -0.1667 -0.0430
SPOUSEF -14.589 5.839 -2.498 0.016-0.322 -0.9334 -0.0594
SWORK 2.8296 1.750 1.617 0.112 0.215 0.2317 0.0242
SWORKF 6.7337 5.503 1.224 0.226 0.164 0.4128 0.0247
CONSTANT 40.795 2.475 16.48 0.000 0.913 0.0000 0.9969
In this model, the intercept applies to a single male manager. A single female manager, on average, works an amount given by the sum of the intercept and the slope on the FEMALE dummy: (40.795+6.959). For a male with a non-working spouse, expected hours are (40.795-2.2954), although the differential for this male manager having a spouse (-2.2954) is not significantly different from zero. For a male manager, the difference between having a non-working and a working spouse is 2.8296 (although this difference is not statistically significant either). For a female with a non-working spouse, expected hours are (40.795+6.9590-2.2954-14.589). For a female with a working spouse, the coefficients on SWORK and SWORKF must be added, changing the total by (+2.8296+6.7337). Thus, the non-working/working spouse differential for males and females differs by the amount of the estimated coefficient on the interaction term SWORKF. Here, the point estimate is 6.7337 and it is not statistically significantly different from zero.
These data suggest that female managers with non-working spouses lose about 17 hours (2.2954 + 14.589) relative to single female managers, whereas male managers with non-working spouses lose only about 2 hours. According to the point estimates, a male manager with a working spouse actually works more hours than a male manager with no spouse at all. A female manager with a working spouse still works more than 7 hours less per week than a female manager with no spouse at all. Possible interpretations of results like these could be quite entertaining.
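With two interaction terms, it is easy to lose track of which coefficients apply to which group. The Python sketch below simply adds up the relevant point estimates for each type of manager; the dictionary and function names are hypothetical, and the numbers are the rounded coefficients from the table above.

```python
COEF = {
    "constant": 40.795, "female": 6.9590, "spouse": -2.2954,
    "spousef": -14.589, "swork": 2.8296, "sworkf": 6.7337,
}


def expected_hours(female, spouse, swork):
    """Fitted hours for 0/1 values of FEMALE, SPOUSE and SWORK."""
    if swork and not spouse:
        raise ValueError("SWORK can only be 1 when SPOUSE is 1")
    return (COEF["constant"]
            + COEF["female"] * female
            + COEF["spouse"] * spouse
            + COEF["spousef"] * spouse * female
            + COEF["swork"] * swork
            + COEF["sworkf"] * swork * female)


# Differences relative to a manager of the same sex with no spouse.
for female in (0, 1):
    single = expected_hours(female, 0, 0)
    for swork in (0, 1):
        gap = expected_hours(female, 1, swork) - single
        print(f"female={female} working_spouse={swork} gap={gap:+.4f}")
```

Each printed gap is a sum of point estimates only; the standard errors in the table still govern which of these differences are statistically distinguishable from zero.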
Note that in retrospect, the question might be viewed as a trifle ambiguous. As it is written, the event in question concerns "having a non-working spouse." I should have specified "compared to what": (i) having no spouse at all, in which case the answer would concern the magnitude of the coefficient on SPOUSEF; or (ii.) having a working spouse, in which case the answer would concern the magnitude of the coefficient on SWORK...the way I have interpreted it here.
|_* Problem 3 -----------------------------------------------------------
|_sample 1 208
|_read(credit.dat) year month credit
UNIT 88 IS NOW ASSIGNED TO: credit.dat
3 VARIABLES AND 208 OBSERVATIONS STARTING AT OBS 1
|_stat / pcor
NAME N MEAN ST. DEV VARIANCE MINIMUM MAXIMUM
YEAR 208 85.173 5.0187 25.187 77.000 94.000
MONTH 208 6.4231 3.4744 12.071 1.0000 12.000
CREDIT 208 29671. 7565.5 0.57237E+08 14592. 54943.
CORRELATION MATRIX OF VARIABLES - 208 OBSERVATIONS
YEAR 1.0000
MONTH -0.39128E-01 1.0000
CREDIT 0.85276 0.52290E-01 1.0000
YEAR MONTH CREDIT
|_* a.)
|_genr t=time(0)
|_plot credit t
REQUIRED MEMORY IS PAR= 7 CURRENT PAR= 500
FOR MAXIMUM EFFICIENCY USE AT LEAST PAR= 10
208 OBSERVATIONS
*=CREDIT
M=MULTIPLE POINT
This crummy plot suggests that it would be a good time to do a gnuplot plot, with a nice crisp line connecting the points:
54943. | *
52709. |
50475. |
48241. |
46008. |
43774. | *
41540. |
39306. | * * * * *
37072. | * * MMM * * M
34838. | M * MM* M*MMM M ***
32605. | * MMMMM* *M**MM MMM
30371. | MMM M*
28137. | * M
25903. | M MM*
23669. | * MMM
21435. | *MMM*
19202. | *MMMM
16968. | **MM
14734. |MMMM
12500. |M
________________________________________
0.000 60.000 120.000 180.000 240.000
T
This is how gnuplot works on my stand-alone machine, as opposed to the format we covered for the PS lab.
|_plot credit t / gnu lineonly commfile=cre.gnu datafile=cre.dat &
| output=cre.ps
"Logical operators" include .eq. .ne. .gt. .lt. .ge. .le. SHAZAM evaluates the expression in the parentheses and executes the associated generate-type command if it is true. Otherwise, the variable is set equal to zero.
|_if(month.eq.1) jan=1
|_if(month.eq.2) feb=1
|_if(month.eq.3) mar=1
|_if(month.eq.4) apr=1
|_if(month.eq.5) may=1
|_if(month.eq.6) jun=1
|_if(month.eq.7) jul=1
|_if(month.eq.8) aug=1
|_if(month.eq.9) sep=1
|_if(month.eq.10) oct=1
|_if(month.eq.11) nov=1
|_if(month.eq.12) dec=1
This is a regression with 11 monthly dummies only.
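For readers who want to see what the twelve IF commands produce, here is a minimal Python sketch of the same idea: one 0/1 indicator per month, equal to 1 when MONTH matches and 0 otherwise. The function name `month_dummies` and the short list of example months are hypothetical.

```python
MONTH_NAMES = ["jan", "feb", "mar", "apr", "may", "jun",
               "jul", "aug", "sep", "oct", "nov", "dec"]


def month_dummies(months):
    """Return {name: list of 0/1 flags}, one list per calendar month."""
    return {
        name: [1 if m == k else 0 for m in months]
        for k, name in enumerate(MONTH_NAMES, start=1)
    }


months = [11, 12, 1, 2, 12]  # illustrative MONTH values, not the real data
dummies = month_dummies(months)
print(dummies["dec"], dummies["jan"])
```

Because the twelve indicators sum to one for every observation, one of them has to be left out of any regression that also includes a constant. The regression that follows leaves out January, so each monthly coefficient is a difference from the January mean.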
|_ols credit feb mar apr may jun jul aug sep oct nov dec
REQUIRED MEMORY IS PAR= 52 CURRENT PAR= 500
OLS ESTIMATION
208 OBSERVATIONS DEPENDENT VARIABLE = CREDIT
...NOTE..SAMPLE RANGE SET TO: 1, 208
R-SQUARE = 0.0304 R-SQUARE ADJUSTED = -0.0240
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.58610E+08
STANDARD ERROR OF THE ESTIMATE-SIGMA = 7655.7
SUM OF SQUARED ERRORS-SSE= 0.11488E+11
MEAN OF DEPENDENT VARIABLE = 29671.
LOG OF THE LIKELIHOOD FUNCTION = -2149.15
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 0.36036E+09 11. 0.32760E+08 0.559
ERROR 0.11488E+11 196. 0.58610E+08 P-VALUE
TOTAL 0.11848E+11 207. 0.57237E+08 0.860
Since the p-value for the F-test of "all slopes simultaneously zero" is
not small enough to reject the hypothesis, the 11 dummies, by themselves, are not particularly helpful for explaining the observed variation in retail credit balances.
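The F statistic in the table just above is the ratio of two mean squares, and it can be rebuilt from the printed sums of squares. This is a minimal Python sketch with a hypothetical function name; it computes the statistic only, not its p-value.

```python
def overall_f(regression_ss, error_ss, n_slopes, n_obs):
    """F statistic for the hypothesis that all slope coefficients are zero."""
    df_error = n_obs - n_slopes - 1  # one more degree of freedom for the constant
    return (regression_ss / n_slopes) / (error_ss / df_error)


# Sums of squares from the analysis of variance from mean, dummies-only model.
f_dummies_only = overall_f(0.36036e09, 0.11488e11, n_slopes=11, n_obs=208)
print(round(f_dummies_only, 3))
```

A value well below 1 says the dummies explain less variation per degree of freedom than is left in the residuals, which is why the hypothesis cannot be rejected.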
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.18347E+12 12. 0.15289E+11 260.865
ERROR 0.11488E+11 196. 0.58610E+08 P-VALUE
TOTAL 0.19496E+12 208. 0.93731E+09 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 196 DF P-VALUE CORR. COEFFICIENT AT MEANS
FEB -1259.7 2552. -0.4936 0.622-0.035 -0.0469 -0.0037
MAR -1611.5 2552. -0.6315 0.528-0.045 -0.0600 -0.0047
APR -1617.3 2552. -0.6338 0.527-0.045 -0.0603 -0.0047
MAY -854.19 2589. -0.3299 0.742-0.024 -0.0310 -0.0024
JUN -1976.8 2589. -0.7635 0.446-0.054 -0.0718 -0.0054
JUL -2066.0 2589. -0.7979 0.426-0.057 -0.0750 -0.0057
AUG -1857.7 2589. -0.7175 0.474-0.051 -0.0674 -0.0051
SEP -1842.2 2589. -0.7115 0.478-0.051 -0.0669 -0.0051
OCT -1574.2 2589. -0.6080 0.544-0.043 -0.0571 -0.0043
NOV -637.66 2589. -0.2463 0.806-0.018 -0.0231 -0.0018
DEC 2907.0 2589. 1.123 0.263 0.080 0.1055 0.0080
CONSTANT 30705. 1804. 17.02 0.000 0.772 0.0000 1.0349
Likewise, none of the individual monthly dummy variables is individually significant (or, more accurately, none of the coefficients on the individual monthly dummy variables is individually statistically significantly different from zero).
|_* b.)
Now we have included a linear time trend variable, t, in the model.
|_ols credit feb mar apr may jun jul aug sep oct nov dec t / predict=credthat
REQUIRED MEMORY IS PAR= 56 CURRENT PAR= 500
OLS ESTIMATION
208 OBSERVATIONS DEPENDENT VARIABLE = CREDIT
...NOTE..SAMPLE RANGE SET TO: 1, 208
R-SQUARE = 0.7599 R-SQUARE ADJUSTED = 0.7452
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.14585E+08
STANDARD ERROR OF THE ESTIMATE-SIGMA = 3819.1
SUM OF SQUARED ERRORS-SSE= 0.28441E+10
MEAN OF DEPENDENT VARIABLE = 29671.
LOG OF THE LIKELIHOOD FUNCTION = -2003.96
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 0.90038E+10 12. 0.75032E+09 51.443
ERROR 0.28441E+10 195. 0.14585E+08 P-VALUE
TOTAL 0.11848E+11 207. 0.57237E+08 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.19212E+12 13. 0.14778E+11 1013.215
ERROR 0.28441E+10 195. 0.14585E+08 P-VALUE
TOTAL 0.19496E+12 208. 0.93731E+09 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 195 DF P-VALUE CORR. COEFFICIENT AT MEANS
FEB -1367.1 1273. -1.074 0.284-0.077 -0.0509 -0.0040
MAR -1826.3 1273. -1.435 0.153-0.102 -0.0680 -0.0053
APR -1939.6 1273. -1.524 0.129-0.108 -0.0723 -0.0057
MAY -639.35 1292. -0.4950 0.621-0.035 -0.0232 -0.0018
JUN -1869.4 1292. -1.447 0.149-0.103 -0.0679 -0.0051
JUL -2066.0 1292. -1.600 0.111-0.114 -0.0750 -0.0057
AUG -1965.1 1292. -1.521 0.130-0.108 -0.0713 -0.0054
SEP -2057.1 1292. -1.593 0.113-0.113 -0.0747 -0.0057
OCT -1896.4 1292. -1.468 0.144-0.105 -0.0688 -0.0052
NOV -1067.3 1292. -0.8263 0.410-0.059 -0.0387 -0.0029
DEC 2370.0 1292. 1.835 0.068 0.130 0.0860 0.0065
T 107.42 4.413 24.34 0.000 0.867 0.8546 0.3783
CONSTANT 19641. 1008. 19.48 0.000 0.813 0.0000 0.6620
Things are quite a bit different. In particular, the F-test for the joint significance of all of the slopes (now including that on t) strongly rejects the hypothesis that "none of the explanatory variables matters". Individually, the monthly dummy variables are not statistically significant at the 5% level but the December dummy coefficient has a positive, as opposed to a negative, point estimate and is significant at the 10% level (and even at the 6.8% level).
|_* c.)
Now consider a model that is quadratic in "time."
|_genr t2=t*t
|_ols credit t t2 / coef=b
REQUIRED MEMORY IS PAR= 40 CURRENT PAR= 500
OLS ESTIMATION
208 OBSERVATIONS DEPENDENT VARIABLE = CREDIT
...NOTE..SAMPLE RANGE SET TO: 1, 208
R-SQUARE = 0.8948 R-SQUARE ADJUSTED = 0.8938
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.60807E+07
STANDARD ERROR OF THE ESTIMATE-SIGMA = 2465.9
SUM OF SQUARED ERRORS-SSE= 0.12465E+10
MEAN OF DEPENDENT VARIABLE = 29671.
LOG OF THE LIKELIHOOD FUNCTION = -1918.17
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 0.10601E+11 2. 0.53007E+10 871.723
ERROR 0.12465E+10 205. 0.60807E+07 P-VALUE
TOTAL 0.11848E+11 207. 0.57237E+08 0.000
Since t was individually significant, even in conjunction with the set of dummies, it is not surprising that t and t2 together are jointly significant.
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.19371E+12 3. 0.64571E+11 10618.964
ERROR 0.12465E+10 205. 0.60807E+07 P-VALUE
TOTAL 0.19496E+12 208. 0.93731E+09 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 205 DF P-VALUE CORR. COEFFICIENT AT MEANS
T 304.26 11.44 26.59 0.000 0.880 2.4206 1.0716
T2 -0.94081 0.5303E-01 -17.74 0.000 -0.778 -1.6151 -0.4606
CONSTANT 11541. 517.9 22.28 0.000 0.841 0.0000 0.3890
Both the linear AND the quadratic terms in t are individually strongly significant. There is strong evidence of curvature in the relationship between credit and time. Furthermore, since the coefficient on t squared is negative, we know that the slope is decreasing over time, so the quadratic shape opens downwards and there may be a maximum value of fitted credit somewhere within the sample range of time (t) values.
|_* d.)
|_gen1 tstar=-(b:1)/(2*b:2)
|_print tstar
TSTAR
161.7017
We know that the t variable ranges from 1 to 208 in the sample (there are 208 observations), so the peak of fitted credit occurs between months 161 and 162. If you used PRINT YEAR MONTH T, you could look for the year and month that correspond to t=161 and see whether this is somewhere around the 1986 change in tax law concerning the deductibility of retail credit interest payments.
Now control for seasonality as well as a curvilinear time trend.
|_ols credit t t2 feb mar apr may jun jul aug sep oct nov dec / coef=bb
REQUIRED MEMORY IS PAR= 59 CURRENT PAR= 500
OLS ESTIMATION
208 OBSERVATIONS DEPENDENT VARIABLE = CREDIT
...NOTE..SAMPLE RANGE SET TO: 1, 208
Look at that R-squared value now!
R-SQUARE = 0.9218 R-SQUARE ADJUSTED = 0.9166
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.47746E+07
STANDARD ERROR OF THE ESTIMATE-SIGMA = 2185.1
SUM OF SQUARED ERRORS-SSE= 0.92627E+09
MEAN OF DEPENDENT VARIABLE = 29671.
LOG OF THE LIKELIHOOD FUNCTION = -1887.29
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 0.10922E+11 13. 0.84013E+09 175.958
ERROR 0.92627E+09 194. 0.47746E+07 P-VALUE
TOTAL 0.11848E+11 207. 0.57237E+08 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.19403E+12 14. 0.13860E+11 2902.759
ERROR 0.92627E+09 194. 0.47746E+07 P-VALUE
TOTAL 0.19496E+12 208. 0.93731E+09 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 194 DF P-VALUE CORR. COEFFICIENT AT MEANS
T 304.61 10.16 29.99 0.000 0.907 2.4233 1.0728
T2 -0.94348 0.4708E-01 -20.04 0.000-0.821 -1.6197 -0.4619
FEB -1369.0 728.4 -1.880 0.062-0.134 -0.0510 -0.0040
MAR -1828.2 728.4 -2.510 0.013 -0.177 -0.0681 -0.0053
APR -1939.6 728.4 -2.663 0.008 -0.188 -0.0723 -0.0057
MAY -1026.2 739.3 -1.388 0.167-0.099 -0.0372 -0.0028
JUN -2261.8 739.3 -3.060 0.003 -0.215 -0.0821 -0.0062
JUL -2462.2 739.3 -3.331 0.001 -0.233 -0.0894 -0.0068
AUG -2363.2 739.3 -3.197 0.002 -0.224 -0.0858 -0.0065
SEP -2455.2 739.3 -3.321 0.001 -0.232 -0.0891 -0.0068
OCT -2292.7 739.3 -3.101 0.002 -0.217 -0.0832 -0.0063
NOV -1459.8 739.3 -1.975 0.050 -0.140 -0.0530 -0.0040
DEC 1983.1 739.4 2.682 0.008 0.189 0.0720 0.0055
CONSTANT 12997. 665.4 19.53 0.000 0.814 0.0000 0.4380
Retail credit balances are not statistically different from the January level in February or in May at the 5% level, but they are statistically significantly lower in all other months (November only marginally, with a p-value of 0.050), except December, where the holiday shopping effect seems to come through loud and clear. The higher balances in May may be driven by the outlier in May of 1987 (and supported by people using credit cards to cover their income tax payments in other years as well).
|_gen1 tstar2=-(bb:1)/(2*bb:2)
|_print tstar2
TSTAR2
161.4265
The historical fitted peak month for credit balances is again around month 161-162.
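As a cross-check on the two TSTAR values, here is a small Python sketch (not part of the SHAZAM run) that recomputes the turning point from the printed, rounded coefficients. The helper names are made up for illustration. The calendar mapping rests on an assumption: that t=125 is May 1987, which is inferred from the skipped observation 125 and the May 1987 outlier mentioned later in this output, and that t advances one month per observation.

```python
# Printed (rounded) coefficients on t and t2 from the two quadratic regressions.
QUADRATIC_ONLY = {"t": 304.26, "t2": -0.94081}
WITH_DUMMIES = {"t": 304.61, "t2": -0.94348}


def peak_time(b_t, b_t2):
    """Turning point of b_t*t + b_t2*t**2: solve b_t + 2*b_t2*t = 0 for t."""
    return -b_t / (2.0 * b_t2)


def calendar_month(t, anchor_t=125, anchor_year=1987, anchor_month=5):
    """Map t to (year, month), ASSUMING t=125 is May 1987 and t is monthly."""
    months = anchor_year * 12 + (anchor_month - 1) + (t - anchor_t)
    return months // 12, months % 12 + 1


tstar = peak_time(QUADRATIC_ONLY["t"], QUADRATIC_ONLY["t2"])
tstar2 = peak_time(WITH_DUMMIES["t"], WITH_DUMMIES["t2"])

print("peak, quadratic only:      t = %.2f" % tstar)
print("peak, with month dummies:  t = %.2f" % tstar2)
print("t=161 falls in (year, month):", calendar_month(161))
print("t=162 falls in (year, month):", calendar_month(162))
```

Because the inputs are the rounded coefficients from the tables, the results agree with SHAZAM's TSTAR and TSTAR2 only to about two decimal places.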
|_genr t3=t*t*t
|_ols credit t t2 t3 feb mar apr may jun jul aug sep oct nov dec / coef=bb
REQUIRED MEMORY IS PAR= 63 CURRENT PAR= 500
OLS ESTIMATION
208 OBSERVATIONS DEPENDENT VARIABLE = CREDIT
...NOTE..SAMPLE RANGE SET TO: 1, 208
R-SQUARE = 0.9294 R-SQUARE ADJUSTED = 0.9243
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.43347E+07
STANDARD ERROR OF THE ESTIMATE-SIGMA = 2082.0
SUM OF SQUARED ERRORS-SSE= 0.83659E+09
MEAN OF DEPENDENT VARIABLE = 29671.
LOG OF THE LIKELIHOOD FUNCTION = -1876.70
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 0.11011E+11 14. 0.78653E+09 181.450
ERROR 0.83659E+09 193. 0.43347E+07 P-VALUE
TOTAL 0.11848E+11 207. 0.57237E+08 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.19412E+12 15. 0.12942E+11 2985.585
ERROR 0.83659E+09 193. 0.43347E+07 P-VALUE
TOTAL 0.19496E+12 208. 0.93731E+09 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 193 DF P-VALUE CORR. COEFFICIENT AT MEANS
T 203.02 24.34 8.341 0.000 0.515 1.6152 0.7151
T2 0.26849 0.2702 0.9937 0.322 0.071 0.4609 0.1314
T3 -0.38659E-02 0.8499E-03 -4.549 0.000-0.311 -1.3164 -0.2960
FEB -1349.1 694.0 -1.944 0.053-0.139 -0.0503 -0.0039
MAR -1788.4 694.1 -2.577 0.011-0.182 -0.0666 -0.0052
APR -1879.9 694.2 -2.708 0.007-0.191 -0.0700 -0.0055
MAY -1049.0 704.4 -1.489 0.138-0.107 -0.0381 -0.0029
JUN -2269.6 704.4 -3.222 0.001-0.226 -0.0824 -0.0063
JUL -2454.9 704.4 -3.485 0.001-0.243 -0.0891 -0.0068
AUG -2340.9 704.4 -3.323 0.001-0.233 -0.0850 -0.0064
SEP -2417.9 704.4 -3.432 0.001-0.240 -0.0878 -0.0067
OCT -2240.3 704.5 -3.180 0.002-0.223 -0.0813 -0.0062
NOV -1392.4 704.6 -1.976 0.050-0.141 -0.0505 -0.0038
DEC 2065.7 704.7 2.931 0.004 0.206 0.0750 0.0057
CONSTANT 14759. 743.0 19.86 0.000 0.819 0.0000 0.4974
Adding the cubed term makes the squared term statistically insignificant, so there is probably some collinearity between them. This could be verified with another STAT / PCOR command. With a positive coefficient on t2 and a negative coefficient on t3, the slope of the time trend rises slightly at first (until about t = 23) and then decreases at an increasing rate as time passes. A useful exercise at this point would be to use the fitted coefficients on the t terms to plot the shape of a cubic function with these parameters.
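One way to start the exercise suggested above is to tabulate the cubic trend and its slope before plotting. The Python sketch below (outside SHAZAM, with illustrative names) uses the printed, rounded coefficients on t, t2 and t3. It leaves out the intercept and the monthly dummies, which shift the level of the curve but not its shape.

```python
import math

# Printed (rounded) coefficients on t, t2, t3 from the cubic regression.
B1, B2, B3 = 203.02, 0.26849, -0.38659e-02


def trend(t):
    """Time-trend part of fitted credit (no intercept, no month dummies)."""
    return B1 * t + B2 * t ** 2 + B3 * t ** 3


def slope(t):
    """First derivative of the trend with respect to t."""
    return B1 + 2.0 * B2 * t + 3.0 * B3 * t ** 2


# The slope stops rising where the second derivative 2*B2 + 6*B3*t is zero.
inflection = -B2 / (3.0 * B3)

# The trend peaks where the slope is zero: 3*B3*t^2 + 2*B2*t + B1 = 0.
a, b, c = 3.0 * B3, 2.0 * B2, B1
disc = b * b - 4.0 * a * c
peak = (-b - math.sqrt(disc)) / (2.0 * a)  # the positive root, since a < 0

print("slope stops rising at t = %.1f" % inflection)
print("trend peaks at t = %.1f" % peak)
for t in range(1, 209, 23):
    print("t=%3d  trend=%10.1f  slope=%8.2f" % (t, trend(t), slope(t)))
```

The printed table can be pasted into any plotting tool. The turning point it reports is the peak of the cubic trend, which need not coincide with the quadratic model's peak.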
|_* e.)
|_ols credit t t2 feb mar apr may jun jul aug sep oct nov dec / coef=bb
REQUIRED MEMORY IS PAR= 61 CURRENT PAR= 500
OLS ESTIMATION
208 OBSERVATIONS DEPENDENT VARIABLE = CREDIT
...NOTE..SAMPLE RANGE SET TO: 1, 208
R-SQUARE = 0.9218 R-SQUARE ADJUSTED = 0.9166
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.47746E+07
STANDARD ERROR OF THE ESTIMATE-SIGMA = 2185.1
SUM OF SQUARED ERRORS-SSE= 0.92627E+09
MEAN OF DEPENDENT VARIABLE = 29671.
LOG OF THE LIKELIHOOD FUNCTION = -1887.29
ANALYSIS OF VARIANCE - FROM MEAN
SS DF MS F
REGRESSION 0.10922E+11 13. 0.84013E+09 175.958
ERROR 0.92627E+09 194. 0.47746E+07 P-VALUE
TOTAL 0.11848E+11 207. 0.57237E+08 0.000
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.19403E+12 14. 0.13860E+11 2902.759
ERROR 0.92627E+09 194. 0.47746E+07 P-VALUE
TOTAL 0.19496E+12 208. 0.93731E+09 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 194 DF P-VALUE CORR. COEFFICIENT AT MEANS
T 304.61 10.16 29.99 0.000 0.907 2.4233 1.0728
T2 -0.94348 0.4708E-01 -20.04 0.000-0.821 -1.6197 -0.4619
FEB -1369.0 728.4 -1.880 0.062-0.134 -0.0510 -0.0040
MAR -1828.2 728.4 -2.510 0.013-0.177 -0.0681 -0.0053
APR -1939.6 728.4 -2.663 0.008-0.188 -0.0723 -0.0057
MAY -1026.2 739.3 -1.388 0.167-0.099 -0.0372 -0.0028
JUN -2261.8 739.3 -3.060 0.003-0.215 -0.0821 -0.0062
JUL -2462.2 739.3 -3.331 0.001-0.233 -0.0894 -0.0068
AUG -2363.2 739.3 -3.197 0.002-0.224 -0.0858 -0.0065
SEP -2455.2 739.3 -3.321 0.001-0.232 -0.0891 -0.0068
OCT -2292.7 739.3 -3.101 0.002-0.217 -0.0832 -0.0063
NOV -1459.8 739.3 -1.975 0.050-0.140 -0.0530 -0.0040
DEC 1983.1 739.4 2.682 0.008 0.189 0.0720 0.0055
CONSTANT 12997. 665.4 19.53 0.000 0.814 0.0000 0.4380
Since we have other variables besides just the dummies, we cannot use the automatic F-test that is produced for every SHAZAM ols regression in the Analysis of Variance from Mean table. We need to do a special F-test that asks whether just the dummy variable coefficients could be jointly zero. You could always do this the old-fashioned way by doing both this unrestricted regression and the restricted regression corresponding to the null hypothesis being true, then check the ANOVA tables for both to find the ingredients for constructing this F-test statistic yourself.
|_test
|_test feb=0
|_test mar=0
|_test apr=0
|_test may=0
|_test jun=0
|_test jul=0
|_test aug=0
|_test sep=0
|_test oct=0
|_test nov=0
|_test dec=0
|_end
The null hypothesis is soundly rejected.
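The "old-fashioned" calculation can be sketched in a few lines of Python (outside SHAZAM, with an illustrative function name). The restricted model is the regression of credit on t and t2 alone from part c, and the unrestricted model is the regression with the eleven monthly dummies added. The SSE values are the rounded figures printed in the two ANOVA tables, so the result should match SHAZAM's reported F of about 6.098 only to a couple of decimal places.

```python
def f_statistic(sse_restricted, sse_unrestricted, n_restrictions, df_unrestricted):
    """F = [(SSE_R - SSE_U) / q] / [SSE_U / (n - k)] for q linear restrictions."""
    numerator = (sse_restricted - sse_unrestricted) / n_restrictions
    denominator = sse_unrestricted / df_unrestricted
    return numerator / denominator


SSE_R = 0.12465e10  # credit on t, t2 (205 error df)
SSE_U = 0.92627e09  # credit on t, t2 and 11 monthly dummies (194 error df)
Q = 11              # restrictions: feb = mar = ... = dec = 0
DF_U = 194          # error degrees of freedom in the unrestricted model

F = f_statistic(SSE_R, SSE_U, Q, DF_U)
wald = Q * F  # the Wald chi-square for these restrictions is q times F

print("F with %d and %d d.f. = %.3f" % (Q, DF_U, F))
print("Wald chi-square with %d d.f. = %.2f" % (Q, wald))
```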
F STATISTIC = 6.0980951 WITH 11 AND 194 D.F. P-VALUE= 0.00000
WALD CHI-SQUARE STATISTIC = 67.079046 WITH 11 D.F. P-VALUE= 0.00000
UPPER BOUND ON P-VALUE BY CHEBYCHEV INEQUALITY = 0.16399
|_* f.) just interpretation. Huge tax bills came due on April 15, 1987, for
|_* 1986 tax year. People realized retail credit interest was no longer
|_* deductible. It was more expensive, so less of it began to take place.
|_* g.)
Using regression to net out expected seasonal differences in credit
Note that we are now suppressing the intercept term and using the full set of 12 dummy variables, so each coefficient is now an "intercept" for each month, as opposed to the regular intercept being the January value and the dummy variable coefficients being the differentials between January and each other month.
|_ols credit jan feb mar apr may jun jul aug sep oct nov dec / noconstant resid=e
A key feature of this regression is that we save, for each observation, the fitted error (the amount by which that observation differs from what we would expect for a January or a February, etc.).
REQUIRED MEMORY IS PAR= 59 CURRENT PAR= 500
OLS ESTIMATION
208 OBSERVATIONS DEPENDENT VARIABLE = CREDIT
...NOTE..SAMPLE RANGE SET TO: 1, 208
R-SQUARE = 0.0304 R-SQUARE ADJUSTED = -0.0240
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.58610E+08
STANDARD ERROR OF THE ESTIMATE-SIGMA = 7655.7
SUM OF SQUARED ERRORS-SSE= 0.11488E+11
MEAN OF DEPENDENT VARIABLE = 29671.
LOG OF THE LIKELIHOOD FUNCTION = -2149.15
RAW MOMENT R-SQUARE = 0.9411
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.18347E+12 12. 0.15289E+11 260.865
ERROR 0.11488E+11 196. 0.58610E+08 P-VALUE
TOTAL 0.19496E+12 208. 0.93731E+09 0.000
All the individual dummy coefficients are now statistically significantly different from zero, since we are asking if each monthly average credit balance could be zero, NOT whether each month's balance differs from the usual January balance.
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 196 DF P-VALUE CORR. COEFFICIENT AT MEANS
JAN 30705. 1804. 17.02 0.000 0.772 1.1438 0.0896
FEB 29445. 1804. 16.32 0.000 0.759 1.0969 0.0859
MAR 29093. 1804. 16.12 0.000 0.755 1.0838 0.0849
APR 29088. 1804. 16.12 0.000 0.755 1.0836 0.0848
MAY 29851. 1857. 16.08 0.000 0.754 1.0835 0.0822
JUN 28728. 1857. 15.47 0.000 0.741 1.0428 0.0791
JUL 28639. 1857. 15.42 0.000 0.740 1.0395 0.0789
AUG 28847. 1857. 15.54 0.000 0.743 1.0471 0.0795
SEP 28863. 1857. 15.54 0.000 0.743 1.0477 0.0795
OCT 29131. 1857. 15.69 0.000 0.746 1.0574 0.0802
NOV 30067. 1857. 16.19 0.000 0.756 1.0914 0.0828
DEC 33612. 1857. 18.10 0.000 0.791 1.2201 0.0926
We then calculate the overall mean credit balance over the entire series, use that as the baseline, and then add back in the "unexpected" amount of credit associated with each observation, controlling for what usually happens in each month of the year.
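The adjustment just described can be sketched in Python (outside SHAZAM) to show what it does. With a full set of mutually exclusive monthly dummies and no constant, each OLS coefficient is simply that month's sample mean, so the saved residual is the observation minus its month's mean. The function name and the toy two-year series below are hypothetical and are not the credit data.

```python
from statistics import mean


def seasonally_adjust(values, months):
    """Overall mean plus each observation's deviation from its own month's mean."""
    by_month = {}
    for value, month in zip(values, months):
        by_month.setdefault(month, []).append(value)
    month_mean = {m: mean(vs) for m, vs in by_month.items()}  # the dummy coefficients
    overall = mean(values)                                    # plays the role of mcredit
    return [overall + (v - month_mean[m]) for v, m in zip(values, months)]


# Hypothetical toy series: two years, a steady upward drift, and a December bump.
months = [m for _ in range(2) for m in range(1, 13)]
credit = [100.0 + 2.0 * i + (30.0 if m == 12 else 0.0) for i, m in enumerate(months)]
creditsa = seasonally_adjust(credit, months)

for m, raw, adj in zip(months, credit, creditsa):
    print("month %2d  raw %6.1f  adjusted %6.1f" % (m, raw, adj))
```

In the toy output the December bump disappears from the adjusted series while the overall mean is unchanged. Note that any within-year drift is absorbed by the monthly means as well, which is a property of this simple method.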
|_stat credit / mean=mcredit
NAME N MEAN ST. DEV VARIANCE MINIMUM MAXIMUM
CREDIT 208 29671. 7565.5 0.57237E+08 14592. 54943.
|_genr creditsa=mcredit+e
Another plot that will look much nicer as a gnuplot line plot.
|_plot credit creditsa t
REQUIRED MEMORY IS PAR= 37 CURRENT PAR= 500
FOR MAXIMUM EFFICIENCY USE AT LEAST PAR= 42
208 OBSERVATIONS
*=CREDIT
+=CREDITSA
M=MULTIPLE POINT
[Character plot of CREDIT and CREDITSA against T. Vertical axis from 12500. to 54943.; horizontal axis T from 0.000 to 240.000.]
|_plot credit creditsa t / gnu lineonly commfile=cred.gnu datafile=cred.dat &
| output=cred.ps
See if the seasonal adjustment process is being unduly influenced by the extreme outlier in May of 1987.
|_* try deleting influential outlier in May 1987...
|_skipif(credit.gt.50000)
OBSERVATION 125 WILL BE SKIPPED
|_ols credit jan feb mar apr may jun jul aug sep oct nov dec / noconstant resid=e
REQUIRED MEMORY IS PAR= 63 CURRENT PAR= 500
OLS ESTIMATION
207 OBSERVATIONS DEPENDENT VARIABLE = CREDIT
...NOTE..SAMPLE RANGE SET TO: 1, 208
R-SQUARE = 0.0346 R-SQUARE ADJUSTED = -0.0199
VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.55480E+08
STANDARD ERROR OF THE ESTIMATE-SIGMA = 7448.5
SUM OF SQUARED ERRORS-SSE= 0.10819E+11
MEAN OF DEPENDENT VARIABLE = 29549.
LOG OF THE LIKELIHOOD FUNCTION = -2133.10
RAW MOMENT R-SQUARE = 0.9436
ANALYSIS OF VARIANCE - FROM ZERO
SS DF MS F
REGRESSION 0.18112E+12 12. 0.15094E+11 272.053
ERROR 0.10819E+11 195. 0.55480E+08 P-VALUE
TOTAL 0.19194E+12 207. 0.92725E+09 0.000
VARIABLE ESTIMATED STANDARD T-RATIO PARTIAL STANDARDIZED ELASTICITY
NAME COEFFICIENT ERROR 195 DF P-VALUE CORR. COEFFICIENT AT MEANS
JAN 30705. 1756. 17.49 0.000 0.781 1.1759 0.0904
FEB 29445. 1756. 16.77 0.000 0.768 1.1276 0.0867
MAR 29093. 1756. 16.57 0.000 0.765 1.1142 0.0856
APR 29088. 1756. 16.57 0.000 0.765 1.1139 0.0856
MAY 28282. 1862. 15.19 0.000 0.736 1.0265 0.0740
JUN 28728. 1807. 15.90 0.000 0.751 1.0720 0.0798
JUL 28639. 1807. 15.85 0.000 0.750 1.0687 0.0796
AUG 28847. 1807. 15.97 0.000 0.753 1.0764 0.0802
SEP 28863. 1807. 15.98 0.000 0.753 1.0770 0.0802
OCT 29131. 1807. 16.13 0.000 0.756 1.0870 0.0810
NOV 30067. 1807. 16.64 0.000 0.766 1.1220 0.0836
DEC 33612. 1807. 18.61 0.000 0.800 1.2542 0.0934
|_stat credit / mean=mcredit
NAME N MEAN ST. DEV VARIANCE MINIMUM MAXIMUM
CREDIT 207 29549. 7375.6 0.54399E+08 14592. 43833.
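Only the MAY coefficient and the overall mean move when observation 125 is skipped, which is what we should expect if each dummy coefficient is just a monthly mean. The Python sketch below (outside SHAZAM) checks this with the printed, rounded figures. The number of May observations is not stated anywhere in the output, so it is backed out here from the standard error, using the fact that with mutually exclusive dummies and no constant the standard error of a monthly coefficient is sigma divided by the square root of that month's count.

```python
# Printed (rounded) values from the full-sample no-constant regression and STAT.
SIGMA = 7655.7              # standard error of the estimate
SE_MAY = 1857.0             # standard error on the MAY coefficient
MAY_MEAN_ALL = 29851.0      # MAY coefficient, all 208 observations
OVERALL_MEAN_ALL = 29671.0  # mean of CREDIT, all 208 observations
N_ALL = 208
OUTLIER = 54943.0           # the maximum of CREDIT, i.e. skipped observation 125

# Back out the number of May observations from se = sigma / sqrt(n).
n_may = round((SIGMA / SE_MAY) ** 2)

# Remove the outlier from the May mean and from the overall mean.
may_mean_without = (n_may * MAY_MEAN_ALL - OUTLIER) / (n_may - 1)
overall_without = (N_ALL * OVERALL_MEAN_ALL - OUTLIER) / (N_ALL - 1)

print("implied number of May observations:", n_may)
print("May mean without the outlier:     %.0f" % may_mean_without)
print("overall mean without the outlier: %.0f" % overall_without)
```

Up to rounding in the inputs, these reproduce the MAY coefficient of 28282 and the mean of 29549 reported after the SKIPIF.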
|_genr creditsa=mcredit+e
|_plot credit creditsa t
REQUIRED MEMORY IS PAR= 39 CURRENT PAR= 500
FOR MAXIMUM EFFICIENCY USE AT LEAST PAR= 44
207 OBSERVATIONS
*=CREDIT
+=CREDITSA
M=MULTIPLE POINT
[Character plot of CREDIT and CREDITSA against T, outlier excluded. Vertical axis from 7500.0 to 43833.; horizontal axis T from 0.000 to 240.000.]
Or, a much cleaner plot by gnuplot:
|_plot credit creditsa t / gnu lineonly commfile=cre1.gnu datafile=cre1.dat &
| output=cre1.ps