Friday, 12 May 2017

Simulation of panel-invariant variables: OLS versus fixed effects

Below is a simulation of the consequences of controlling for fixed effects. In some cases this is desirable, in others it is not.

In the wage equation below, the wage depends on gender (b = 2), effort (b = 10), and ability (b = 3). There is no unobserved heterogeneity. A straightforward OLS estimation will return the correct b's.

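However, because gender and ability are panel-invariant, they are perfectly collinear with the individual fixed effects, so the fixed effects regression cannot estimate their coefficients and drops them. The estimated effect of effort remains unbiased.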
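Why the panel-invariant regressors drop out can be seen from the within transformation that a fixed effects estimator applies. As a sketch, write i for the person and t for the observation within a person (this notation and the symbols for the constant and the error term are assumptions, they do not appear in the code):

```latex
% Wage equation with the coefficients used in the simulation
wage_{it} = \beta_0 + 2\,gender_i + 10\,effort_{it} + 3\,ability_i + \varepsilon_{it}

% Averaging over the observations of person i
\overline{wage}_i = \beta_0 + 2\,gender_i + 10\,\overline{effort}_i + 3\,ability_i + \bar{\varepsilon}_i

% Subtracting the person mean removes everything that is constant within i
wage_{it} - \overline{wage}_i = 10\,(effort_{it} - \overline{effort}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)
```

Gender and ability equal their own person means, so they vanish from the demeaned equation, and only the coefficient of effort is identified.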

On the other hand, suppose that ability and effort are correlated and ability is not observed. The estimated beta of effort would then be biased in OLS, but not in a fixed effects estimation. The latter, however, would not allow estimating the beta for gender.

Good riddance.



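clear
* Create empty variables; observations are added person by person in the loop
generate byte gender = .
generate float effort = .
generate float wage = .
generate float ability = .
generate long id = .

* 1000 persons with 5 observations each
forvalues i = 1/1000 {
* Draw one ability and one gender value per person (panel-invariant)
local a = runiform()
local g = runiform() > .5

* Add 5 new, still empty, observations
set obs `=`i'*5'

* Fill in the new observations only (those whose id is still missing)
replace ability = `a' if missing(id)
replace gender = `g' if missing(id)
replace id = `i' if missing(id)
}
* Effort varies within persons
replace effort = runiform()
* The error term .5*runiform() has mean .25, which shows up in the constant
replace wage = 2*gender + 10*effort + 3*ability + .5*runiform()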


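* Pooled OLS: all three regressors are observed, so all b's are recovered
regress wage gender effort ability
* Fixed effects (id dummies absorbed): gender and ability are omitted
areg wage gender effort ability, abs(id)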
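
The second scenario from the text, in which effort is correlated with an unobserved ability, is not simulated above. A minimal sketch of it could look like the following. The seed, the use of expand instead of the loop, and the 0.5 loading of ability on effort are illustrative assumptions, not part of the original simulation.

```stata
* Sketch: effort correlated with ability, ability left out of the regressions
clear
set seed 12345                      // arbitrary seed, only for reproducibility
set obs 1000                        // one row per person first
generate long id = _n
generate float ability = runiform()
generate byte gender = runiform() > .5
expand 5                            // 5 observations per person

* Assumed correlation: effort rises with ability
generate float effort = runiform() + .5*ability
generate float wage = 2*gender + 10*effort + 3*ability + .5*runiform()

* OLS without ability: effort picks up part of the ability effect
regress wage gender effort

* Fixed effects: ability is absorbed by the id dummies, gender is omitted
areg wage gender effort, absorb(id)
```

Under these assumptions the covariance of effort and ability is 0.4 times the variance of effort, so the OLS coefficient of effort should settle near 10 + 3 × 0.4 = 11.2 in large samples. The areg estimate should stay close to 10, while gender is again dropped.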


/*
. regress wage gender effort ability

      Source |       SS           df       MS      Number of obs   =     5,000
-------------+----------------------------------   F(3, 4996)      >  99999.00
       Model |  50282.7806         3  16760.9269   Prob > F        =    0.0000
    Residual |  104.339671     4,996  .020884642   R-squared       =    0.9979
-------------+----------------------------------   Adj R-squared   =    0.9979
       Total |  50387.1202     4,999  10.0794399   Root MSE        =    .14452

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |   1.999089   .0040916   488.58   0.000     1.991067     2.00711
      effort |   10.00101   .0070513  1418.33   0.000     9.987187    10.01483
     ability |   3.004343   .0071066   422.75   0.000     2.990411    3.018275
       _cons |   .2498592   .0058902    42.42   0.000     .2383118    .2614065
------------------------------------------------------------------------------


. areg wage gender effort ability, abs(id)
note: gender omitted because of collinearity
note: ability omitted because of collinearity

Linear regression, absorbing indicators         Number of obs     =      5,000
                                                F(   1,   3999)   = 1602779.17
                                                Prob > F          =     0.0000
                                                R-squared         =     0.9983
                                                Adj R-squared     =     0.9979
                                                Root MSE          =     0.1450

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gender |          0  (omitted)
      effort |   10.00054   .0078993  1266.01   0.000     9.985056    10.01603
     ability |          0  (omitted)
       _cons |   2.810143   .0044419   632.65   0.000     2.801435    2.818852
-------------+----------------------------------------------------------------
          id |      F(999, 3999) =    401.808   0.000        (1000 categories)
*/