Monday, 6 February 2017

Making sense of measuring effect size (eta-squared)

Introduction: it's impossible

We like to know whether effects are significant, or more significant than others, but we generally avoid the question of whether effects are substantial. On rare occasions, when an OLS regression suffices, a statistician may want to report R-squared values. Do not expect much enthusiasm, however, since R-squared can only increase as variables are added, even when those variables make no sense. Adjusted R-squared and the AIC and BIC measures correct for this. Also, R-squared has to be computed differently depending on whether or not there is a constant in the model (see Hayashi, 2000, p. 20). Worse, if your model is not OLS, you will need to define some pseudo-R-squared.

Needless to say, if you want an R-squared value for some subset of the explanatory factors, statisticians will hate you. That is not possible because of collinearity, they will say. You can add or delete variables stepwise and check the change in R-squared, but that change will include or drop variance that is shared with other variables. Sadly!

Objection: it may be possible

The people at UCLA, in fact Philip B. Ender, thought differently. They have a Stata package called -regeffectsize- up for grabs, which partitions the explained variance over the separate explanatory factors, giving a good indication of how substantial effects are. The help-file is not very instructive, so I deciphered the .ado myself, and here's the report. I'm sorry this is not (yet) in LaTeX, but speed over quality for now.

The structure of the program is as follows (a minimal Stata sketch follows the list):
  • From the regression, they take:
    • TSS: the total sum of squares in Y, which is RSS+MSS
    • RSS: residual sum of squares
    • MSS: the model sum of squares
    • DFR: residual degrees of freedom, which is n-k (n: sample size, k: number of variables in the model)
    • Dividing RSS by DFR gives the mean squared error MSE
  • From the F-test (simply -test-), they take:
    • The F-value
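
To make this concrete, here is a minimal sketch of the same bookkeeping in Stata, run on the shipped auto dataset. The regression and the tested variable (mpg) are illustrative choices of mine, not something the .ado prescribes:

    sysuse auto, clear
    regress price mpg weight

    * quantities stored by -regress-
    scalar MSS = e(mss)        // model sum of squares
    scalar RSS = e(rss)        // residual sum of squares
    scalar TSS = MSS + RSS     // total sum of squares
    scalar DFR = e(df_r)       // residual degrees of freedom, n-k
    scalar MSE = RSS/DFR       // mean squared error
    scalar R2  = e(r2)         // R-squared, stored for later use

    * F-value for one explanatory variable, via -test-
    test mpg
    scalar F = r(F)
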
Obviously, eta-squared as we know it is ETA2 = MSS/TSS, and it is generally decomposed into a within and a between effect. Typically we compute an ETA2 for the second level of a multilevel model, to indicate how much variation exists between versus within the second-level units. This is also one formulation of the intraclass correlation coefficient (ICC).
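
As an aside, that between/within reading can be seen directly with Stata's one-way ANOVA command -loneway-, which prints the between and within sums of squares and the intraclass correlation. The variables are again only an illustration:

    sysuse auto, clear
    * between/within decomposition and ICC of price across repair-record groups
    loneway price rep78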

So how do they get to the (semi)partial eta-squared formula? There are three formulations, which are closely related and build on one another. The basis is the effect sum of squares, ESS = F*MSE.

Semipartial eta-squared

ETA2 = ESS/TSS = F*MSE/TSS

A glimpse of genius, if you ask me, to bring a test statistic (F) into the equation for the partitioning of variance. You might think that if F is really the key ingredient, our partitioning will be little more than a comparison of significance levels: the higher ESS, the more important the explanatory variable, and for that it needs to be significant. Hayashi (2000, p. 53) offers some intuition that may be helpful: the F-test is a monotone transformation of the likelihood ratio.
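
Continuing the sketch from above, the semipartial eta-squared is then a one-liner:

    * effect sum of squares and semipartial eta-squared for the tested variable
    scalar ESS  = F*MSE
    scalar spe2 = ESS/TSS
    display "semipartial eta-squared = " spe2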

Recall that MSE = RSS/DFR = RSS/(n-k), and add that F = (n-k)*(RSSr/RSSu - 1) when only one variable is tested against the null (a single restriction). Multiplying F by MSE, the factor n-k cancels, and we are left with:

ETA2 = [ RSS*(RSSr-RSSu)/RSSu ] / TSS

RSSu is the unrestricted residual sum of squares, with beta as estimated; hence RSSu = RSS, the residual sum of squares of the full model. RSSr is the restricted residual sum of squares under the null (i.e. beta = 0). Substituting, we can rewrite the equation to:

ETA2 = (RSSr-RSS)/TSS

Say there is only one variable in the model; then RSSr = TSS, and ETA2 = R2 = MSS/TSS. In general, the better the prediction, the lower RSS will be and the higher ETA2. Note that by definition RSSr > RSS, unless the variable explains nothing, in which case RSSr = RSS and ETA2 = 0. What matters is the difference when we 'exclude' a variable: RSSr rises in proportion to the importance of that variable.
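
If you want to convince yourself of the algebra, the identity can be checked numerically in the running sketch by refitting the restricted model without mpg (in auto, price, mpg and weight have no missing values, so both regressions use the same sample; in general you would have to hold the estimation sample fixed):

    * restricted model: impose beta_mpg = 0 by dropping mpg
    regress price weight
    scalar RSSr = e(rss)

    * two routes to the same effect sum of squares
    display "ESS via F*MSE:       " F*MSE
    display "ESS via RSSr - RSS:  " RSSr - RSS
    display "semipartial eta2:    " (RSSr - RSS)/TSS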

Is this a good metric? I would tend to think so. The advantage of this approach, in my view, is that the betas of the other variables are unaffected by the omission of one variable, so MSS is unchanged and RSSr will be correct. There is also no difference in DFR between the models being compared, as there would be with stepwise deletion. To me this looks useful as a way to express how much of the total variance is explained by one explanatory variable, although I may be overlooking something. In that case, let me know.

Oh, I am not sure why this would be semipartial; the .ado only uses the macro `spe2', so the label is a guess, presumably because the formula parallels the squared semipartial (part) correlation, which likewise divides a variable's unique contribution by the total variance.

Percentage eta-squared

%ETA2 = 100*ETA2/R2

I think that is obvious enough. The output says 'change eta-squared', but there is no change involved.
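
In the running sketch (R2 was stored from the full model before it was overwritten by the restricted regression):

    * percentage eta-squared: the share of R-squared due to the tested variable
    scalar pcte2 = 100*spe2/R2
    display "percentage eta-squared = " pcte2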

Partial eta-squared

Here we have a small change: the variance explained by the other variables is no longer included in the denominator. The partial eta-squared is therefore always at least as large as the semipartial one; the gap widens as the other variables account for more of MSS, and the two coincide when ESS equals MSS. It could be a way to avoid the issue of shared variance that sits in MSS but not in ESS; I'm not sure.

Part.ETA2 = ESS / (ESS+RSS)
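
And the last formulation, again continuing the sketch:

    * partial eta-squared: effect variance over effect-plus-unexplained variance
    scalar pe2 = ESS/(ESS + RSS)
    display "partial eta-squared = " pe2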

References

Hayashi, F. (2000). Econometrics. Princeton, NJ: Princeton University Press.