Friday 25 April 2014

Variance eaten up by extreme means (truncation)

If a scale is continuous and unbounded, the mean and the variance are free to vary independently of each other. If the scale is bounded, however, this is no longer the case: near the bounds there will be substantially less variation than near the center of the scale, because observations pile up at the bound (censoring).
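To make the effect concrete, here is a small simulation sketch. Everything in it is an assumption for illustration only: a 0 to 10 scale, a normal latent distribution with a fixed standard deviation of 2, and the helper name `censored_sample`. The latent variance stays at 4 throughout; only the mean moves towards the upper bound.

```python
import random
import statistics


def censored_sample(mean, sd=2.0, lower=0.0, upper=10.0, n=20000, seed=1):
    """Draw latent normal scores and clip them to the bounds of the scale."""
    rng = random.Random(seed)
    return [min(max(rng.gauss(mean, sd), lower), upper) for _ in range(n)]


# Observed variance for several latent means; the latent variance is always sd**2 = 4.
results = {}
for mean in (5.0, 8.0, 9.0, 10.0):
    results[mean] = statistics.pvariance(censored_sample(mean))

for mean, var in results.items():
    print(f"latent mean {mean:4.1f}  observed variance {var:.2f}")
```

The observed variance drops as the mean approaches the bound, although nothing about the spread of the latent scores has changed.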

The issue arose when we wanted to see whether job quality in Europe was diverging or converging. If the scale were unbounded, the evolution of the variance would tell us this, but if the scale is bounded and the average moves towards one of the bounds, the shrinking variance will wrongly be read as convergence.

As a solution, I would compute some kind of one-tail variance on the longer tail when the other tail is strongly censored, which assumes that the latent distribution is symmetric. I have not seen such a measure yet, but it is easy to calculate.
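One possible way to calculate it, as a minimal sketch rather than an established measure: take the median as the center (my assumption, since the median is not affected as long as fewer than half of the observations sit at the bound) and average the squared deviations on the uncensored side only. Under a symmetric latent distribution this estimates the latent variance. The function name `one_tail_variance` and the simulated data are hypothetical.

```python
import random
import statistics


def one_tail_variance(values, tail="lower"):
    """Mean squared deviation from the median, using one side of the data only.

    Assumes a symmetric latent distribution and less than 50% censoring,
    so that the observed median equals the latent center.
    """
    center = statistics.median(values)
    if tail == "lower":
        deviations = [center - v for v in values if v < center]
    elif tail == "upper":
        deviations = [v - center for v in values if v > center]
    else:
        raise ValueError("tail must be 'lower' or 'upper'")
    if not deviations:
        raise ValueError("no observations in the chosen tail")
    return sum(d * d for d in deviations) / len(deviations)


# Illustration: latent normal scores (mean 8, sd 2) censored at an upper bound of 10.
rng = random.Random(42)
observed = [min(rng.gauss(8.0, 2.0), 10.0) for _ in range(20000)]

naive = statistics.pvariance(observed)                 # pulled down by censoring
one_tail = one_tail_variance(observed, tail="lower")   # uses the long tail only

print(f"ordinary variance: {naive:.2f}")
print(f"one-tail variance: {one_tail:.2f}  (latent variance is 4)")
```

On a coarse scale with many ties at the median, the choice of center and the treatment of tied values would need more care than this sketch gives them.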

In a way, such a measure should be derivable from a truncated regression model. It would be nice to work that out.
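As a rough idea of what that could look like, here is a sketch that swaps in a closely related technique: an intercept-only censored-normal (Tobit-style) likelihood, not a truncated regression proper. It assumes normal latent scores and censoring at a known upper bound only. The function name `fit_censored_normal` and the crude coarse-to-fine grid search are illustrative choices, not a recommended estimator.

```python
import math
import random
from statistics import NormalDist


def fit_censored_normal(values, upper, rounds=8, steps=20):
    """Estimate the latent mean and sd of normal data censored at `upper`."""
    uncensored = [v for v in values if v < upper]
    if not uncensored:
        raise ValueError("all observations are censored")
    n_unc = len(uncensored)
    n_cens = len(values) - n_unc
    s1 = sum(uncensored)
    s2 = sum(v * v for v in uncensored)

    def loglik(mu, sigma):
        # Density part for uncensored points (additive constant dropped).
        ss = s2 - 2.0 * mu * s1 + n_unc * mu * mu
        ll = -n_unc * math.log(sigma) - ss / (2.0 * sigma * sigma)
        # Tail-probability part for points sitting at the bound.
        if n_cens:
            tail = 1.0 - NormalDist(mu, sigma).cdf(upper)
            ll += n_cens * math.log(max(tail, 1e-300))
        return ll

    span = upper - min(values)
    lo_mu, hi_mu = min(values), upper + span
    lo_sd, hi_sd = 1e-3 * span, 2.0 * span
    for _ in range(rounds):
        best = None
        for i in range(steps + 1):
            mu = lo_mu + (hi_mu - lo_mu) * i / steps
            for j in range(steps + 1):
                sd = lo_sd + (hi_sd - lo_sd) * j / steps
                ll = loglik(mu, sd)
                if best is None or ll > best[0]:
                    best = (ll, mu, sd)
        _, mu, sd = best
        # Zoom in: keep two grid steps on either side of the best point.
        half_mu = 2.0 * (hi_mu - lo_mu) / steps
        half_sd = 2.0 * (hi_sd - lo_sd) / steps
        lo_mu, hi_mu = mu - half_mu, mu + half_mu
        lo_sd, hi_sd = max(sd - half_sd, 1e-9), sd + half_sd
    return mu, sd


# Illustration: latent mean 9 and sd 2, censored at 10 (roughly 30% at the bound).
rng = random.Random(7)
sample = [min(rng.gauss(9.0, 2.0), 10.0) for _ in range(20000)]
mu_hat, sigma_hat = fit_censored_normal(sample, upper=10.0)
print(f"estimated latent mean {mu_hat:.2f}, latent sd {sigma_hat:.2f}")
```

Unlike the one-tail variance, this replaces the symmetry assumption with the stronger assumption of normality, but it uses all observations, including the share that sits at the bound.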