Tuesday, 2 January 2018

Betas


Are betas from different regressions comparable? How substantial is a significant effect? Is a zero effect always the right benchmark? These are hard questions, and I have no good answer. This post addresses the difference between algebra and estimation.

Suppose we measure inequality in a region as the difference between a percentile p and the median (in logs). A candidate explanation is the relative impact of the minimum wage, i.e. its distance to the median. We then have:

p - p50 = fc(mw - p50)

Suppose the function is linear, so that:

p - p50 = cons + b*(mw - p50) + e
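As an aside, this linear specification takes only a few lines to estimate. A minimal sketch on simulated data (all variable names, sample sizes, and parameter values here are my own assumptions, not the post's actual data):

```python
import numpy as np

# Hypothetical per-region data: log median wage (p50), log minimum
# wage (mw), and a log percentile wage (p10) generated from the model.
rng = np.random.default_rng(0)
n_regions = 200
p50 = rng.normal(3.0, 0.2, n_regions)            # log median wage
mw = p50 - rng.uniform(0.3, 0.8, n_regions)      # log minimum wage, below median
e = rng.normal(0.0, 0.05, n_regions)             # regression error
true_cons, true_b = -0.5, 0.4                    # assumed true parameters
p10 = p50 + true_cons + true_b * (mw - p50) + e  # generate p10 from the model

# OLS of (p - p50) on (mw - p50): intercept is cons, slope is b
y = p10 - p50
x = mw - p50
X = np.column_stack([np.ones(n_regions), x])
cons, b = np.linalg.lstsq(X, y, rcond=None)[0]
```

With enough regions, the estimated cons and b recover the assumed true values up to sampling noise.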

Now suppose the wage distribution is normal with mean M and standard deviation s, and write f^(-1) for the inverse standard normal CDF. Then (abusing notation slightly, p denotes both the percentile rank and the corresponding log wage):

p = M + s*f^(-1)(p)

Note that p50 = M in this case, since the normal distribution is symmetric.

Substituting this into the linear equation above gives:

s*f^(-1)(p) = cons + b*(mw - p50) + e

and therefore

b = [s*f^(-1)(p) - cons - e] / [mw - p50]

This may suggest the surprising conclusion that b is pinned down algebraically, with its scale determined by f^(-1), the inverse of the cumulative normal distribution. That reasoning is flawed, because b and e are determined jointly in an estimation: under OLS, b is whatever value minimizes the sum of squared errors. The betas are therefore not an algebraic given. Yet, annoyingly, my estimations give betas that are quite in line with f^(-1). This is one thing I don't understand.
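To make the point concrete, here is a toy simulation under one purely hypothetical data-generating process (my assumption, not the post's data): if the gap mw - p50 happens to be proportional to each region's dispersion s, then the OLS betas track f^(-1)(p) mechanically, even though nothing forces them to in general:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 500
M = rng.normal(3.0, 0.1, n)    # region log median wage (= mean under normality)
s = rng.uniform(0.2, 0.6, n)   # region wage dispersion
mw = M - 1.5 * s               # ASSUMPTION: mw sits 1.5 sd below the median

betas = {}
for p in (0.10, 0.25, 0.75, 0.90):
    wage_p = M + s * norm.ppf(p)          # percentile wage under normality
    y = wage_p - M                        # p - p50
    x = mw - M                            # mw - p50 = -1.5*s
    X = np.column_stack([np.ones(n), x])
    cons, b = np.linalg.lstsq(X, y, rcond=None)[0]
    betas[p] = b                          # equals norm.ppf(p) / -1.5 here
```

In this contrived setup every beta is exactly f^(-1)(p) scaled by the same constant, so the cross-percentile pattern of betas mirrors f^(-1) without b being an algebraic identity in any single regression.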