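w**s Posts: 26 | 1 If a regression on a variable's continuous observations gives a weak result, does using
the variable's rank observations make the result weaker or stronger? For example, x has 100 obs; rank them into ten
groups and use the ranks 0, 1, ..., 9 as the independent variable. Does the significance of the
coefficient increase or decrease? | s*******n Posts: 901 | | c*****n Posts: 46 | 3 Is there any practical point in doing this?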
| I*****a Posts: 5425 | 4 This question is potentially useful, but I don't know the answer.
A much simplified, somewhat similar problem is the following.
We have a single continuous predictor x from a uniform distribution and a
response variable with EY = b0 + b1*x. Instead of ranking x into bins, we
round x to w, as often happens in practice when exact measurement is
difficult.
a) we estimate b1-hat from EY = b0 + b1*x
b) we estimate d1-hat from EY = d0 + d1*w
In this case, I think the predictor effect is less significant in b) than in
a), especially if you round too much (few bins). Write x = w + gamma, where
gamma is the rounding error.
The t stat for b1-hat is proportional to Cov(x, Y) / sqrt(S(x) * mse_a),
where S() is the sample variance and mse_a is the estimated MSE from model a).
The t stat for d1-hat has the same form with w and mse_b. If gamma is roughly
uncorrelated with w (reasonable for uniform x), then d1 is about b1 and the
two t stats reduce to roughly b1*sqrt(S(x)/mse_a) and b1*sqrt(S(w)/mse_b).
Rounding moves b1*gamma into the error term, so mse_b is larger than mse_a,
while S(w) is roughly no larger than S(x). So the slope effect is more
significant in the original problem than after rounding.
Again, I have simplified the problem a lot here: simple linear regression,
uniform x (not too weird a distribution of x), and rounding instead of
ranking. I am not sure how relevant this is to the original question.
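
The comparison above can be checked with a small simulation. This is only a sketch under assumptions that are not in the thread: a truly linear model, x uniform on [0, 10), normal errors, and illustrative values for b1, sigma, and the bin width. The function names (slope_t_stat, decile_rank, simulate) are made up for this example. It averages the slope t statistic over many simulated samples for three predictors: the continuous x, x rounded to bin midpoints, and the ten-group rank 0..9 from the original question.

```python
import numpy as np


def slope_t_stat(x, y):
    """t statistic of the slope in a simple linear regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    sxx = np.sum(xc ** 2)
    b1 = np.sum(xc * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x.mean()
    resid = y - (b0 + b1 * x)
    mse = np.sum(resid ** 2) / (x.size - 2)  # residual variance estimate
    return b1 / np.sqrt(mse / sxx)


def decile_rank(x, n_groups=10):
    """Put observations into n_groups equal-sized groups coded 0..n_groups-1."""
    x = np.asarray(x)
    ranks = np.argsort(np.argsort(x))  # rank 0..n-1 of each observation
    return (ranks * n_groups) // x.size


def simulate(n=100, b1=0.3, sigma=1.0, bin_width=1.0, n_rep=2000, seed=0):
    """Average slope t statistic for continuous, rounded, and ranked x."""
    rng = np.random.default_rng(seed)
    totals = {"continuous": 0.0, "rounded": 0.0, "decile": 0.0}
    for _ in range(n_rep):
        x = rng.uniform(0.0, 10.0, size=n)
        y = 1.0 + b1 * x + rng.normal(0.0, sigma, size=n)
        # Round x to the midpoint of its bin (assumed rounding scheme).
        w = (np.floor(x / bin_width) + 0.5) * bin_width
        totals["continuous"] += slope_t_stat(x, y)
        totals["rounded"] += slope_t_stat(w, y)
        totals["decile"] += slope_t_stat(decile_rank(x), y)
    return {name: total / n_rep for name, total in totals.items()}


if __name__ == "__main__":
    # Fine rounding (10 bins) versus coarse rounding (2 bins).
    for width in (1.0, 5.0):
        print(width, simulate(bin_width=width))
```

Under these assumptions the loss from ten bins should be small and the loss from two bins much larger, in line with the "round too much" remark. The sketch says nothing about cases where the true relationship is monotone but nonlinear, or where x has extreme values; ranking may behave differently there.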