A question about the error assumption in linear regression - Statistics board


m*****8
Posts: 27

For simple linear regression, y(i) = E(Y|X=x(i)) + e(i).
One of the assumptions concerning the errors is
E(e(i)|x(i)) = 0, so if we draw a scatterplot of e(i) versus x(i), we would
have a null scatterplot, with no patterns.
My question is why we make this assumption. Does E(e(i)|x(i)) = 0 mean that
e(i) and x(i) are uncorrelated? If so, how is that derived? Thanks!
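
One way to see the step asked about here is the law of iterated expectations. This is only a sketch, and it assumes (the post does not say so) that x(i) is treated as a random variable with finite second moments:

```latex
\begin{aligned}
E[e_i] &= E\big[\,E[e_i \mid x_i]\,\big] = 0, \\
E[x_i e_i] &= E\big[\,x_i\,E[e_i \mid x_i]\,\big] = 0, \\
\operatorname{Cov}(x_i, e_i) &= E[x_i e_i] - E[x_i]\,E[e_i] = 0 .
\end{aligned}
```

So E(e(i)|x(i)) = 0 implies zero correlation. The converse need not hold, and the assumption does not by itself imply independence: the variance of e(i), for example, may still depend on x(i).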

w********e
Posts: 944

In simple linear regression, the predictor X is treated as a constant.
For any level of X, x(i), the response variable Y(i) is a random variable
whose mean depends on x(i).

m*****8
Posts: 27

But what about e(i)? Could you explain what E(e(i)|x) implies?
Thanks!

[In reply to w********e]

s**c
Posts: 1247

I think this implies that e(i) is independent of X:
no matter what value X takes, E[e] = 0,
so there should be no pattern when plotting e vs. X.
Also, the variance should not be affected either.

[In reply to m*****8]

l*******f
Posts: 243

If there is a pattern in your residual plot, the order of the fitted model is
possibly too low, and a higher-order model or a transformation is needed.
There are other remedies as well. Choosing the appropriate one depends on your
specific situation.
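
A quick simulation can illustrate this kind of residual pattern. The sketch below uses only the Python standard library. The data-generating process (a quadratic mean function), the noise level, and the helper names `mean`, `corr`, and `line_residuals` are illustrative assumptions, not from the post.

```python
import random

random.seed(3)
n = 2000

def mean(v):
    return sum(v) / len(v)

def corr(a, b):
    """Sample correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / (saa * sbb) ** 0.5

def line_residuals(u, y):
    """Residuals from an OLS straight-line fit of y on u (with intercept)."""
    mu, my = mean(u), mean(y)
    b1 = sum((a - mu) * (b - my) for a, b in zip(u, y)) / sum((a - mu) ** 2 for a in u)
    b0 = my - b1 * mu
    return [b - b0 - b1 * a for a, b in zip(u, y)]

# Hypothetical data: the true mean of y is quadratic in x.
x = [random.uniform(0, 3) for _ in range(n)]
y = [1.0 + xi ** 2 + random.gauss(0, 0.5) for xi in x]

# A simple curvature indicator: squared distance of x from its mean.
mx = mean(x)
curve = [(xi - mx) ** 2 for xi in x]

resid_linear = line_residuals(x, y)                            # model too low in order
resid_transformed = line_residuals([xi ** 2 for xi in x], y)   # predictor transformed to x^2

corr_linear = corr(resid_linear, curve)
corr_transformed = corr(resid_transformed, curve)
print("straight line in x:  ", corr_linear)
print("straight line in x^2:", corr_transformed)
```

Under these assumptions, the residuals from the straight-line fit in x track the curvature indicator strongly, while the residuals after transforming the predictor do not.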

w********e
Posts: 944

What is e(i)? e(i) = y(i) - yhat(i). y(i) and yhat(i) are r.v.s for the level
of x in the ith trial, which is considered constant. Therefore, e(i) is an
r.v. for the level of x in the ith trial. E(e(i)) = E(y(i) - yhat(i)) = beta0
+ beta1 * x(i) - beta0 - beta1 * x(i) = 0.
The point is that x(i) is considered a constant rather than an r.v.

[In reply to m*****8]

m*****8
Posts: 27

You made a mistake: i doesn't stand for the ith trial. It stands for the ith
observation or ith case within one sample.

[In reply to w********e]

f*******r
Posts: 257

We need to distinguish between the error term and the regression residual.
The regression residual is, by construction, uncorrelated with the X's. The
error term, on the other hand, has to be assumed uncorrelated with the X's
for the estimator to be unbiased. These are two different concepts.
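
A small simulation may make the distinction concrete. This is a sketch using only the Python standard library. The data-generating process, the coefficients, and the helper names `mean` and `cov` are illustrative assumptions, not from the post.

```python
import random

random.seed(0)
n = 5000

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance (divisor n) between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

# Hypothetical setup: the true error e is built to be correlated with x,
# so the assumption on the error term is violated on purpose.
x = [random.gauss(0, 1) for _ in range(n)]
e = [0.5 * xi + random.gauss(0, 1) for xi in x]
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]  # true slope is 2

# OLS fit of y on x with an intercept.
b1 = cov(x, y) / cov(x, x)
b0 = mean(y) - b1 * mean(x)
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

print("cov(x, true error):", cov(x, e))      # clearly non-zero by construction
print("cov(x, residual):  ", cov(x, resid))  # zero up to floating-point rounding
print("estimated slope:   ", b1)             # pulled away from the true slope 2
```

The residuals come out uncorrelated with x even though the errors are not, and the slope estimate is biased. That is why a clean residual plot cannot by itself verify the assumption on the error term.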

w********e
Posts: 944

Why not grab a book on linear regression? Read the first chapter on simple
linear regression carefully. You may have a clearer understanding of the basic
concepts by then.

c*****w
Posts: 50

I think E(e|x)=0 does not necessarily mean e is uncorrelated with x. For OLS,
the assumption is that e|x is normal, i.e. N(0, sigma^2 I), which implies
that e and x are independent.

[In reply to m*****8]


h***i
Posts: 3844

Least squares makes no distributional assumption.

[In reply to c*****w]

c*****w
Posts: 50

Then why do you want to minimize (Y-X*beta)'(Y-X*beta)?

[In reply to h***i]

s**c
Posts: 1247

To get an estimate of beta.

[In reply to c*****w]

c*****w
Posts: 50

It is the "normal distribution" assumption that justifies obtaining the MLE
of E(Y|X) by minimizing e'e. If the error term has a distribution other than
Gaussian, you will end up minimizing another metric f(e) that is not L2.
I am talking about OLS in the statistical sense. Of course, you may have
your own way of deriving OLS.

[In reply to s**c]
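
The connection being argued here can be written out explicitly. This is a sketch assuming i.i.d. errors. The Laplace case and the scale symbol b are illustrative additions, not from the thread.

```latex
% Gaussian errors, e_i ~ N(0, \sigma^2):
\ell(\beta, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2)
  - \frac{1}{2\sigma^2}\,(Y - X\beta)'(Y - X\beta)
% For any fixed \sigma^2, maximizing over \beta is the same as
% minimizing (Y - X\beta)'(Y - X\beta).

% Laplace errors with scale b:
\ell(\beta, b) = -n\log(2b) - \frac{1}{b}\sum_{i=1}^{n} \lvert y_i - x_i'\beta \rvert
% Maximizing over \beta is the same as minimizing the sum of
% absolute errors, an L1 criterion rather than L2.
```

Minimizing (Y-X*beta)'(Y-X*beta) remains a well-defined estimator without the Gaussian assumption. The assumption only determines whether that estimator coincides with the MLE.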

h***i
Posts: 3844

The OLS estimator is not necessarily the MLE. It is a method-of-moments
estimator, not a likelihood-based method.
It uses no distributional assumption.
Of course, under the normal assumption it is indeed the MLE.
If you still have doubts, look up Applied Linear Regression, 3rd edition, by
Sanford Weisberg,
Chapter 2, Section 2.4, page 27, third paragraph.

[In reply to c*****w]

c*****w
Posts: 50

Yes, you can also perfectly derive OLS even from its name.

[In reply to h***i]

h***i
Posts: 3844

Never mind, I won't argue with you. Haha.
I'm not the one who derived OLS.
"You can also perfectly derive OLS even from its name" is not something to
say carelessly.

[In reply to c*****w]

c*****w
Posts: 50

I just think OLS might be used, perhaps, too often, and it is probably
unsafe to apply OLS to data that are actually not normal.

[In reply to h***i]

f*******r
Posts: 257

The OLS estimator is consistent given E(Xe)=0. Nothing is assumed about the
distribution of the error term. It is true that the MLE with a normal
error term turns out to be the same as the OLS estimator, but that does not
mean the OLS estimator relies on a normal error term. If you have a normal
error term, all the better: small-sample inference is valid. If not, and you
have a relatively large sample, then you are still fine: asymptotic
inference is still valid. The key assumption is E(Xe)=0; bas
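
The consistency claim can be checked informally by simulation. The sketch below uses only the Python standard library and a deliberately skewed, non-normal error. The error distribution, the coefficients, and the helper names `ols_slope` and `simulate` are illustrative assumptions, not from the post.

```python
import random

random.seed(1)

def ols_slope(x, y):
    """OLS slope from a simple regression of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(n):
    """Draw one sample of size n and return the OLS slope estimate."""
    x = [random.uniform(0, 10) for _ in range(n)]
    # Skewed error: an exponential shifted to mean zero, drawn
    # independently of x, so that E(x e) = 0 holds.
    e = [random.expovariate(1.0) - 1.0 for _ in range(n)]
    y = [3.0 + 0.5 * a + b for a, b in zip(x, e)]  # true slope is 0.5
    return ols_slope(x, y)

for n in (50, 500, 50000):
    print(n, simulate(n))

slope_large = simulate(50000)
```

With a large sample the estimate should sit close to the true slope of 0.5 even though the errors are far from normal. Small samples can of course scatter more widely.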

h******a
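Posts: 198

In general, linear regression treats X as observed data rather than a random
variable, while e is a random variable with E(e)=0 and var(e)=sigma^2. X and
e are then naturally independent.
Why set things up this way? I think the main purposes of regression are
1. to find patterns in the data,
2. to make predictions.
If e, the random error, depended on the observed X, both purposes would be
hard to achieve fully. For example, if e depended on X, I would also have to
take the size of X into account when predicting, which complicates the problem.
Also, OLS does not depend on the distribution. The basic assumptions of
linear regression are
1. e is a random variable with E(e)=0 and var(e)=sigma^2,
2. corr(e(i), e(j))=0.
Additionally assuming that e is normal serves to make e(i) and e(j)
independent. It has nothing to do with OLS.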

s********s
Posts: 8

This is the weakest assumption for linear regression, which
basically says the estimator should be an unbiased estimator
of E(Y|X=x(i)), the conditional mean.
As for your second question: no, this assumption doesn't say
there is no relationship between e(i) and x(i). We can have a variance
matrix for the e(i)'s that is related to the x(i)'s.
Hope the above helps. This is not necessarily the only or the correct answer.

[In reply to m*****8]
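
The point about the variance can be illustrated with a simulated error whose conditional mean is zero but whose spread grows with x. This is a sketch using only the Python standard library. The specific distributions and the helper name `corr` are illustrative assumptions, not from the post.

```python
import random

random.seed(2)
n = 20000

def corr(a, b):
    """Sample correlation between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / (saa * sbb) ** 0.5

x = [random.uniform(1, 5) for _ in range(n)]
# Error with conditional mean 0 given x, but standard deviation equal to x.
e = [random.gauss(0, xi) for xi in x]

corr_e_x = corr(e, x)                     # near zero: e is uncorrelated with x
corr_e2_x = corr([v * v for v in e], x)   # positive: the spread of e depends on x
print("corr(e, x):  ", corr_e_x)
print("corr(e^2, x):", corr_e2_x)
```

Here E(e|x) = 0 holds and e is uncorrelated with x, yet e is not independent of x, because the squared error rises with x.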
