Newbie question about statistics in MATLAB - Statistics board


d*****y
Posts: 1058

[b,dev] = glmfit(...) returns dev, the deviance of the fit at the solution vector. The deviance is a generalization of the residual sum of squares. It is possible to perform an analysis of deviance to compare several models, each a subset of the other, and to test whether the model with more terms is significantly better than the model with fewer terms.
For example:
y--[x1,x2,x3,x4,x5]
y--[x1,x3,x5]
I want to know which model is better. The example I found online is
[b,dev1]=glmfit([x1,x2,x3,x4,x5],y,'normal');
[b,dev2]=glmfit([x1,x3,x5], y, 'normal');
How do I use dev1 and dev2 to decide which model is better?
I know I could compute AIC and pick the model with the lower AIC, but how do I compute AIC in MATLAB? Could the experts here confirm whether the formula below is right?
AIC=dev+2*(length(b))
Thanks
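
On the AIC formula in the question: for glmfit families whose dispersion is fixed at 1 (such as 'binomial' or 'poisson'), dev + 2*length(b) differs from AIC only by a constant that depends on the data, so it ranks models fit to the same y consistently. For 'normal', dev should be the unscaled residual sum of squares, so that shortcut does not carry over directly. Below is a minimal sketch of a Gaussian AIC, assuming the error variance is estimated by maximum likelihood. The synthetic data and the helper name aicNormal are made up for illustration and are not from the thread; glmfit requires the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic data purely for illustration (not from the thread).
rng(0);
n = 200;
X = randn(n, 5);                                   % columns play the role of x1..x5
y = 1 + 2*X(:,1) - X(:,3) + 0.5*X(:,5) + randn(n, 1);

[b1, dev1] = glmfit(X,            y, 'normal');    % y ~ x1..x5
[b2, dev2] = glmfit(X(:,[1 3 5]), y, 'normal');    % y ~ x1, x3, x5

% For the 'normal' family, dev is taken to be the residual sum of squares (RSS).
% Gaussian AIC with the variance estimated by maximum likelihood:
%   AIC = n*log(2*pi*RSS/n) + n + 2*(p + 1)
% where p = numel(b) (intercept included) and the +1 counts the variance.
aicNormal = @(dev, b) n*log(2*pi*dev/n) + n + 2*(numel(b) + 1);   % hypothetical helper

aic1 = aicNormal(dev1, b1);
aic2 = aicNormal(dev2, b2);
fprintf('AIC full = %.2f, AIC reduced = %.2f\n', aic1, aic2);
```

The model with the lower value is preferred. Only the difference aic1 - aic2 matters, so the constant terms could be dropped as long as both models are fit to the same y.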

r********n
Posts: 6979

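Isn't this just feature selection? Google it; there are plenty of methods: forward selection, backward elimination, AIC, BIC...
The example you describe is really just an F-test (ANOVA). If the model with fewer predictors fits about as well as the model with more predictors, you should pick the one with fewer predictors. Generally the model you get this way is more robust.
If you only want a quick check of which model is likely to be better, this approach works. Personally I'm not a big fan of it, though, because the resulting model is "not necessarily" better on new data.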
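
The F-test mentioned above can be sketched from the two deviances as follows. This is a rough illustration under the assumption that both models use the 'normal' family, so that dev is the residual sum of squares; the synthetic data is invented for the example, and glmfit and fcdf come from the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic data purely for illustration (not from the thread).
rng(0);
n = 200;
X = randn(n, 5);
y = 1 + 2*X(:,1) - X(:,3) + 0.5*X(:,5) + randn(n, 1);

[b1, dev1, stats1] = glmfit(X,            y, 'normal');   % larger model
[b2, dev2]         = glmfit(X(:,[1 3 5]), y, 'normal');   % nested smaller model

q   = numel(b1) - numel(b2);    % number of extra terms in the larger model
dfe = stats1.dfe;               % residual degrees of freedom of the larger model

% F statistic: drop in deviance per extra term, relative to the
% residual variance estimate of the larger model.
F = ((dev2 - dev1) / q) / (dev1 / dfe);
p = 1 - fcdf(F, q, dfe);        % upper-tail p-value

fprintf('F = %.3f on (%d, %d) df, p = %.4f\n', F, q, dfe, p);
```

A small p suggests the extra terms (x2 and x4 here) improve the fit significantly; a large p favors the smaller model. For families without a free scale parameter, the usual analogue is to compare dev2 - dev1 against a chi-square distribution with q degrees of freedom.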
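If you are willing to spend more time, I usually run cross-validation or bootstrapping first, repeat it N times, compare how well the two models fit on the test sets, and pick the one with the lower test error.
The upside is that a model chosen this way is generally more robust. The downside is that it can take tens or even hundreds of times longer.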
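
One way the cross-validation comparison described above might look in MATLAB is sketched here. It assumes 10-fold cross-validation with mean squared error on the held-out fold as the "fitness error"; the fold count, the error metric, the synthetic data, and the variable names are all assumptions made for the example rather than anything stated in the thread.

```matlab
% Synthetic data purely for illustration (not from the thread).
rng(0);
n = 200;
X = randn(n, 5);
y = 1 + 2*X(:,1) - X(:,3) + 0.5*X(:,5) + randn(n, 1);

k    = 10;                            % number of folds (assumed)
cv   = cvpartition(n, 'KFold', k);
cols = {1:5, [1 3 5]};                % predictor columns of the two candidate models
mse  = zeros(k, numel(cols));         % held-out error per fold and model

for i = 1:k
    tr = training(cv, i);             % logical index of training rows
    te = test(cv, i);                 % logical index of held-out rows
    for m = 1:numel(cols)
        b    = glmfit(X(tr, cols{m}), y(tr), 'normal');
        yhat = glmval(b, X(te, cols{m}), 'identity');
        mse(i, m) = mean((y(te) - yhat).^2);
    end
end

meanMse = mean(mse, 1);               % average test error of each model
fprintf('CV MSE full = %.4f, reduced = %.4f\n', meanMse(1), meanMse(2));
```

Repeating the whole loop with different random partitions and averaging gives the "repeat N times" variant. The cost grows with the number of folds and repeats, which matches the runtime caveat above.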
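When you have a lot of features, you can first narrow them down to a limited set with univariate models, forward selection, backward elimination, and so on, and then apply the approach above.
Just my 2 cents.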
