any regression model with high prediction accuracy? - Statistics board - 未名存档

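This page is an excerpt and archive of the corresponding 未名空间 (MITBBS) thread. Posts less than a week old are shown up to 50 characters; posts older than a week are shown up to 500 characters.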

s******e posts: 841	1 I have a dataset with 1 response variable and 20 predictor variables (continuous and categorical). The sample size is around 3000. The result of multiple regression methods is poor (R^2 less than 0.2). I have tried the regression tree method, but I cannot even form a tree with the dataset (I mean the number of terminal nodes is only one). Is there any other method that I can try to get a good fit? Maybe I can try transforming some of the predictors, but how can I find the b
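s*****n posts: 2174	2 Some of your basic concepts are a bit mixed up. A good (fit) model and prediction accuracy are not directly related. A model is only used to fit the observed data, and is then judged against some criterion (for example, least squares). Prediction, on the other hand, depends heavily on the assumptions you make when you predict and on the nature of your data. The noise term in your data is large to begin with, so you may not be able to predict accurately no matter what you do, and searching for yet another model will not help. A very small R^2 does not necessarily mean that the model is bad, or that a better model exists.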
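The point about noise can be made concrete with a small simulation. The sketch below is not from the thread: the data-generating process, the noise level `noise_sd`, and the helper `r_squared` are all assumptions chosen for illustration. It scores the true regression function itself and shows that its R^2 stays low when the noise variance dominates.

```python
import random

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SSE/SST."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 3000                 # same order as the sample size mentioned in the thread
noise_sd = 2.5           # assumed noise level, chosen only for illustration

# Simulated data: the true relation is E[y | x] = x, plus heavy noise.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [xi + rng.gauss(0.0, noise_sd) for xi in x]

# Score the true regression function, which has the lowest expected
# squared error of any predictor based on x.
r2_true_model = r_squared(y, x)

# Population value: Var(signal) / Var(y) = 1 / (1 + noise_sd^2).
ceiling = 1.0 / (1.0 + noise_sd ** 2)

print(f"R^2 of the true model: {r2_true_model:.3f}")
print(f"population R^2:        {ceiling:.3f}")
```

Under these assumptions, no choice of model can be expected to push out-of-sample R^2 much above the population value, which is the situation the reply describes.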
s******e posts: 841	3 Thank you for replying. I am not a statistics major; this is an engineering problem. First of all I want to reduce the prediction error as much as possible. That is why I wanted to try the regression tree method, but it failed. My question is: can I find a method that gives me a small prediction error? It does not matter if the result is hard to interpret.
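s*****n posts: 2174	4 I don't think there is any universal procedure that will lower your prediction error. The only thing you can do is try different variable selections and different transformations. If your response is approximately normal, try to transform all of your predictors toward normality. If the response is very skewed, first transform the response so that it is approximately normal, or at least fairly symmetric. One more point: I don't know how you are selecting models and evaluating prediction. If you are not using cross-validation, you should use it as your criterion. Using AIC as the criterion works as well; in theory, AIC tries to minimize prediction error. Looking only at things like R^2 to find a predictive model will definitely not work.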
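The cross-validation advice can be sketched in a few lines. This is a minimal illustration on simulated data, not the poster's procedure: the helper names `fit_mean`, `fit_line` and `cv_mse`, the choice of 5 folds, and the data-generating process are all assumptions. It compares two candidate models by their estimated out-of-sample squared error rather than by in-sample R^2.

```python
import random

def fit_mean(xs, ys):
    """Baseline model: always predict the training mean of y."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x, using the closed form."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x

def cv_mse(fit, xs, ys, k=5, seed=0):
    """Mean squared prediction error estimated by k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # k disjoint folds covering all rows
    total = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the training rows only, then score on the held-out rows.
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - model(xs[i])) ** 2 for i in held_out)
    return total / len(xs)

# Simulated data (an assumption): linear signal plus unit-variance noise.
rng = random.Random(1)
xs = [rng.uniform(0.0, 10.0) for _ in range(300)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

mse_mean = cv_mse(fit_mean, xs, ys)
mse_line = cv_mse(fit_line, xs, ys)
print(f"CV MSE, mean-only model: {mse_mean:.2f}")
print(f"CV MSE, linear model:    {mse_line:.2f}")
```

The same `cv_mse` function could score any other candidate (a transformed response, a different variable subset) as long as it is passed a `fit` function with the same shape, so every candidate is judged on rows it did not see during fitting.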
s******e posts: 841	5 It's too broad. Can you specify one?
s*********e posts: 1051	6 With neural networks, you can get a perfect fit, probably an over-fit. ^_^
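The gap between a perfect fit and good prediction can be shown without a neural network. The sketch below swaps in a 1-nearest-neighbour regressor as a stand-in for any model flexible enough to memorise the training set; that substitution, the helper names, and the simulated data are assumptions made for illustration.

```python
import random

def mse(model, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_1nn(train_x, train_y):
    """1-nearest-neighbour regressor: it memorises the training set."""
    pairs = list(zip(train_x, train_y))
    def predict(x):
        # Return the response of the closest training point.
        return min(pairs, key=lambda p: abs(p[0] - x))[1]
    return predict

rng = random.Random(2)

def sample(n):
    """Simulated data (an assumption): weak linear signal, heavy noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [0.5 * x + rng.gauss(0.0, 3.0) for x in xs]
    return xs, ys

train_x, train_y = sample(200)
test_x, test_y = sample(200)

model = fit_1nn(train_x, train_y)
train_mse = mse(model, train_x, train_y)   # each training point is its own neighbour
test_mse = mse(model, test_x, test_y)      # fresh data exposes the memorised noise

print(f"train MSE: {train_mse:.2f}")
print(f"test MSE:  {test_mse:.2f}")
```

The training error is zero because every training point is its own nearest neighbour, while the error on fresh data is far from zero. Judging a flexible model on held-out data is what separates the two cases.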
