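c****s Posts: 63 | 1 My question is:
I have data with cost as the outcome and want to fit it with a gamma distribution.
The difficulty is that there are 1000 predictors, so I also need to select predictors at the same time.
My thoughts:
1. A stepwise method would fit the gamma model and select variables at the same time. This can be
done in R, but my boss says stepwise is not good and wants me to use lasso to select variables.
2. If I first select variables with lasso and then fit the model, that does not seem quite right either,
because what lasso fits is least angle regression, which is not based on the gamma distribution,
so it should not be applicable to cost data.
How do people handle this kind of problem? Which of SAS, Stata, or R can solve it?
Any good suggestions are welcome. Thanks in advance! |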
c****s Posts: 63 | 2 Was my question unclear? I have revised it a bit and would still appreciate any help! |
s*********e Posts: 1051 | 3 I think it is OK to use lasso, and here is why.
The gamma distribution has two parameters, scale and shape. When the shape
parameter is large, the gamma converges to a Gaussian, so if you are working
with a large sample, you should be fine. |
i*******n Posts: 227 | 4 I am confused by your question.
LAR is only one algorithm for solving the lasso, and LAR has nothing to do with
statistical assumptions. Why do you want to use lasso but worry about LAR?
I guess what you want is a gamma model fitted under an L1-norm constraint, am I
right? |
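For reference, "a gamma model with an L1-norm constraint" can be written as a penalized likelihood problem. The sketch below assumes a log link and a common shape parameter $\nu$, neither of which is stated in the thread; the symbols $\beta_0$, $\beta$, $\lambda$, and $\eta_i$ are notation introduced here for illustration.

```latex
% Gamma density with mean \mu_i and shape \nu (assumed parameterization):
%   f(y_i) = \frac{(\nu/\mu_i)^{\nu}}{\Gamma(\nu)} \, y_i^{\nu-1} e^{-\nu y_i/\mu_i}
% With a log link, \mu_i = \exp(\eta_i), \eta_i = \beta_0 + x_i^{\top}\beta,
% the negative log-likelihood is, up to terms that do not involve \beta_0 or \beta,
\[
  -\ell(\beta_0, \beta) \;=\; \nu \sum_{i=1}^{n} \left( y_i e^{-\eta_i} + \eta_i \right) + \text{const}.
\]
% L1-penalized estimate (intercept left unpenalized; \nu absorbed into \lambda):
\[
  (\hat\beta_0, \hat\beta) \;=\; \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{n} \sum_{i=1}^{n} \left( y_i e^{-\eta_i} + \eta_i \right)
  \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert .
\]
```

The penalty is the same one lasso uses; only the loss changes from squared error to the gamma likelihood, so the choice of solver (LAR or anything else) is a separate matter.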
c****s Posts: 63 | 5 Thanks for your reply!
Yes, you are right.
So, according to what you said, can I use lasso in 'proc glmselect' in SAS to
find the best variables and then put those variables into 'proc genmod'?
Or do you have any other suggestions? Thanks! |
o***o Posts: 43 | 6 You should write out the model's penalized likelihood and see whether you can optimize it yourself.
There is a paper that may be useful: L1-regularization path algorithm for generalized
linear models.
http://www-stat.stanford.edu/~hastie/Papers/JRSSB.69.4%20%282007%29%20659-677%20Park.pdf |
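As a rough illustration of the "write out the penalized likelihood and optimize it" suggestion, here is a minimal sketch in Python with NumPy. It is not the path algorithm from the linked paper; it swaps in plain proximal gradient descent (soft-thresholding) with backtracking. It assumes a log link, an unpenalized intercept, and roughly standardized predictors. The function names `soft_threshold`, `gamma_loss`, and `fit_l1_gamma`, and the simulated data, are hypothetical and exist only for this example.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def gamma_loss(X, y, b0, beta):
    """Mean gamma negative log-likelihood with log link, up to constants."""
    eta = b0 + X @ beta
    with np.errstate(over="ignore"):  # an overflowing trial step just gets rejected
        return np.mean(y * np.exp(-eta) + eta)

def fit_l1_gamma(X, y, lam, n_iter=2000, tol=1e-8):
    """Minimize gamma_loss + lam * ||beta||_1 by proximal gradient descent."""
    n, p = X.shape
    b0 = np.log(y.mean())          # intercept-only fit as the starting point
    beta = np.zeros(p)
    step = 1.0
    for _ in range(n_iter):
        eta = b0 + X @ beta
        r = 1.0 - y * np.exp(-eta)  # derivative of the loss w.r.t. eta
        g0 = r.mean()               # gradient for the intercept
        g = X.T @ r / n             # gradient for the coefficients
        f_old = gamma_loss(X, y, b0, beta)
        while True:                 # backtracking: shrink step until the bound holds
            b0_new = b0 - step * g0
            beta_new = soft_threshold(beta - step * g, step * lam)
            d0, d = b0_new - b0, beta_new - beta
            bound = f_old + g0 * d0 + g @ d + (d0 ** 2 + d @ d) / (2.0 * step)
            if gamma_loss(X, y, b0_new, beta_new) <= bound + 1e-12:
                break
            step *= 0.5
        change = abs(d0) + np.abs(d).sum()
        b0, beta = b0_new, beta_new
        if change < tol:
            break
    return b0, beta

# Simulated cost-like data (an assumption for illustration, not the poster's data).
rng = np.random.default_rng(0)
n, p = 500, 50
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [0.8, -0.6, 0.4]
mu = np.exp(1.0 + X @ beta_true)
y = rng.gamma(shape=2.0, scale=mu / 2.0)   # gamma with mean mu and shape 2

b0_hat, beta_hat = fit_l1_gamma(X, y, lam=0.05)
print("selected predictors:", np.flatnonzero(beta_hat))
```

In practice `lam` would be chosen by cross-validation, and the selected predictors could then be refit without the penalty, for example in 'proc genmod'. Whether that two-stage refit is appropriate is a separate question from the selection step itself.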