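B******y Posts: 9065 | 1 The name of SAS's PROC GLM is seriously misleading: GLM here stands for General Linear Model, not
Generalized Linear Model!!! One word apart, but a big difference. PROC GLM is an extension of
ANOVA and the updated successor to PROC ANOVA (once PROC GLM came out, PROC ANOVA was basically abandoned), whereas
PROC GENMOD is the one that fits the familiar models for non-normal responses, which is why it needs a link function.
Reportedly SAS later came to regret the name PROC GLM, since academia is more used to GLM as the abbreviation of
Generalized Linear Model; but PROC GLM appeared before the Generalized Linear Model was widely accepted, and
SAS users had already grown used to it. So there was no way around it, and they had to create a new PROC GENMOD. |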
|
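The distinction drawn in post 1 is not specific to SAS. A minimal sketch in R rather than SAS (the simulated data and all variable names are made up for illustration): a general linear model is the normal-error, identity-link special case, so `lm()` and `glm(..., family = gaussian)` give the same coefficients, while a generalized linear model for counts has to name a link function.

```r
# Simulated data; every name here is illustrative, not taken from the thread.
set.seed(1)
n <- 100
x <- rnorm(n)
g <- factor(sample(c("a", "b", "c"), n, replace = TRUE))

# General linear model: normal errors, identity link (the PROC GLM kind of model).
y <- 1 + 2 * x + (g == "b") + rnorm(n)
fit_lm  <- lm(y ~ x + g)
fit_glm <- glm(y ~ x + g, family = gaussian(link = "identity"))

# Generalized linear model: Poisson counts with a log link
# (the kind of model PROC GENMOD is meant for).
counts   <- rpois(n, lambda = exp(0.5 + 0.3 * x))
fit_pois <- glm(counts ~ x, family = poisson(link = "log"))

print(cbind(lm = coef(fit_lm), glm_gaussian = coef(fit_glm)))
print(coef(fit_pois))
```

The first two columns agree because the Gaussian/identity GLM is ordinary least squares; only the Poisson fit actually depends on the link.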
s*********e Posts: 1051 | 2 It has nothing to do with whether it is a double GLM or not.
From a theoretical standpoint, every exponential-family GLM is just a special
case of the dispersion-model GLM and can be written in the general form
shown in Song's paper.
In other words, some models fall outside the classic GLM based on
exponential-family errors, such as the simplex model (a very good candidate
model for outcomes on the unit interval). |
|
o******6 Posts: 538 | 3 ☆─────────────────────────────────────☆
hemmingchen (天高海阔) wrote on (Wed Jan 28 09:14:52 2009):
Suppose I have a large data set collected from many different sites. I am trying
to analyze the data with PROC GLM for two effects (e.g. gender: male and
female; and blood type: O, AB, B, A). When I run PROC GLM on all sites together,
with gender and blood type combined into a new variable Gen_Blood,
"PROC GLM with Tukey lines" works perfectly.
However, when I try to use "BY sites", |
|
P****l Posts: 156 | 4 reg only needs a linear function
glm is the general linear model
The main difference: reg only takes continuous independent variables
proc glm can fit any general linear model
proc glm does not use a link function
proc genmod is the one where you specify which link function to use
Really, just read through the support pages on the SAS website and you will see |
|
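D**u Posts: 288 | 5 From thread: Statistics board - R glm. How can I make glm run with a formula I have already fixed?
For example, I have already settled on y=2.3*x1+3.1*x2+0.9*x3, but I want glm to compute the standard errors and
similar statistics. How do I specify this formula in glm?
Thanks |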
|
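One hedged way to approach the question in post 5, assuming R's `glm()` and simulated data (the data frame `d` and the vector `fixed` are made up for illustration): a coefficient that is fixed in advance is not estimated, so it has no standard error of its own. Putting the fixed linear predictor into `offset()` and refitting estimates the departure of each coefficient from its fixed value, and the standard errors of those departures can be used to check whether the fixed values are compatible with the data.

```r
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 2.3 * d$x1 + 3.1 * d$x2 + 0.9 * d$x3 + rnorm(n)

fixed <- c(x1 = 2.3, x2 = 3.1, x3 = 0.9)

# Unconstrained fit without an intercept, matching the formula in the post.
fit_free <- glm(y ~ 0 + x1 + x2 + x3, data = d)

# Same model with the fixed linear predictor as an offset:
# each coefficient now estimates (beta - fixed value).
fit_dev <- glm(y ~ 0 + x1 + x2 + x3 + offset(2.3 * x1 + 3.1 * x2 + 0.9 * x3),
               data = d)

# Estimate, standard error, t value and p-value for H0: beta equals the fixed value.
print(coef(summary(fit_dev)))
```

If every coefficient is held fixed and nothing else is estimated, there is nothing left for `glm()` to attach a standard error to; the offset fit above is a test of the fixed values, not a way to obtain standard errors for them.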
A*******s Posts: 3942 | 6 I think it is just about the naming.
Double GLM is nothing but modeling the mean and the dispersion simultaneously
within the traditional GLM framework, and this terminology is widely accepted in
actuarial modeling nowadays.
You may refer to a much earlier paper, Smyth 1988, on GLMs with varying
dispersion.
on |
|
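c*****r Posts: 156 | 7 A few basic questions about GLMs for the experts on this board:
When fitting a GLM, how do you choose among the different distributions? Do these distributions refer to the data or
to the residuals?
Thanks! |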
|
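q*********i Posts: 696 | 8 I have a data set where k is the number of trials, y is the number of successes, and pred is the only predictor.
The number of trials differs for every observation.
Is it OK to write the R code like this?
glm(y/k ~ pred, binomial(link = "logit"), weights = k)
or
glm(y/k ~ pred, binomial(link = "logit"))
I read the help for a long time and still do not understand what the weights argument means here. I tried fitting both, and the two commands give similar
parameter estimates. The deviance and AIC are both smaller for the one with weights. |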
|
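A small sketch of what `weights` does in the first call of post 8, using simulated data (the values of `k`, `pred` and `y` below are made up): with a proportion response, `weights = k` tells `glm()` how many trials stand behind each proportion, which makes the fit identical to the two-column `cbind(successes, failures)` form. The unweighted call treats every proportion as if it came from a single trial, so it is a different likelihood, and R typically warns about non-integer successes.

```r
set.seed(2)
n    <- 50
k    <- sample(5:30, n, replace = TRUE)                       # trials per observation
pred <- rnorm(n)
y    <- rbinom(n, size = k, prob = plogis(-0.5 + 0.8 * pred)) # successes

# Proportion response with the number of trials as prior weights.
fit_w <- glm(y / k ~ pred, family = binomial(link = "logit"), weights = k)

# Equivalent two-column (successes, failures) response.
fit_c <- glm(cbind(y, k - y) ~ pred, family = binomial(link = "logit"))

print(rbind(weights = coef(fit_w), cbind = coef(fit_c)))
print(c(deviance_weights = deviance(fit_w), deviance_cbind = deviance(fit_c)))
```

Because the weighted and unweighted fits are based on different likelihoods, their deviance and AIC values are not comparable with each other.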
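f**********t Posts: 1001 | 9 I am a bit confused...
1. Is proc GLM really generalized linear regression? I went through the SAS documentation for a
long time and saw no link function. It seems to just run
traditional linear regression on all the independent variables. It seems proc Genmod is the one doing
generalized linear regression.
2. What is the main difference between proc GLM and proc Reg? Is it that in proc Reg the
independent variables cannot include categorical variables, nominal
variables, or interactions?
Many thanks! |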
|
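f**********t Posts: 1001 | 10 Many thanks! I asked precisely because I felt the name Proc GLM was so misleading. Haha.
The answers helped me a lot.
It seems Proc GLM combines the functions of Proc ANOVA and Proc Reg. I have indeed rarely seen Proc ANOVA used. |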
|
a****m Posts: 693 | 11
In PROC GLM the input variables can be categorical or continuous,
but fixed effects only.
PROC REG, by contrast, only takes continuous input variables, and ANOVA is
for categorical variables.
PROC GLM is an extended form of ANOVA; it can also fit ANCOVA (analysis of
covariance), which needs at least one continuous and at least one categorical
input variable. It is a merger of ANOVA and regression on continuous
variables. |
|
v*****c Posts: 44 | 12 How do I request overall adjusted means and standard errors in Proc GLM? Is
there any shorter form than the 'estimate' statement below (I have 9
levels of analsite)?
proc glm;
class treat analsite;
model pchg = analsite treat / ss3;
lsmeans treat / pdiff cl stderr e;
estimate 'overall' intercept 1
analsite 0.11111111 0.11111111 0.11111111 0.11111111 0.11111111 0.11111111 0.11111111 0.11111111 0.11111111
treat 0.25 0.25 0.25 0.25;
run; |
|
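a****g Posts: 8131 | 13 The Kruskal-Wallis result is significant, and there are no particularly extreme values,
but the glm result is not significant, and residual normality is satisfied.
Can I just accept the nonparametric result and ignore the glm?
thanks |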
|
z**********i Posts: 88 | 14 I will appreciate any input you may give.
I am running the glm procedure, but I get one more significant test from the contrasts
than from the pairwise multiple comparisons. I don't know why. Below are the code
and output.
proc glm data=out.final;
class dx;
format dx dx4x.;
model &v =dx;
lsmeans dx;
means dx/tukey;
contrast "demented vs normal" dx 1 0 -1;
contrast "demented vs mci" dx 1 -1 0;
contrast "mci vs normal" dx 0 1 -1;
run;
1. Least Squares Means;
Least Squares Means
Q1_PARTAPP
dx ... |
|
s******5 Posts: 513 | 15 The model is
y = mu + beta + g + e
where y is a vector for n individuals,
beta is a fixed effect,
g = (g1, ..., gm) is a vector of random effects,
e is the random error.
Question 1:
We assume var(e) = R*sigma_e^2, where R is a known matrix. How do I
incorporate this R matrix in the glm code?
Question 2:
g = (g1, ..., gm) is a vector of random effects; we assume var(g) = D*sigma_g^2, where D is a
known n*m matrix.
Should I split the matrix D by columns as (d1, ..., dm) and treat each
column as an independent random effect? Like:... |
|
l******1 Posts: 292 | 16 I need to run a proc glm with 3 variables in the class statement, A B C, and 5
variables in the model, A B C D E. Variable C takes the values 0, 1, 2, 3. In my current output
C=3 is used as the reference level. What option should I add so that C=0 becomes the reference?
Here is the code:
proc glm data = all ;
class A B C;
model Y= A B C D E/solution ;
quit;
Thanks, experts |
|
c**********e Posts: 2007 | 17 Sure, we can write our own macro. But is there an easy
way to output adjusted R-square in GLM? Thanks. |
|
f*******r Posts: 257 | 18 If I understand you correctly: beta0 and beta1 are now subject to the restriction
g(\pi_0)=beta0+beta1*x0. Therefore, there is only one coefficient to be
estimated. In other words, you can solve for beta0 in terms of beta1; then
your model becomes a restricted GLM. I don't know of a way to do it in genmod;
it seems genmod does not take a restrict statement. Depending on the specific model, you may be able to do it with other procedures. For example, you can use proc logistic, if the link is a lo |
|
o****o Posts: 8077 | 19 Substitute your restriction, solved for \beta_0, into the model,
so it becomes
g(\pi)=g(\pi_0) + \beta_1 (X - x0)
=\alpha_0 + \alpha_1 *Z, a standard GLM
where Z=(X-x0), \alpha_0=g(\pi_0), \alpha_1=\beta_1
that
restriction |
|
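The substitution in post 19 can be sketched in R (not SAS), under the assumptions of a logit link and made-up values for `x0` and `pi0`. Since \alpha_0 = g(\pi_0) is known, it enters the model as an offset and only the slope on Z = X - x0 is estimated.

```r
set.seed(3)
n   <- 500
x0  <- 1      # point at which the probability is known (assumed value)
pi0 <- 0.3    # known probability at x0 (assumed value)

x <- runif(n, 0, 3)
y <- rbinom(n, 1, plogis(qlogis(pi0) + 0.8 * (x - x0)))

z   <- x - x0                  # Z = X - x0
off <- rep(qlogis(pi0), n)     # alpha_0 = g(pi_0) is known, so it is an offset

# Only the slope is estimated; the intercept is fixed through the offset.
fit <- glm(y ~ 0 + z + offset(off), family = binomial(link = "logit"))

beta1 <- coef(fit)[["z"]]
beta0 <- qlogis(pi0) - beta1 * x0   # recover beta_0 from the restriction
print(c(beta0 = beta0, beta1 = beta1))
```

By construction the fitted curve passes through (x0, pi0), which is exactly the restriction described in post 18.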
s*r Posts: 2757 | 20 glm for generalized least squares, which can handle correlated residuals |
|
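c*********t Posts: 340 | 21 Would either of the two models work?
Also, how do I refer to the p-value of each independent variable's beta?
Thanks
I have just started practicing R
and am not sure I am describing this correctly.
Specifically:
x <- glm(PHENOGROUP2 ~ Combo.new[,e[i]] + PHENOGROUP3, family="binomial",
data=Combo.new)
Because this is inside a loop, I want to write each result (beta [slope] and p) into a matrix,
but x$coeff only gives the slope; I do not know what to use to refer to the p-value.
Thanks a lot |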
|
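For the question in post 21, the p-values live in the coefficient table returned by `summary()`, not in the fitted object itself. A minimal sketch with simulated data (the data frame `dat` and the column names `pheno`, `covar`, `v1` to `v3` are hypothetical stand-ins for the poster's variables):

```r
set.seed(4)
n <- 300
dat <- data.frame(
  pheno = rbinom(n, 1, 0.5),
  covar = rnorm(n),
  v1 = rnorm(n), v2 = rnorm(n), v3 = rnorm(n)
)
predictors <- c("v1", "v2", "v3")

# One row per predictor: slope and its p-value.
res <- matrix(NA_real_, nrow = length(predictors), ncol = 2,
              dimnames = list(predictors, c("beta", "p")))

for (v in predictors) {
  f   <- reformulate(c(v, "covar"), response = "pheno")  # pheno ~ v + covar
  fit <- glm(f, family = binomial, data = dat)
  ct  <- coef(summary(fit))   # columns: Estimate, Std. Error, z value, Pr(>|z|)
  res[v, ] <- ct[v, c("Estimate", "Pr(>|z|)")]
}
print(res)
```

Building the formula from the column name keeps the row names of the coefficient table readable, so the row can be picked by name instead of by position.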
v*******g Posts: 334 | 22 proc GLM;
Assume A has 5 levels.
contrast 'A LINEAR & QUADRATIC'
a -2 -1 0 1 2 ------ I understand that this is linear.
a 2 -1 -2 -1 2 ---------- Why is this quadratic? |
|
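A short R sketch of where the coefficients in post 22 come from, assuming the 5 levels of A are equally spaced: code the levels as -2..2, square them, and subtract the mean so the coefficients sum to zero. The result is orthogonal to the linear contrast and matches R's `contr.poly()` up to scaling.

```r
x <- -2:2                        # equally spaced levels of A, centered at 0
linear    <- x - mean(x)         # -2 -1  0  1  2
quadratic <- x^2 - mean(x^2)     #  2 -1 -2 -1  2
print(rbind(linear, quadratic))

# Both are contrasts (sum to zero) and they are orthogonal to each other.
print(c(sum_linear = sum(linear), sum_quadratic = sum(quadratic),
        cross_product = sum(linear * quadratic)))

# R's orthogonal polynomial contrasts are the same vectors scaled to unit length.
cp <- contr.poly(5)
print(round(cp[, 1:2], 4))
```

The second row weights the two ends positively and the middle negatively, which is what picks up curvature in the means across the levels.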
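c****s Posts: 63 | 23 I am using a GLM (gamma) to predict cost, and my boss wants to use R2 as the goodness-of-fit measure. Is that
OK?
He wants some method where one look at the test result is enough to say, "Yes, this model is pretty good,"
that sort of thing. Does anyone have other good approaches?
Any advice is much appreciated! |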
|
h**********e Posts: 44 | 24 Every GLM has its own deviance, right? Take the ratio of your model's deviance to the null model's deviance; that is probably
a decent statistic. |
|
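The deviance ratio suggested in post 24 is easy to compute in R. A minimal sketch on simulated gamma costs (the data and the name `d2` are made up for illustration); one minus the ratio is sometimes reported as a deviance-based pseudo R-square, which is a descriptive summary rather than a formal test.

```r
set.seed(5)
n    <- 400
x    <- runif(n)
mu   <- exp(1 + 1.5 * x)
cost <- rgamma(n, shape = 2, rate = 2 / mu)   # gamma response with mean mu

fit <- glm(cost ~ x, family = Gamma(link = "log"))

# Share of the null deviance explained by the model.
d2 <- 1 - fit$deviance / fit$null.deviance
print(c(deviance = fit$deviance, null_deviance = fit$null.deviance, d2 = d2))
```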
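c****s Posts: 63 | 25 MSE is for comparing several models to see which has the smallest MSE. With a single Gamma model,
one MSE value on its own probably cannot tell you whether the model is good or bad.
My boss wants R-square because a single R-square value lets him judge whether it is a good model.
I do not really agree with this approach, but I also cannot find any other test like it for judging a GLM.
It is really frustrating. Do any of you experts have other tricks? |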
|
d******o Posts: 59 | 26 Do you want to get the correlations among variables, or to diagnose
multicollinearity in regression?
If the first, use proc corr.
If the second, use vif in proc glm |
|
A*******s Posts: 3942 | 27 Hehe, I got a tricky interview question about that. I'm not sure whether
newer versions of SAS provide the vif option in proc glm, but in older versions
it only shows up in proc reg. |
|
h***i Posts: 634 | 28 Reg has the following syntax for an F test of H0: beta1=0 & beta4=0;
model y=x1-x5;
test x1=0,x4=0;
What is the corresponding syntax in GLM? |
|
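y*********s Posts: 24 | 29 I know the difference between GLM and linear regression,
but if a data set calls for logistic regression and linear regression is used by mistake, what
are the consequences?
Put another way, if a data set can be fit with either logistic regression or linear regression,
why choose logistic regression and not linear regression? |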
|
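One concrete consequence relevant to post 29 can be shown with a small R simulation (made-up data): a straight line fitted to a 0/1 outcome can predict "probabilities" below 0 or above 1, while the logistic fit stays inside the unit interval. This is only one of the issues; the linear fit also ignores that the variance of a binary outcome depends on its mean.

```r
set.seed(6)
n <- 200
x <- seq(-3, 3, length.out = n)
y <- rbinom(n, 1, plogis(2 * x))      # binary outcome

fit_lin <- lm(y ~ x)                          # linear regression on a 0/1 outcome
fit_log <- glm(y ~ x, family = binomial)      # logistic regression

print(range(fitted(fit_lin)))   # can fall outside [0, 1]
print(range(fitted(fit_log)))   # stays inside (0, 1)
```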
a***e Posts: 1627 | 30 The problem is:
Perform the analysis of the data and identify the best treatment (there are 6 in total,
already coded as 1-6, i.e. trt in the program).
SAS program:
data latin;
input row col trt yield;
cards;
1 1 3 3.10
1 2 6 5.95
1 3 1 1.75
1 4 5 6.40
1 5 2 3.85
1 6 4 5.30
2 1 2 4.80
2 2 1 2.70
2 3 3 3.... |
|
s*r Posts: 2757 | 31 The means in proc glm is most likely comparing the mean against 0, I guess |
|
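n**********e Posts: 18 | 32 I just want to know the relationship between this dv and each iv, and in the end I also want to do model selection to screen them. For now
I am using proc glm, but there is a normality problem.
A Kruskal-Wallis test could also analyze the relationship between the dv and each categorical iv, but then I would not know how to do the
model selection afterwards... |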
|
h***i Posts: 3844 | 33 ..........
At the very least, first figure out whether the glm here and the generalized linear model are the same thing. |
|
H**********1 Posts: 3056 | 34 Both can do multiple regression,
but I found GLM has no adjusted R-square... |
|
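y********0 Posts: 638 | 35 I am no expert.
If you can use logistic, use logistic whenever possible; there you can use (param=ref ref=first).
It seems SAS 9.3 has added ref= to glm. SAS 9.2 has no option you can add directly for this.
You can also use proc transreg to reset the baseline.
If it were me, I would just re-sort C into descending order, which saves time.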
reference |
|
l******1 Posts: 292 | 36 Thanks for the reply, but my Y is continuous, so I have to use GLM rather than logistic. You
mentioned that 9.3 has added ref= to glm; how should I use it to set the reference to C=0? Thanks |
|
w*******n Posts: 469 | 37 If "proc glm" does not have this feature, why not use "proc genmod"? You can
set the reference level in it. |
|
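y********0 Posts: 638 | 38 I am no expert.
http://support.sas.com/kb/37/108.html
This says ref= has been added in SAS 9.3. My SAS is not 9.3, and the glm 9.3 documentation does not mention it either.
You can try
class sex (ref=last) treat (ref=first) / param=ref;
or class sex (ref=last param=ref);
If that does not work, you will have to redefine the baseline with another proc.
good luck. |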
|
g******2 Posts: 234 | 39 From thread: Statistics board - R glm. model=glm(y~x)
summary(model) |
|
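Y*****o Posts: 1173 | 40 There are four variables A, B, C, D, where D is computed from the values of A, B, and C: D = (140 - A) × B / C.
When running a glm regression, can A still go in the model together with D? Since D
is computed from A, B, and C, the correlation between D and A is high, so the effect of A
on the outcome cannot be estimated correctly, right? |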
|
d**********2 Posts: 14 | 41 Hi,
When doing an ANOVA analysis in SAS (I am using proc glm), how do I specify which variables are
within-subject effects and which are between-subject effects? I know that all
repeated-measures variables are within-subject effects, but my data are not of that
kind. Also, if more than two variables are within-subject effects, the repeated
option is awkward to use; there is no way to specify two or more repeated-measures
variables.
Many thanks!
Laomao |
|
a**j Posts: 60 | 42 You need to know whether it is a random effect or a fixed effect, and whether to use a factorial model or a nested model
between-subject effect
within-subject effect
within group, between group?
If so, use a nested model.
Part of the SAS code, with explanation, from an exercise on women's rights:
/* Conduct the ANOVA, specifying that both ciri and wecon are categorical variables.
Nested model: ciri is nested within wecon, and ciri is specified as a random variable.
Model diagnostics are produced by the plots=diagnostics option. */
ods graphics on;
proc glm data=torture1 PLOTS=DIAGNOSTICS ;
class ciri wecon;
model physint1=weco... |
|
A*******s Posts: 3942 | 43 Nice.
Insurance companies are rather fond of this kind of double GLM.
Applications in banks are rarely seen.
Perhaps the dispersion could be used for stress tests |
|
p**5 Posts: 2544 | 44 What is the advantage of a fixed model vs GLM?
Thanks |
|
a*******g Posts: 80 | 45 I have a group of patients, 2 trt arms, and two centers. The outcome is a
continuous variable. I am interested in the effect of trt but would like to control for
the effect of center. Here is the code I used:
proc glm data=test;
class recon treatment_arm;
model scores=treatment_arm|recon/ss3;
estimate "trt diff"
treatment_arm 1 -1
treatment_arm*recon 0.5 0.5 -0.5 -0.5;
run;
But the log returned
"NOTE: trt diff is not estimable." (no error message)
When I use proc mixed, it wor... |
|
p******p Posts: 13 | 46 I asked the statistician here: for an explicit random effect it is best to use proc mixed; proc glm apparently
treats it as a fixed effect. |
|
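i*****r Posts: 318 | 47 R question: cross-validation in logistic regression, neural networks, and SVM
About doing K-fold cross-validation.
1:
For SVM, the svm() call has a cross=K argument, which also does K-fold cross-validation.
The issue is that afterwards svm can show predicted values, e.g.
A.svm=svm(y~.,data=XYZ,cross=10)
A.predict=predict(A.svm)
Which of the prediction models in the 10-fold cross-validation do these fitted predictions come from?
2: For logistic regression, R provides cv.glm(), which does K-fold cross-validation and then reports the
accuracy. cv.glm() only reports the prediction accuracy; where can I see the fitted predictions of this
model, and the coefficients of each variable in the model? For example
A.glm=glm(y~.,data=XYZ,family=binomial)
A.cv.glm=cv.glm(XYZ, A.glm)
Here A.cv.glm only gives the prediction accuracy, but I do not know the model's fitted predictions, nor, in the formula P=
exp(a1X1+a2X2+...anXn)/(1+exp(a1X1+a2X2+...anXn)), the a1.. |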
|
b*****a Posts: 905 | 48 OP, pick out whatever is useful to you.
## Make a ROC plot of the true positive rate against the false positive rate
library(ROCR)
ip.glm<-glm(votesum ~ clint96 + partyid + aflcio97 + ccoal98, data=ip,
family=binomial(link=logit))
summary(ip.glm)
names(ip.glm)
pred.object<-prediction(ip.glm$fitted.values, ip.glm$y)
perf.object<-performance(pred.object, "tpr", "fpr")
# use add=TRUE as an argument to the plot() function if you want to overlay
# additional ROC plots.
# Compare the model with submodel (Democr |
|
p********a Posts: 5352 | 49 ☆─────────────────────────────────────☆
cici (full house) wrote on (Mon Nov 7 08:33:47 2011, US Eastern):
For logistic regression
log(pi/(1-pi))=b0+b1x1+b2x2
I already know the independent variables and the response variable {log(pi/(1-pi))}.
What should I do to fit the parameters b0, b1, b2? Many thanks
☆─────────────────────────────────────☆
sleephare (I+don't+know.) wrote on (Mon Nov 7 14:16:38 2011, US Eastern):
SAS, R?
☆─────────────────────────────────────☆
cici (full house) wrote on (Mon Nov 7 16:19:05 2011, US Eastern):
R, thanks
☆────────────────────────────────────... |
|
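v*******e Posts: 11604 | 50 A programmer taking up statistics... Everything you ask is a statistics question, not an R question.
1) glm is just a procedure/method that iteratively computes the parameters of a particular class of models, so of course it runs until convergence.
I have never heard of glm having forward/backward/stepwise options built in.
2) AIC, BIC and the like are for choosing a model, not for computing a model's parameters. Which variables the model should
include and which it should not (e.g., variables with little effect should be left out): that is what AIC, BIC
and the like are for. If you use them to decide which variables your generalized linear model needs to include,
then of course you alternate them with glm(). First pick some variables to form a model, then use glm() to compute that
model's parameters and likelihood; then add/drop a variable and use glm() again to compute the parameters and likelihood. Then you
can use AIC to decide whether to keep the added/dropped variable.
3) Wikipedia has a short introduction.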
stepwise |
|
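The alternation between `glm()` and AIC described in post 50 can be sketched in R on simulated data (the data frame `d` and the variable `noise` are made up for illustration): fit a model, add a candidate variable, refit, and compare AIC values. The hand computation shows that AIC is just the likelihood penalized by the number of estimated parameters.

```r
set.seed(7)
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
d$y <- rbinom(n, 1, plogis(-0.3 + 1.0 * d$x1 + 0.7 * d$x2))

# Step 1: fit a model with the current set of variables.
fit_small <- glm(y ~ x1 + x2, family = binomial, data = d)

# Step 2: add a candidate variable and refit.
fit_big <- update(fit_small, . ~ . + noise)

# AIC = -2 * log-likelihood + 2 * (number of estimated parameters).
ll <- logLik(fit_small)
aic_by_hand <- -2 * as.numeric(ll) + 2 * attr(ll, "df")

# Step 3: compare; the smaller AIC is preferred.
print(AIC(fit_small, fit_big))
print(aic_by_hand)
```

Base R's `step()` automates this add/drop loop around a fitted `glm` object, so the stepwise search sits on top of `glm()` rather than inside it.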