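d******e posts: 7844 | 1 Being easy to compute is reason enough.
A parametric Bayesian model can then get a closed-form solution directly at every step of the sampling.
Without a conjugate prior, computing the integrals will kill you. The direct upshot is that parametric Bayesian
models handle large-scale data with ease; estimating tens or hundreds of thousands of parameters, or even more,
is no problem.
I have yet to see a general-purpose MCMC handle a problem with more than 100 dimensions. |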
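To make the closed-form point concrete, here is a minimal sketch of a Gibbs sampler in which both full conditionals are available in closed form because the priors are conjugate. The model (normal data with unknown mean and precision), the prior settings, and all names such as `gibbs_normal` are assumptions chosen for illustration, not something taken from the thread.

```python
import math
import random

def gibbs_normal(y, n_iter=3000, burn_in=500, m0=0.0, p0=0.01, a0=0.01, b0=0.01, seed=0):
    """Gibbs sampler for y_i ~ N(mu, 1/tau) with conjugate priors
    mu ~ N(m0, 1/p0) and tau ~ Gamma(a0, rate=b0) (illustrative settings)."""
    rng = random.Random(seed)
    n, sum_y = len(y), sum(y)
    mu, tau = sum_y / n, 1.0          # starting values
    draws = []
    for it in range(n_iter):
        # mu | tau, y is normal: precision-weighted mix of prior mean and data
        prec = p0 + n * tau
        mean = (p0 * m0 + tau * sum_y) / prec
        mu = rng.gauss(mean, 1.0 / math.sqrt(prec))
        # tau | mu, y is gamma: shape and rate are updated in closed form
        shape = a0 + 0.5 * n
        rate = b0 + 0.5 * sum((yi - mu) ** 2 for yi in y)
        tau = rng.gammavariate(shape, 1.0 / rate)   # gammavariate takes a scale
        if it >= burn_in:
            draws.append((mu, tau))
    return draws

# Simulated data, purely for demonstration
data_rng = random.Random(1)
y = [data_rng.gauss(2.0, 1.0) for _ in range(200)]
draws = gibbs_normal(y)
post_mean_mu = sum(d[0] for d in draws) / len(draws)
post_mean_tau = sum(d[1] for d in draws) / len(draws)
print(round(post_mean_mu, 3), round(post_mean_tau, 3))
```

No integral is evaluated numerically at any step; each update is a direct draw from a known distribution, which is what keeps this kind of sampler cheap as the number of parameters grows.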
|
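z*******n posts: 15481 | 2 Are you a frequentist or a Bayesian?
If frequentist, compute the MLE; the inverse of the Fisher information matrix then gives the variance of the
estimate.
If Bayesian, use an MCMC sampler to sample the posterior distribution of the parameters, then
take the sample mean as the parameter estimate.
If you need more details, send me a few baozi and I'll send you a PDF file, hehe. |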
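The frequentist half of that recipe can be sketched with a one-parameter case where everything is available by hand. The Poisson model, the toy counts, and the name `poisson_mle_with_se` are assumptions for illustration; with several parameters the same idea applies with the inverse of the information matrix.

```python
import math

def poisson_mle_with_se(counts):
    """MLE of a Poisson rate and its standard error from the Fisher information."""
    n = len(counts)
    lam_hat = sum(counts) / n          # MLE of the rate: the sample mean
    fisher_info = n / lam_hat          # I(lambda) = n / lambda, evaluated at the MLE
    variance = 1.0 / fisher_info       # inverse Fisher information
    return lam_hat, math.sqrt(variance)

counts = [2, 3, 1, 4, 0, 2, 3, 5]      # toy data
lam_hat, se = poisson_mle_with_se(counts)
print(lam_hat, round(se, 4))
```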
|
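k*****u posts: 1688 | 3 Could you print out the probabilities being compared when the choice is made? It looks like the same number was chosen throughout that stretch. |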
|
D***e posts: 21 | 4 You need to write out the full conditionals. |
|
|
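x*******i posts: 1791 | 6 Damn, that plot of yours is dizzying to look at. I want one too.
Do you really need to call someone else's function to write MH? Isn't it just a few lines of code?
My first impression is that you need to normalize. You already have tens of thousands of parameters; then tune your
proposal and make its variance larger.
Another issue is your prior. Some of the more exotic priors get stuck easily, for example the Dirichlet
distribution. That depends on your specific problem. |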
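As a rough illustration of why the proposal variance matters, the sketch below runs a random-walk Metropolis sampler on a standard normal target with three proposal scales and reports the acceptance rate of each. The target, the scales, and the helper name `rw_metropolis` are assumptions chosen for the demo; the right scale for a real problem depends on the posterior.

```python
import math
import random

def rw_metropolis(log_target, x0, proposal_sd, n_iter, seed=0):
    """Random-walk Metropolis; returns the chain and the acceptance rate."""
    rng = random.Random(seed)
    x, lp = x0, log_target(x0)
    chain, accepted = [], 0
    for _ in range(n_iter):
        x_new = x + rng.gauss(0.0, proposal_sd)     # symmetric proposal
        lp_new = log_target(x_new)
        # 1 - random() lies in (0, 1], so the log is always defined
        if math.log(1.0 - rng.random()) < lp_new - lp:
            x, lp = x_new, lp_new
            accepted += 1
        chain.append(x)
    return chain, accepted / n_iter

def log_std_normal(x):
    return -0.5 * x * x

rates = {}
for sd in (0.05, 2.5, 50.0):
    _, rates[sd] = rw_metropolis(log_std_normal, 0.0, sd, 5000)
    print(sd, round(rates[sd], 3))
```

A very small scale accepts almost every move but barely travels, while a very large one rejects almost everything and the chain sits on the same value for long stretches; either can look like a stuck trace plot.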
|
f********9 posts: 526 | 7 It has nothing to do with which language you use; it has to do with the model, the prior, and so on. |
|
|
|
k*********g posts: 46 | 10 From Jonathan:
I have a question on the SAS proc MIANALYZE.
I have a dataset containing missing data, and I used proc MI (MCMC) to
generate 5 imputations.
Then I ran logistic regression on the imputed dataset (by_imputation), and
then ran proc MIANALYZE to combine the results.
One of my variables, Education, was in 3 categories (<high school,
high school, >high school). The proc MIANALYZE gives me the p-value of each
dummy compared to the ref categories (HS vs HS vs
w... |
|
z******n posts: 397 | 11 Hmm, I don't think what you said is quite right.
The page you linked says:
What are the techniques for dealing with complete separation or quasi-
complete separation?
... ...
Exact method is a good strategy when the data set is small and the model is
not very large. Below is a sample code in SAS.
proc logistic data = t2 descending;
model y = x1 x2;
exact x1 / estimate=both;
run;
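This shows that exact logistic regression can be used to handle complete separation in the data.
But complete separation is not the same thing as degenerate.
As I understand it, a degenerate distribution in an exact test means that the conditional distribution of the
sufficient statistic T for the parameter of interest a is degenerate... |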
|
p********a posts: 5352 | 12 ☆─────────────────────────────────────☆
cici (full house) wrote (Mon Nov 7 08:33:47 2011, US Eastern):
For logistic regression
log(pi/(1-pi)) = b0 + b1x1 + b2x2
I already know the independent variables and the response variable {log(pi/(1-pi))}.
What should I do to fit the parameters b0, b1, b2? Many thanks.
☆─────────────────────────────────────☆
sleephare (I+don't+know.) wrote (Mon Nov 7 14:16:38 2011, US Eastern):
SAS, R?
☆─────────────────────────────────────☆
cici (full house) wrote (Mon Nov 7 16:19:05 2011, US Eastern):
R, thanks
☆────────────────────────────────────... |
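If the logit values themselves are observed, as the question states, then b0, b1, b2 can be fitted by ordinary least squares on the logit; in R that would be `lm`. The sketch below shows the same computation via the normal equations. The helper names `solve_linear` and `fit_logit_ols` and the toy data are assumptions for illustration, and the replies in the truncated thread may have suggested something else.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_logit_ols(x1, x2, logit):
    """Least-squares fit of logit = b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * z for r, z in zip(rows, logit)) for i in range(3)]
    return solve_linear(xtx, xty)

# Toy data generated from known coefficients (0.5, 1.0, -2.0)
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0, 0.5]
logit = [0.5 + 1.0 * u - 2.0 * v for u, v in zip(x1, x2)]
b0, b1, b2 = fit_logit_ols(x1, x2, logit)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

If only 0/1 outcomes were observed rather than the logit, this shortcut would not apply and a maximum-likelihood logistic fit would be needed instead.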
|
c********d posts: 253 | 13 Propensity score will create large bias when the data are not monotone missing. So
I don't recommend that approach. A lot of methods can be used in your case
if you only have one missing variable, such as hot-deck or predictive mean
matching using a logistic model. You can also use a multivariate probit model
for your case since race is nominal. With a multivariate probit model, it's
easy to develop an MCMC algorithm to do multiple imputation. |
|
g*******u posts: 148 | 14 I work at a small research company. My typical job is to bring academic ideas into industry by implementing the methods proposed in academic papers. Because the authors usually do not share their code, I have to replicate the papers from scratch by myself most of the time.
So far I have replicated papers in very technical journals such as Marketing Science, Biometrika, etc. The most difficult one I've encountered, which is in Management Science, even requires the knowledge of reve... |
|
C*******I posts: 339 | 15 Your job sounds interesting! May I ask what the job title is? Statistician?
Typically, what kind of companies have such job positions?
academic ideas into the industry by implementing the methods proposed in
academic papers. Because usually the authors do not share their codes, I
have to replicate the papers from scratch by myself most of the time.
Marketing Science, Biometrika, etc. The most difficult one I've experienced
which is in Management Science even requires the knowledge of reversible... |
|
g*******u posts: 148 | 16 Yes, I concur with your comments, but as I emphasized, the real value of
Bayesian is that it makes many things that were hopeless before feasible now.
Because of confidentiality, I am sorry I cannot talk too much about my projects. One thing I can share is that a couple of weeks ago I worked on a conjunctive screening rule model which relies heavily on MCMC approaches. Following the authors' algorithm I was unable to get any sensible result, so we finally gave it up. That was the first time I underst... |
|
l*********s posts: 5409 | 17 That is very interesting!
academic ideas into the industry by implementing the methods proposed in
academic papers. Because usually the authors do not share their codes, I
have to replicate the papers from scratch by myself most of the time.
Marketing Science, Biometrika, etc. The most difficult one I've experienced
which is in Management Science even requires the knowledge of reversible
jump/birth-death MCMC to handle Bayesian spline regressions.
in when to stop if you smell the paper is garba... |
|
l******n posts: 9344 | 18 I am not asking for details of your projects.
The Bayesian approach has been advertised for a long time; you can see it in many
presentations. People from academia make some simple assumptions, do
a demo, and blow a bubble. But I have not seen any real industrial-level
implementation and application. So I am curious: does your company really
use it if you replicate the paper results?
projects. One thing I could share is couple of weeks ago I worked on a
conjunctive screening rule model which reli... |
|
g*******u posts: 148 | 19 My bad. Now I get you. The quick answer is yes, people in my field are serious about using Bayesian methods. You can go check:
http://www.sawtoothsoftware.com/products/cbc/cbchb.shtml
As you can see, this is a module for the implementation of hierarchical
Bayes. The module is only 5 MB in size but costs USD $2,000!
In the design phase we have several different methods to create surveys, while in the analytics phase the underlying method is all about random-effect logit model using hierarchical Ba... |
|
c*******o posts: 8869 | 20 Why is Proc MCMC rated poorly? Can you be more specific on that?
serious in the use of Bayesian . You can go check:
while in the analytics phase the underlying |
|
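M**Z posts: 111 | 21 I usually do my MCMC computation in Fortran. I use R relatively less, mainly because it is far, far too slow.
If you do use R, getting more RAM may be more effective than getting a faster CPU.
As for SAS, I haven't used it on large-scale data, so I can't offer an opinion. |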
|
|
|
|
|
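P******V posts: 83 | 26 Congrats, OP! I also do Bayesian and MCMC and graduate at the end of the year; seeing OP's two offers is really encouraging~
Congratulations again. I'm waiting for OP's 66 BQ questions; they will surely be a big help to those of us struggling through the job hunt~ |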
|
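b********1 posts: 2861 | 27 Has anyone built an RJMCMC model in R or Matlab? I know WinBUGS has a function that can do RJMCMC for the normal
distribution, but it can't be used for the Poisson and NB models I'm working on now. Could some expert share
their experience with choosing proposals? Many thanks! |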
|
|
g*******i posts: 258 | 29 RJMCMC is widely used in varying-dimension settings, for example:
factor models (for the number of factors)
mixture models (for the number of mixture components)
spline regressions (for the number of node points)
However, I do have concerns about its usability in practice. Some
applications I have read are really trashy
RJMCMC |
|
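b********1 posts: 2861 | 30 I want to do mixture model + variable selection, i.e. let both the model space and the variable space vary at the same time.
May I ask why you are so surprised? Is the model too complex, or are the results unreliable? Many thanks!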
RJMCMC |
|
|
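b********1 posts: 2861 | 32 I think a mixture model alone or variable selection alone can each be handled in somewhat simpler ways,
but RJMCMC looks fancier, and I'd like both the model space and the variable space to vary; for now this is
just wishful thinking.
I work in civil engineering; doing something a bit fancy on the statistics side makes it easier to dress up a paper, and our
way of thinking differs from that of people in professional statistics or economics departments. |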
|
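k*****u posts: 1688 | 33 Showing support. I did this when I was doing research, to handle mixture models; there are even fancier things,
such as the birth-death process and the Dirichlet process, for handling mixture models.
But after I started working I found none of it is needed. Running all kinds of regressions is enough. |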
|
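k*****u posts: 1688 | 34 PS: actually the EM algorithm is already good enough for mixture models.
Take different initial values, repeat about 1000 times, and use the average as the parameter estimate; the results seem to be quite good already. |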
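A minimal sketch of EM with multiple random starts for a two-component one-dimensional Gaussian mixture follows. Note one deliberate substitution: instead of averaging estimates across runs as the post describes, this sketch keeps the run with the highest log-likelihood, because averaging requires the component labels to be aligned across runs first. The data, the number of restarts, the variance floor, and the function names are all assumptions for illustration.

```python
import math
import random

def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_two_gaussians(data, rng, n_iter=100):
    """One EM run for a two-component 1-D Gaussian mixture from a random start."""
    n = len(data)
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    mus = rng.sample(data, 2)               # random starting means
    vars_ = [var_all, var_all]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of component 0 for each point
        resp = []
        for x in data:
            d0 = w[0] * normal_pdf(x, mus[0], vars_[0])
            d1 = w[1] * normal_pdf(x, mus[1], vars_[1])
            resp.append(d0 / (d0 + d1 + 1e-300))
        # M-step: weighted means, variances (floored) and mixing weights
        for k in (0, 1):
            r = resp if k == 0 else [1.0 - v for v in resp]
            nk = sum(r) + 1e-12
            mus[k] = sum(ri * x for ri, x in zip(r, data)) / nk
            vars_[k] = max(sum(ri * (x - mus[k]) ** 2 for ri, x in zip(r, data)) / nk, 1e-2)
            w[k] = nk / n
    loglik = sum(math.log(w[0] * normal_pdf(x, mus[0], vars_[0])
                          + w[1] * normal_pdf(x, mus[1], vars_[1]) + 1e-300) for x in data)
    # Sort components by mean so the labels are comparable between runs
    return loglik, sorted(zip(mus, vars_, w))

rng = random.Random(0)
data = [rng.gauss(-2.0, 1.0) for _ in range(150)] + [rng.gauss(3.0, 1.0) for _ in range(150)]
best_loglik, best_fit = max(em_two_gaussians(data, rng) for _ in range(8))
print([round(m, 2) for m, _, _ in best_fit])
```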
|
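a***g posts: 2761 | 35 EM works for mixture models because with continuous variables you can at least write down the likelihood.
For tree-structured mixture models there is usually not even a nicely formed likelihood function, so people use the b-d process; this is also
why many people working on gene evolution use it to solve their problems.
In practice it's not about using whichever method is fancy, but about using what fits the type of data at hand. |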
|
h*******h posts: 14 | 36 slow and bad mixing performance, esp when your model space is large... |
|
K*****2 posts: 9308 | 37 Piggybacking on this thread to ask: is RJMCMC hard to run even with just a dozen or so variables? |
|
c******s posts: 18 | 38 Cincinnati, OH. Proficient in R. Proficient in RODBC. At minimum, must know generalized mixed effect modeling and
MCMC; knowing other machine learning models is a plus. |
|
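c******s posts: 18 | 39 Location: Cincinnati, OH
First, to be clear, I don't have the budget to hire a Ph.D. If you are a Ph.D., please message me privately first with your salary
expectation. Please give one number; please don't give a range, and please don't say "open".
We sponsor green cards and H-1. But if you currently hold a cap-exempt H-1, sorry. We don't support eVerify.
If you plan to use the 29-month OPT, then there's no point in talking.
If you're interested, please send your résumé here by private message. That way, if it doesn't work out, it's easier on everyone. If the formatting gets messy, so be it.
Requirements: the most important is statistics; generalized linear mixed effect models are required;
if you say you know them, please describe in detail which variance structures you have used and why. MCMC is required;
if you say you know it, please describe in detail which likelihoods you have written and why.
Next, please describe in detail the whole modeling process as you understand it, from having the data to deploy the model
for production.
There are also requirements on machine learning algorithms. If you know them, please say what you have done and why you used those
models.
As for SQL, simu... |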
|
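c******s posts: 18 | 40 No time to figure out Zuanfeng's content filter, so I can only post the original, less detailed text.
Cincinnati, OH. Proficient in R. At minimum, must know generalized mixed effect modeling and
MCMC; knowing other machine learning models is a plus. |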
|
T***y posts: 43 | 41 Is it in Numerical Recipes 3? |
|
|
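z****e posts: 19 | 43 You need to be able to talk a good game and sell yourself.
Whenever someone mentions, say, Metropolis-Hastings, your eyes should light up and you should tell her with great animation that this is for when MCMC computes the
stationary distribution and you can only obtain the conditional distribution up to a
proportionality constant; the algorithm provides a way to update the parameters, WinBUGS will do the job, and look, here is a
project I did back in the day. |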
|
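k*****u posts: 1688 | 44 Some variables in my data have missing values; how do people usually handle this? Or is there any rule of thumb,
e.g. what fraction of obs has to be missing before you drop the variable?
A few approaches I've seen:
impute: imputing with the mean or median doesn't work well
dummy var: if missing(var) then miss_var=1; else miss_var=0; this didn't improve things much either
impute using regression, or the MCMC in SAS proc mi, trees, and so on
Are there other common approaches?
Thanks
Any other experience or tricks for handling missing data?
Thanks |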
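The first two approaches in that list can be combined: impute with the median and also keep a 0/1 flag so that a downstream model can still use the fact that a value was missing. A minimal sketch, where the function names and the toy `income` column are assumptions for illustration:

```python
from statistics import median

def missing_fraction(values):
    """Share of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)

def impute_with_indicator(values):
    """Median-impute missing entries (None) and return a 0/1 missingness flag."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    imputed = [fill if v is None else v for v in values]
    flag = [1 if v is None else 0 for v in values]
    return imputed, flag

income = [1.0, None, 3.0, None, 10.0]      # toy column with two missing values
imputed, miss_income = impute_with_indicator(income)
print(missing_fraction(income), imputed, miss_income)
```

The missing fraction is the quantity a drop-the-variable rule would be applied to; the threshold itself is a judgment call that the thread does not settle.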
|
h***x posts: 586 | 45
Global mean/median does not work well. However you can try group mean/median
imputation if you can setup some criteria to segment your database.
This is a really good way. You don't think it is useful just because you
have never used/tried it. Sometimes missing value makes lots of business
sense. Why it is missing, does the missing tell us something ...
I really do not like regression imputation. Say you impute your data using
regression or mcmc and build a model on it. Then you need to apply th... |
|
r******e posts: 244 | 46 Thanks. I worked through a one-dimensional problem, but it still doesn't seem to converge well.
Could you take a look at my code?
prior PDF: 1/(1+x^2)
likelihood: exp(-0.5*n*(x-y)^2)
measurement: y=x+v, v~N(0,1)
set n=1
"
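% Random-walk Metropolis step; X(:,1) must hold the starting value.
% N (chain length) is an assumed name; the loop was not shown in the post.
for i = 2:N
    xc  = X(:,i-1);                        % current state
    p_c = -0.5*n*(xc-y)^2 - log(1+xc^2);   % log posterior at the current state
    xs  = xc + randn(Dim,1)*2;             % symmetric proposal with sd 2
    p_s = -0.5*n*(xs-y)^2 - log(1+xs^2);   % log posterior at the proposal
    if log(rand) < p_s - p_c               % accept or reject on the log scale
        X(:,i) = xs;
    else
        X(:,i) = xc;
    end
end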
" |
|
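One way to check whether a sampler for the one-dimensional problem in post 46 has converged is to compare its posterior mean with a brute-force numerical integral, which is cheap in one dimension. The sketch below is a Python port under stated assumptions: the measurement `y_obs = 1.0`, the chain length, the burn-in, and the integration grid are all made up for the demo.

```python
import math
import random

def log_post(x, y, n=1):
    # log of exp(-0.5*n*(x-y)^2) * 1/(1+x^2), up to an additive constant
    return -0.5 * n * (x - y) ** 2 - math.log1p(x * x)

def metropolis(y, n_iter=20000, burn_in=2000, proposal_sd=2.0, seed=0):
    """Random-walk Metropolis for the 1-D posterior; returns post-burn-in draws."""
    rng = random.Random(seed)
    x = y                                   # start at the measurement
    lp = log_post(x, y)
    chain = []
    for _ in range(n_iter):
        xs = x + rng.gauss(0.0, proposal_sd)
        lps = log_post(xs, y)
        if math.log(1.0 - rng.random()) < lps - lp:
            x, lp = xs, lps
        chain.append(x)
    return chain[burn_in:]

def grid_posterior_mean(y, lo=-20.0, hi=20.0, steps=40001):
    """Posterior mean by a Riemann sum on a fine grid (the reference value)."""
    h = (hi - lo) / (steps - 1)
    xs = [lo + i * h for i in range(steps)]
    ws = [math.exp(log_post(x, y)) for x in xs]
    return sum(x * w for x, w in zip(xs, ws)) / sum(ws)

y_obs = 1.0                                 # hypothetical measurement
chain = metropolis(y_obs)
mcmc_mean = sum(chain) / len(chain)
exact_mean = grid_posterior_mean(y_obs)
print(round(mcmc_mean, 3), round(exact_mean, 3))
```

If the two numbers disagree by much more than the Monte Carlo error, the usual suspects are a too-short chain, a poorly scaled proposal, or an acceptance ratio that leaves out the prior.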
a*****a posts: 286 | 47 What I work on is related to this. I transform my model into a state space model in order to estimate
it. My estimation method is Bayesian MCMC. Happy to chat privately.
exchange. |
|
g*******u posts: 148 | 48 MCMC is not parallelizable |
|
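A*****n posts: 243 | 49 MCMC itself can't be parallelized, but your 20k subjects can be processed in parallel instead of in a loop.
One of the simplest ways is to use the mclapply function from the parallel package; if
the machine has many CPUs, you can save a lot of time.
apply is not much different in speed from a for loop; if you google it you'll find many
blog posts comparing apply vs. for speed. |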
|
a******e posts: 119 | 50 m in 2:m
is because I initialized lambda, alpha and beta first.
As for how to update everything as a whole: m in 2:m is the 5000 MCMC iterations, and g in 1:g is the 20k subjects. |
|