AdaBoost.M1 - what's wrong with this code? - Statistics board

z**k
Posts: 378

My understanding is that, for AdaBoost.M1, the exponential loss
mean(exp(-y*F)) should be strictly decreasing from one iteration to the next,
but that is not what I see with the following code. Can anyone help?
I'm following the example in Hastie et al., ESL II, Section 10.1.
Sorry, I cannot type Chinese here. Thank you very much for your help.
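One possible reading of the behaviour described above, written out as a short derivation. It is only a sketch, and it assumes the updates of ESL Algorithm 10.1: weights updated as w_i <- w_i * exp(alpha_m * I(y_i != G_m(x_i))) and then normalised to sum to 1, with alpha_m = log((1 - err_m) / err_m). The symbols beta_m, f_m, L and v_i are introduced here for illustration and do not appear in the script. Under these assumptions, the loss that is guaranteed not to increase is the exponential loss of the half-step fit f_m, not that of F_m.

```latex
\begin{align*}
\beta_m &= \tfrac{1}{2}\alpha_m, \qquad
f_m = \sum_{k \le m} \beta_k G_k = \tfrac{1}{2} F_m, \qquad
L(f) = \frac{1}{N}\sum_{i=1}^{N} e^{-y_i f(x_i)} \\[4pt]
w_i^{(m)} &\propto e^{-y_i f_{m-1}(x_i)}
\quad\text{(because } e^{-\beta_m y_i G_m(x_i)}
   = e^{-\beta_m}\, e^{\alpha_m I(y_i \ne G_m(x_i))}\text{)} \\[4pt]
\frac{L(f_m)}{L(f_{m-1})}
&= \sum_{i} w_i^{(m)} e^{-\beta_m y_i G_m(x_i)}
 = (1-\mathrm{err}_m)\, e^{-\beta_m} + \mathrm{err}_m\, e^{\beta_m}
 = 2\sqrt{\mathrm{err}_m (1-\mathrm{err}_m)} \;\le\; 1 \\[4pt]
\frac{L(F_m)}{L(F_{m-1})}
&= (1-\mathrm{err}'_m)\,\frac{\mathrm{err}_m}{1-\mathrm{err}_m}
 + \mathrm{err}'_m\,\frac{1-\mathrm{err}_m}{\mathrm{err}_m},
\qquad \mathrm{err}'_m = \sum_{i:\, y_i \ne G_m(x_i)} v_i, \quad
v_i \propto \bigl(w_i^{(m)}\bigr)^2
\end{align*}
```

The last ratio equals 1 when err'_m = err_m and exceeds 1 whenever err'_m > err_m with err_m < 1/2, so mean(exp(-y*F)) is not forced to decrease, while mean(exp(-y*F/2)) is.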
#================R Script====================
## Data from the example in Hastie et al., ESL, Section 10.1
dta <- matrix(rnorm(20000), 2000, 10)
pred <- apply(dta, 1, function(x) sum(x^2))
y <- (pred > qchisq(0.5, 10)) * 2 - 1

## Fit y with a two-node classification tree (a stump) on one feature x:
## randomly sample 19 splitting points and choose the best one
stump <- function(y, x, w) {
  ## randomly sample 19 splitting points (jittered 5%, 10%, ..., 95% quantiles)
  ss <- quantile(x, probs=seq(0.05, 0.95, by=0.05) + runif(19) * 0.025)
  ## compute the weighted loss for each splitting point, in both orientations
  losses <- numeric(0)
  preds <- list()
  cnt <- 1
  for (s in ss) {
    inx <- x < s

    ## orientation 1: predict +1 below the split, -1 above it
    G1 <- rep(-1, length(y))
    G1[inx] <- 1
    losses[cnt] <- sum(as.integer(y != G1) * w)
    preds[[cnt]] <- G1

    ## orientation 2: the mirrored prediction
    G2 <- -G1
    losses[cnt+1] <- sum(as.integer(y != G2) * w)
    preds[[cnt+1]] <- G2
    cnt <- cnt+2
  }
  ## return the predictions of the candidate with the smallest weighted loss
  i <- which.min(losses)
  preds[[i]]
}
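## Start the AdaBoost.M1 algorithm
w <- rep(1/2000, 2000)
F <- rep(0, 2000)
losses <- rep(NA, 400)
mrates <- rep(NA, 400)
m <- 0
while (m < 400) {  # exactly 400 rounds; `m <= 400` would run 401
  m <- m + 1
  ## fit a stump on one randomly chosen feature
  G <- stump(y, dta[, sample(1:10, 1)], w)
  err <- sum(w[G != y]) / sum(w)
  ## alpha was never defined in the archived post; the definition from
  ## ESL Algorithm 10.1 is assumed here
  alpha <- log((1 - err) / err)
  w <- w * exp(alpha * (y != G))
  w <- w / sum(w)
  F <- F + alpha * G
  losses[m] <- mean(exp(-y*F))   # exponential loss
  mrates[m] <- mean(y*F < 0)     # misclassification rate
  ## cat(sum(exp(-y*F)), "\n")
}
table((F > 0)*2 - 1, y)
par(mfrow=c(2,1))
plot(losses, type="l")
plot(mrates, type="l")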
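
A minimal, self-contained sketch for checking the two loss curves side by side. It is not the script above: `fit_stump`, `F_hat`, `loss_full` and `loss_half` are hypothetical names, the split grid is fixed rather than jittered, the sample is smaller, and alpha is assumed to be log((1 - err) / err) as in ESL Algorithm 10.1. It tracks both mean(exp(-y*F)) and mean(exp(-y*F/2)) so the two can be compared.

```r
set.seed(1)
n <- 500; p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- ifelse(rowSums(x^2) > qchisq(0.5, p), 1, -1)

# Hypothetical helper: best threshold stump on one feature, searched over a
# fixed grid of quantiles and both orientations, by weighted error.
fit_stump <- function(y, xj, w) {
  best <- NULL
  best_err <- Inf
  for (s in quantile(xj, probs = seq(0.05, 0.95, by = 0.05))) {
    for (sgn in c(1, -1)) {
      g <- ifelse(xj < s, sgn, -sgn)
      e <- sum(w[g != y])
      if (e < best_err) {
        best_err <- e
        best <- g
      }
    }
  }
  best
}

M <- 100
w <- rep(1 / n, n)
F_hat <- rep(0, n)
loss_full <- numeric(M)  # mean(exp(-y * F)), the quantity plotted in the post
loss_half <- numeric(M)  # mean(exp(-y * F / 2)), the loss matching the weights
for (m in seq_len(M)) {
  G <- fit_stump(y, x[, sample(p, 1)], w)
  err <- sum(w[G != y]) / sum(w)
  alpha <- log((1 - err) / err)      # assumed, per ESL Algorithm 10.1
  w <- w * exp(alpha * (G != y))
  w <- w / sum(w)
  F_hat <- F_hat + alpha * G
  loss_full[m] <- mean(exp(-y * F_hat))
  loss_half[m] <- mean(exp(-y * F_hat / 2))
}

# Number of rounds in which each curve went up
c(full = sum(diff(loss_full) > 0), half = sum(diff(loss_half) > 1e-9))
```

Plotting `loss_full` and `loss_half` with `plot(..., type="l")` shows whether the non-monotone behaviour is confined to the full-step curve.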
