All topics - Topic: cbind
n*********e
Posts: 318
1
I am doing an R logistic regression exercise.
My question: should the dependent variable be removed from the validation set before running prediction?
Thanks.
--------------------
library(MASS)
attach(birthwt) #The famous 'low birth weight' data for logistic regression
index <- 1:dim(birthwt)[1]
test<- sample(index, trunc(length(index)/3))
train<-birthwt[-test,]
validation <- birthwt[test,]
logit.1<-glm(low~., data=train, family=binomial(link='logit'))
logit.1
#------------------------------
# Should the following be removed from the validation set here first: dep...
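On the question of dropping the response before predicting: as far as I know, `predict()` for a `glm` looks up only the predictor columns in `newdata`, so the response column can simply stay. The sketch below checks this on `birthwt`. The seed and the reduced formula are assumptions of this example, not part of the original post.

```r
library(MASS)  # provides the birthwt data set

set.seed(1)  # arbitrary seed so the split is reproducible
test_idx   <- sample(nrow(birthwt), trunc(nrow(birthwt) / 3))
train      <- birthwt[-test_idx, ]
validation <- birthwt[test_idx, ]

# A smaller formula than low ~ . is used here (an assumption of this sketch):
# bwt determines low, so including it would separate the classes perfectly.
fit <- glm(low ~ age + lwt + smoke, data = train,
           family = binomial(link = "logit"))

# Predict with the response column still present ...
p_with <- predict(fit, newdata = validation, type = "response")

# ... and with it removed.
p_without <- predict(fit,
                     newdata = validation[, names(validation) != "low"],
                     type = "response")

# Compare the two sets of predicted probabilities.
all.equal(p_with, p_without)
```

Keeping the column is usually more convenient, because the observed `low` values are needed afterwards to evaluate the predictions.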
t******g
Posts: 372
2
From thread: Statistics board - How to do 'look up' in R?
datanew <- cbind(data, as.character(ref[match(data[, 'prod_id'], ref[, 'prod_id']), 'prod_name']))
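A minimal sketch of the same `match()` lookup on made-up data. The column names `prod_id` and `prod_name` come from the post; the values and the `qty` column are hypothetical.

```r
# Hypothetical data: `data` holds transactions, `ref` maps prod_id to prod_name.
data <- data.frame(prod_id = c(3, 1, 2, 3), qty = c(10, 5, 7, 1))
ref  <- data.frame(prod_id = 1:3,
                   prod_name = c("apple", "banana", "cherry"),
                   stringsAsFactors = FALSE)

# match() returns, for each data$prod_id, its row position in ref;
# indexing ref$prod_name with those positions performs the lookup.
data$prod_name <- ref$prod_name[match(data$prod_id, ref$prod_id)]
data
```

`merge(data, ref, by = "prod_id")` gives a similar result, but it sorts the rows by the key by default, whereas `match()` keeps the original row order.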
c*****l
Posts: 1493
3
I am simulating a very basic model: Y|b = X*\beta + Z*b + e, with e ~ N(0, \sigma^2 * diag(ni));
b ~ N(0, \psi) #bivariate normal
where b is the latent variable, Z and X are ni*2 design matrices, \sigma^2 is
the error variance,
and Y are longitudinal data, i.e. there are ni measurements for object i.
The parameters are \beta, \sigma, \psi; call them \theta.
I wrote an EM algorithm. The M step maximizes log(f(Y,b;\theta)) in the
usual way;
the E step involves the evaluation of E step, using Gau...
O*****y
Posts: 222
4
a <- read.csv("A.csv", header=TRUE)
b <- read.csv("B.csv", header=TRUE)
new1 <- cbind(a, b[, setdiff(colnames(b), colnames(a))])
new2 <- b[, setdiff(colnames(b), colnames(a))]
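One detail worth checking in the snippet above: when `b` has only one column that `a` lacks, `b[, cols]` drops to a plain vector and the column name is lost in `cbind()`. A small sketch with hypothetical stand-ins for the two CSV files:

```r
# Hypothetical stand-ins for the contents of A.csv and B.csv.
a <- data.frame(id = 1:3, x = c(0.1, 0.2, 0.3))
b <- data.frame(id = 1:3, x = c(0.1, 0.2, 0.3), y = c("u", "v", "w"))

extra <- setdiff(colnames(b), colnames(a))   # columns that only b has

# drop = FALSE keeps a data frame (and the column name) even when
# only one column is selected.
new1 <- cbind(a, b[, extra, drop = FALSE])
colnames(new1)
```

Note that `cbind()` pairs rows by position only, so this assumes both files list the same records in the same order.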
f******9
Posts: 267
5
From thread: Statistics board - R read many files
A question: in R, how can I read in 1000 files at once and assign a different name to
each file: dat1, dat2, ... , dat1000?
These files have the same number of rows but different numbers of columns. How can I cbind the 1000 files into a
single file?
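One common way to approach this is to read the files into a single named list rather than 1000 separate variables, then bind once. The sketch below first writes three small files to a temporary directory so it is self-contained; the file names, the directory, and the contents are all hypothetical.

```r
# Write three small hypothetical CSV files to a temporary directory so the
# sketch is self-contained; real code would point at the existing files.
dir_path <- file.path(tempdir(), "cbind_demo")
dir.create(dir_path, showWarnings = FALSE)
for (k in 1:3) {
  dat <- as.data.frame(matrix(k, nrow = 4, ncol = k))
  names(dat) <- paste0("f", k, "_v", seq_len(k))
  write.csv(dat, file.path(dir_path, paste0("dat", k, ".csv")),
            row.names = FALSE)
}

files <- list.files(dir_path, pattern = "^dat[0-9]+\\.csv$", full.names = TRUE)

# Read every file into one named list instead of many separate variables.
dat_list <- lapply(files, read.csv)
names(dat_list) <- sub("\\.csv$", "", basename(files))

# All files have the same number of rows, so they can be bound column-wise.
combined <- do.call(cbind, dat_list)
dim(combined)
```

Individual files remain available as `dat_list$dat1`, `dat_list[["dat2"]]`, and so on. `list.files()` returns names in alphabetical order, so with 1000 files `dat10` sorts before `dat2`; reorder `files` first if the column order matters.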
a****y
Posts: 91
6
From thread: Statistics board - Question for Stratify sampling.
I am trying to understand the sampling from the following description. Does
anyone know how they get the sample stratum sizes: 10,5,10,4,6. Thanks a lot!
Generates artificial data (a 235X3 matrix with 3 columns: state, region,
income).
# The variable "state" has 2 categories (nc and sc).
# The variable "region" has 3 categories (1, 2 and 3).
# The sampling frame is stratified by region within state.
data=rbind(matrix(rep("nc",165),165,1,byrow=TRUE),matrix(rep("sc",70),70,1,byrow=TRUE))
data=cbi...
a**j
Posts: 60
7

Using R:
# dendrogram for the unsplit efron2004 data set
install.packages("care")
library("care")
install.packages("rpart")
library("rpart")
install.packages("partykit")
library("partykit")
data(efron2004)
attach(efron2004)
efron2004
# note: rpart.control() has no numsplit argument, so it is silently ignored; minsplit may have been intended
efron2004_rpart<-rpart(y~x[,1]+x[,2]+x[,3]+x[,4]+x[,5]+x[,6]+x[,7]+x[,8]+x[,9]+x[,10], data=efron2004, control=rpart.control(numsplit=10))
plot(as.party(efron2004_rpart),main = "Dendrogram of Y ~ Xs for Efron data sets", font.main = 4)
# dendrograms for the split efron20...
t*****w
Posts: 254
8
From thread: Statistics board - How to select these dyads
The answer is below.
Your final result is the following:
student teacher senior
2732 3465 1
3347 3837 1
1179 1693 1
3875 1711 1
3875 2059 1
2032 1784 1
2848 3921 1
2148 1416 1
3038 1434 1
3530 2037 1
2585 3811 1
1481 3954 1
...
c***z
Posts: 6348
9
## build data frame
work <- c(12, 14, 4, 16, 12, 20, 25, 8, 24, 28, 4, 15)
edu <- c(6,3,8,8,4,4,1,3,12,9,11,4)
income <- c(34.7, 17.9, 22.7, 63.1, 33.0, 41.4, 20.7, 14.6, 97.3, 72.1, 49.1, 52.0)
studay.df <- data.frame(cbind(work, edu, income))
## linear model
model_3 <- lm(income ~ ., data = studay.df) # OLS
summary_table <- data.frame(summary(model_3)$coefficients)
colnames(summary_table) <- c("coef", "std.error", "t_value", "p_value")
summary_table$regressor <- row.names(summary_table)
s...
f*******m
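Posts: 94
10
From thread: Statistics board - A simple SAS question!
My problem is this: I need to add a column to data in long form. In R it is done like this:
xx <- rep(1:20, each=20)
yy <- rep(xx, each=100)
Then cbind is used to merge yy into the original data as a column. How can I do the same thing in SAS?
The question is very simple and I am a bit embarrassed to ask, but I really do not know how to do it. Many thanks to anyone passing by
who can help!
I hope everyone who is job hunting finds a job, and everyone who already has one finds the work going smoothly. Thanks!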
w*****1
Posts: 473
11
I set the width and height of the figure before plotting.
Here is my code:
mhtdata=read.table('mht-bsbp.txt',head=T)
> png("bsbp.png",width=4,height=3)
> data <- with(mhtdata,cbind(chr,pos,PVALUE))
> par(las=2, xpd=TRUE, cex.axis=1.4, cex=1.2)
> color <- rep(c("black","red"),11)
> ops <- mht.control(colors=color,yline=1.5,xline=3,srt=0)
> mhtplot(data,ops,pch=19)
Loading required package: grid
This results in the following message:
Error in plot.new() : figure margins too large
How should I set the parameters? Thanks.
s***y
Posts: 1130
12
png("bsbp.png",width=4,height=3,units="in")
The default unit is pixels.
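The margin error arises because `width=4, height=3` is read as 4 by 3 pixels. A hedged sketch of the inch-based call follows. To my knowledge, `png()` also expects a `res` value whenever `units` is not `"px"`, so one is supplied here. The 300 ppi value, the margins, the output path, and the placeholder plot are assumptions of this example.

```r
out_file <- file.path(tempdir(), "bsbp_demo.png")

# With units = "in", a resolution is supplied as well;
# 300 ppi is an arbitrary choice for this sketch.
png(out_file, width = 4, height = 3, units = "in", res = 300)
par(mar = c(4, 4, 1, 1))        # modest margins so the plot region fits
plot(1:10, pch = 19)            # placeholder plot standing in for mhtplot()
dev.off()

file.exists(out_file)
```

Alternatively, keeping the default pixel units and passing something like `width = 1200, height = 900` avoids the `res` question altogether.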

v*******e
Posts: 133
13
From thread: Statistics board - Looking for simpler R code
The code below works, but I think it is still too complicated.
Product=c("A","A","A","B","B","C")
Color=c("red","yellow","black","yellow","white","black")
df1=data.frame(Product,Color)
b=aggregate(Color~Product, data = df1, FUN=paste, collapse = " ")
c <- strsplit((b$Color), " ")
maxLen <- max(sapply(c, length))
d<- as.data.frame(t(sapply(c, function(x) c(x, rep(" ", maxLen - length(x))))))
colnames(d) <- paste("Color", 1:maxLen, sep="")
# use b$Product (one entry per product); df1[,-c(2)] still has one entry per original row
df2=cbind(Product=b$Product, d)
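A possibly simpler route to the same wide layout, sketched with `split()` instead of `aggregate()` plus `strsplit()`. This is one option among several, and it pads with empty strings rather than single spaces.

```r
df1 <- data.frame(
  Product = c("A", "A", "A", "B", "B", "C"),
  Color   = c("red", "yellow", "black", "yellow", "white", "black"),
  stringsAsFactors = FALSE
)

# One character vector of colors per product.
colors_by_product <- split(df1$Color, df1$Product)
max_len <- max(lengths(colors_by_product))

# Pad each vector with "" up to max_len, then stack: one row per product.
padded <- lapply(colors_by_product,
                 function(x) c(x, rep("", max_len - length(x))))
wide <- do.call(rbind, padded)

df2 <- data.frame(Product = rownames(wide), wide,
                  row.names = NULL, stringsAsFactors = FALSE)
colnames(df2)[-1] <- paste0("Color", seq_len(max_len))
df2
```

Splitting the column directly avoids pasting the colors together and splitting them apart again, which also keeps colors that contain spaces intact.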
m*****n
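Posts: 3575
14
From thread: Statistics board - A beginner's confusion about learning R.
R is an annoying language: its syntax is irregular, and a lot of it comes down to experience.
There is no such thing as having learned it completely. Mastering all of R in a Nutshell counts as getting started; mastering all of R in Action counts as
intermediate. Even that is not enough: many things only count as learned once you run into them at work and find out they are pitfalls.
For example, avoiding c, cbind, and rbind inside loops is a painful pitfall; which R textbook has ever covered it?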

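Posts: 1
15
From thread: Statistics board - A beginner's confusion about learning R.
Using loops in R is itself a pitfall; experienced users avoid for loops whenever they can. As for c, cbind, and
rbind, avoid them where possible too: unless there is truly no alternative, declare objects in advance whenever you can.
Much of R's hard-won experience is buried in forums, and there really is no single book that collects it. You need to find someone experienced to look at
your code and give you some pointers.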
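To make the advice about declaring objects in advance concrete, here is a small sketch contrasting three ways of building the same matrix. The sizes and the computed values are arbitrary.

```r
n_iter <- 200

# Growing: each cbind() copies everything accumulated so far.
grown <- NULL
for (k in seq_len(n_iter)) {
  grown <- cbind(grown, k * (1:5))
}

# Preallocating: the matrix is created once and filled column by column.
prealloc <- matrix(NA_real_, nrow = 5, ncol = n_iter)
for (k in seq_len(n_iter)) {
  prealloc[, k] <- k * (1:5)
}

# Collect-then-bind: store the pieces in a list and bind once at the end.
pieces <- lapply(seq_len(n_iter), function(k) k * (1:5))
bound  <- do.call(cbind, pieces)

# All three approaches produce the same values.
all(grown == prealloc) && all(bound == prealloc)
```

The growing version does work proportional to the square of the number of iterations, which is why it becomes painful on large loops even though it looks harmless on small ones.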
c***z
Posts: 6348
16
From thread: DataSciences board - generating percentile-percentage charts
The boss has come up with something new again; this time he wants cumulative percentages.
patient_percentiles_cum <- patient_percentiles_fin[, c(1,102)]
colnames(patient_percentiles_cum)[2] <- "top.0"
for (k in 1:100) {
# k <- 1

temp <- patient_percentiles_fin[, c(102:(102-k))]

top <- apply(temp,
1,
FUN = sum)
top <- data.frame(top)

patient_percentiles_cum <- cbind(patient_percentiles_cum,
top)

colnames(patient_percentiles_cum)[2+k] <- paste("top",
                                                  ...
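The loop above sums the last k + 1 columns for each k and grows the result with `cbind()`. A row-wise `cumsum()` over the reversed columns yields the same running totals in one step. This sketch uses a tiny hypothetical table (an id column plus four value columns) in place of `patient_percentiles_fin`.

```r
# Hypothetical stand-in: an id column plus four percentile-bucket columns.
fin <- data.frame(id = 1:3,
                  p1 = c(1, 2, 3), p2 = c(4, 5, 6),
                  p3 = c(7, 8, 9), p4 = c(10, 11, 12))

# Value columns only, last column first.
vals <- as.matrix(fin[, ncol(fin):2])

# Row-wise running totals: column k + 1 holds the sum of the last k + 1 columns.
cum <- t(apply(vals, 1, cumsum))
colnames(cum) <- paste0("top.", seq_len(ncol(cum)) - 1)

# A single cbind() at the end instead of one per iteration.
fin_cum <- cbind(fin["id"], cum)
fin_cum
```

Here `top.0` equals the last column on its own and the final `top.` column equals the row total, matching the naming used in the loop.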
k*******a
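Posts: 772
17
If the vectors in the list all have the same length, the list can be converted to a data frame.
For example:
cbind(df$itemsetID, as.data.frame(applist), as.data.frame(scorelist))
Then adjust the column order as needed,
and after that you can merge.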
n*****3
Posts: 1584
18
If the lengths are the same, all N:
use as.data.frame to convert them to data frames,
then cbind,
then write.csv.
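The three steps can be sketched as follows. The names `df`, `itemsetID`, `applist`, and `scorelist` come from the earlier reply; their contents and the output path are hypothetical.

```r
# Hypothetical inputs: a data frame with an ID column and two lists whose
# vectors all have length nrow(df).
df        <- data.frame(itemsetID = c("s1", "s2", "s3"),
                        stringsAsFactors = FALSE)
applist   <- list(app1 = c(1, 0, 1), app2 = c(0, 0, 1))
scorelist <- list(score1 = c(0.2, 0.5, 0.9), score2 = c(0.1, 0.4, 0.8))

# Each list becomes a data frame with one column per list element;
# cbind() then places the pieces side by side.
out <- cbind(itemsetID = df$itemsetID,
             as.data.frame(applist),
             as.data.frame(scorelist))

out_file <- file.path(tempdir(), "itemsets.csv")
write.csv(out, out_file, row.names = FALSE)
out
```

Naming the first argument (`itemsetID = ...`) keeps a readable column header in the CSV instead of an auto-generated one.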
t*****e
Posts: 364
19
Sorry, I meant using the package tree; rpart seems to be only for regression
trees.
Below is some sample code. It was written in a hurry, so it may well have bugs, but you can get a rough idea of the logic.
require(tree)
A_train = matrix(rnorm(20000),nc=20)
Label = rnorm(nrow(A_train))
df_train = data.frame(A_train,Label)
A_test = matrix(rnorm(20000),nc=20)
Label_test = rnorm(nrow(A_test))
df_test = data.frame(A_test,Label=Label_test)
Prob_all = NULL
for (k in 1:100) {
index = sample(1:length(Label),length(Label),replace = T)
indF = sample(NonSelected,floor(length(NonSele...
k*******a
Posts: 772
21
From thread: DataSciences board - An R question
m <- matrix(0, nrow = max(i), ncol = max(j))
m[cbind(i, j)] <- count
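A small worked instance of the two lines above, with hypothetical values for `i`, `j`, and `count`. Indexing a matrix with a two-column matrix treats each row of the index as one (row, column) pair.

```r
# Hypothetical triplets: count[k] belongs in row i[k], column j[k].
i     <- c(1, 3, 2, 3)
j     <- c(2, 1, 2, 3)
count <- c(5, 7, 9, 4)

m <- matrix(0, nrow = max(i), ncol = max(j))
m[cbind(i, j)] <- count   # each row of cbind(i, j) addresses one cell
m
```

By contrast, `m[i, j] <- count` would address the whole block of rows `i` crossed with columns `j`, which is not what is wanted here.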
s****h
Posts: 3979
22
From thread: DataSciences board - An R question
Thanks, everyone, for the replies.
Kudos for m[cbind(i, j)] <- count
Why didn't I think of that?