Roundup of discussions about cbind - 话题女王

d*********k
Posts: 1239

From thread: Statistics board - What to do when vectors have different lengths in rbind or cbind? Help needed

For example:
A<-c(1,2,3,4,5)
B<-c(1,2,3)
If I call cbind directly:
> cbind(A,B)
     A B
[1,] 1 1
[2,] 2 2
[3,] 3 3
[4,] 4 1
[5,] 5 2
How can I get the following form instead?
     A B
[1,] 1 1
[2,] 2 2
[3,] 3 3
[4,] 4
[5,] 5
Thanks a lot~~

u*****3
Posts: 796

From thread: Statistics board - What to do when vectors have different lengths in rbind or cbind? Help needed

cbind(A,c(B,' ',' '))

t*******i
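Posts: 742

From thread: Statistics board - What to do when vectors have different lengths in rbind or cbind? Help needed

There is a command called rbind.fill or cbind.fill.
It comes from a package someone wrote.
Search for it.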

B**W
Posts: 2273

From thread: Statistics board - What to do when vectors have different lengths in rbind or cbind? Help needed

B <- c(B, rep(NA,length(A)-length(B)))
cbind(A,B)
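
The padding idea above can be wrapped up for any number of vectors. The following is a minimal sketch; the helper name pad_cbind is made up for illustration and is not part of base R or of any package mentioned in this thread.

```r
# pad_cbind is a hypothetical helper: it pads every vector with NA
# up to the length of the longest one, then binds them as columns.
pad_cbind <- function(...) {
  vecs <- list(...)
  n <- max(lengths(vecs))
  padded <- lapply(vecs, function(v) {
    length(v) <- n   # extending a vector's length fills the tail with NA
    v
  })
  do.call(cbind, padded)
}

A <- c(1, 2, 3, 4, 5)
B <- c(1, 2, 3)
m <- pad_cbind(A = A, B = B)
print(m)
```

Padding with NA keeps the matrix numeric, whereas padding with ' ' would coerce every column to character.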

k*******a
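Posts: 772

From thread: Statistics board - Generate and Retrieve Many Objects with Sequential Names

For example, the original poster's case can be written directly as
library(MASS)  # provides the Boston data set
plist <- lapply(1:10, function(x) {
  set.seed(x)
  smp <- Boston[sample(1:nrow(Boston), nrow(Boston), replace = TRUE), ]
  fit <- glm(medv ~ ., data = smp)  # named fit so it does not shadow glm()
  predict(fit, Boston)
})
cbind expects a series of vectors as its arguments. When it is applied to plist,
which is a list, the whole list is treated as a single vector, so you end up with
a one-column matrix of mode list. The function takes a variable number of
arguments, which is why its signature is cbind(...).
do.call("cbind", plist) passes each element of plist as a separate argument, so it
is equivalent to cbind(plist[[1]], plist[[2]],...)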
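
The difference between the two calls is easy to see on a tiny list. This is a small sketch with made-up data (two integer vectors), not the bootstrap predictions from the thread.

```r
plist <- list(1:3, 4:6)

# The whole list is one argument: one column, one row per list element.
m1 <- cbind(plist)

# Each list element becomes its own argument: one column per element.
m2 <- do.call("cbind", plist)

dim(m1)
dim(m2)
```

Here m1 is a 2 x 1 matrix whose cells are list elements, while m2 is an ordinary 3 x 2 integer matrix, the same as cbind(plist[[1]], plist[[2]]).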


t*****w
Posts: 254

From thread: Statistics board - How should I prepare for R in job interviews?

When I had my job interviews, they always tested my SAS skills. However, I use R
all the time. To help your preparation, read my R code and see how much of it
you can understand.
%in%
?keyword
a<-matrix(0,nrow=3,ncol=3,byrow=T)
a1 <- a1/(t(a1)%*%spooled%*%a1)^.5 #standardization in discrim
a1<- a>=2; a[a1]
abline(h = -1:5, v = -2:3, col = "lightgray", lty=3)
abline(h=0, v=0, col = "gray60")
abs(r2[i])>r0
aggregate(iris[,1:4], list(iris$Species), mean)
AND: &; OR: |; NOT: !
anova(lm(data1[,3]~data1[,1...

p****r
Posts: 46

From thread: DataSciences board - Lots of R experts here; offering baozi for a way to export R data to CSV

# create matrix from applist, then transpose it
# so the matrix is N rows * 10 columns
app <- t(data.frame(applist))
# Same for scorelist
score<- t(data.frame(scorelist))
# generate column sequence (1,11,2,12...10,20) so as to reorder them after cbind
cols <- rep(1:10,each=2)+rep(c(0,10),10)
# or you can do cols <- unlist(sapply(1:10,function(x) list(x,x+10)))
data <- cbind(app,score)
# reorder columns
data <- data[,cols]
# generate col_names: "applist1", "scorelist1", "applist2","scorelist2"......

R****n
Posts: 708

From thread: Biology board - TCGA microRNA expression levels

Here is my simple R code. Read up on basic R and you should be able to follow it
in a few weeks.
setwd("C:/Users/meng09/Downloads/BRCA/BCGSC__IlluminaHiSeq_miRNASeq/Level_3")
lst<-list.files()
exprtable<-c()
Nam<-c()
for (f in lst) {
temp<-read.delim(f)
Nam<-c(Nam,as.character(temp[1,"barcode"]))
exprtable<-cbind(exprtable,temp[,"reads_per_million_miRNA_mapped"])
}
exprtable[1:5,1:5]
exprs<-as.data.frame(exprtable)
samp<-Nam
probe<-as.character(temp[,"miRNA_ID"])
names(exprs)<-samp
rownames(exp...

e********o
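Posts: 12

From thread: Statistics board - problem in R again ... sorry

Sorry, there are still a few things I don't quite understand.
1. Can I use cbind(L0i,T0i) at the end to assemble the matrix you described?
2. if (tmp1 [rest of this line is missing] Thanks again,
I'm very grateful.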
L0i <- numeric(0)
T0i <- numeric(0)
for (i in 1 : 2) {
tmp1 <- rexp(1, rate = exp(1))
tmp2 <- rexp(1, rate = exp(1))
if (tmp1 < tmp2) {
L0i <- tmp1
T0i <- tmp2
}
i <- i+1
}
cbind(L0i,T0i)

a********a
Posts: 346

From thread: Statistics board - R memory urgent help

Thanks again, goldmember. Here is what I did:
I used cbind to combine the matrices, i.e. I have a loop for i in 1 to 15.
In the first iteration I get estimates in 1 row * 8 col, in the second
iteration 2 row * 8 col, in the third 3 * 8, ..., and in the 15th
iteration 15 * 8 estimates. Then I cbind these estimates into a
matrix with (1+2+...+15) row * 8 col.
Actually I want to repeat this process 200 times, but it stopped before
it could finish even once.

a**h
Posts: 19

From thread: Statistics board - Another R question

lines(smooth.spline(!is.na(cbind(x,y)),df=4),col='red',lwd=3)
The data (x, y) has missing values and I want to delete them. I recall the way is !is.na(cbind(x,y)),
but error shows: need at least four unique 'x' values
Thanks a lot for the help!
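
One likely cause of that error: !is.na(cbind(x,y)) returns a logical matrix of TRUE/FALSE values, not the data with missing rows dropped, so smooth.spline sees at most two distinct x values. A minimal sketch of one way around it, using complete.cases to build a row filter (the data below are simulated for illustration):

```r
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- sin(x) + rnorm(50, sd = 0.2)
y[c(5, 20)] <- NA   # simulated missing responses
x[30] <- NA         # simulated missing predictor

# TRUE only for pairs where both x and y are observed
ok <- complete.cases(x, y)

fit <- smooth.spline(x[ok], y[ok], df = 4)
```

After an existing plot(x, y), the fitted curve could then be drawn with lines(fit, col = 'red', lwd = 3).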

S******y
Posts: 1123

From thread: Statistics board - How to paralell logistic regression estimation?

I have finally got Hadoop working on my Linux box. Next I would like to
see if I could parallelize model estimation for some commonly used models
such as logistic regression.
My question now is: how do I parallelize gradient descent for logistic model
estimation on a really large data set?
Any thoughts would be greatly appreciated. Thanks in advance!
PS. See R code below. If needed, I could rewrite the following code in Java
or Python. But the question is how to decompose the following estimatio...

t******e
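Posts: 16

From thread: Statistics board - Generate and Retrieve Many Objects with Sequential Names

Each iteration of the loop resamples the data, fits a model, and predicts. How do I fit all of that into lapply?
Please demonstrate.
That said, pcols can be placed inside the loop to store the result at the end of each step, which avoids the
technical problem of generating p1, p2.
Could an expert explain why do.call('cbind',plist) and a direct cbind(plist) give different
results? The help file is very vague about it.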

d******e
Posts: 7844

From thread: Statistics board - A question to discuss: classification labels are very imbalanced

This shows you haven't understood where the problem lies.
> n = 100000
> X = matrix(runif(n*2),n,2)
> y0 = sign((X[,1]<0.1)-0.5)
> y = (y0*sign(runif(n)-0.1)+1)/2
> sum(y==1)
[1] 17998
> sum(y==0)
[1] 82002
> out = glm(y~X,family="binomial")
> yhat=sign(cbind(X,rep(1,n))%*%out$coefficients>0)
> sum((yhat==1)*(y==1))
[1] 2
> sum(yhat==y)
[1] 82003
> idx1 = which(y==1)
> idx0 = which(y==0)[1:length(idx1)]
> out = glm(y[c(idx0,idx1)]~X[c(idx0,idx1),],family="binomial")
> yhat=sign(cbind(X,rep(1,n))%*%out$coefficients>0)
> sum((yhat==1)*(y==1...

c***z
Posts: 6348

From thread: Statistics board - generating percentile-percentage charts (repost)

[The following text is reposted from the DataSciences board]
From: chaoz (晨钟暮鼓), Board: DataSciences
Subject: generating percentile-percentage charts
Posted at: BBS 未名空间站 (Mon Nov 24 20:11:11 2014, US Eastern)
Spent some time generating this kind of chart from raw data. There might be
better ways of doing so, but I will just post my method to get the ball rolling.
Raw table has three columns: clinic | age | count, which records the age of
patients, or rather, how many patients fall in each age category.
Target table has three columns: clinic | age_percentile | count_percentage...

c***z
Posts: 6348

From thread: DataSciences board - generating percentile-percentage charts

Spent some time generating this kind of chart from raw data. There might be
better ways of doing so, but I will just post my method to get the ball rolling.
Raw table has three columns: clinic | age | count, which records the age of
patients, or rather, how many patients fall in each age category.
Target table has three columns: clinic | age_percentile | count_percentage,
which records the percentage of patients in each age category, with the
categories in percentile form (e.g. if there are only two age categories,
then the...

m********r
Posts: 13

From thread: Programming board - R plot hel...

matplot(B,cbind(A,C),type="l")

m*****n
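Posts: 3575

From thread: Programming board - R, small laptop: how to tune parameters?

The key issue is that you must not use these three functions inside a loop:
c()
cbind()
rbind()
Whenever you use them, it becomes incredibly slow.
The fix is to create an all-zero matrix at the start and then fill in the values.
It is more than a hundred times faster.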
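
A minimal sketch of the two patterns side by side, assuming each iteration produces one row of two numbers (the row contents here are arbitrary). Wrapping either loop in system.time() is one way to compare them on your own machine; the actual speed-up depends on the problem size.

```r
n <- 1000

# Pattern 1: grow the result with rbind.
# Every call copies everything accumulated so far.
grown <- NULL
for (i in 1:n) {
  grown <- rbind(grown, c(i, i^2))
}

# Pattern 2: allocate the full matrix once, then fill it row by row.
filled <- matrix(0, nrow = n, ncol = 2)
for (i in 1:n) {
  filled[i, ] <- c(i, i^2)
}

# Both approaches give the same matrix.
stopifnot(isTRUE(all.equal(grown, filled)))
```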

l*******l
Posts: 204

From thread: Statistics board - problem in R again ... sorry

L0i <- c()
T0i <- c()
for (i in 1:100) {
  tmp1 <- rexp(1, rate = exp(1))
  tmp2 <- rexp(1, rate = exp(1))
  # keep the pair only when tmp1 is the smaller draw
  if (tmp1 < tmp2) {
    L0i <- c(L0i, tmp1)
    T0i <- c(T0i, tmp2)
  }
  # no manual i <- i+1 needed: the for loop advances i itself
}
cbind(L0i, T0i)

e********o
Posts: 12

From thread: Statistics board - problem in R again ... sorry

Dear all,
I rewrote my code as below;
I hope this one is correct ><
Thanks for any thoughts/comments.
Li <- c()
Ti <- c()
i<-1
while(i <= 4 ) {
tmp1<-rexp(2, rate = exp(1))
Li <- c(Li,min(tmp1))
Ti <- c(Ti,max(tmp1))
i<-i+1
}
cbind(Li,Ti)

q**j
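Posts: 10612

From thread: Statistics board - An update to Taste of R, plus two more R questions.

I think a perfect solution has to keep the two keys separate. If you do it your way, key1 = 'a b' with
key2 = 'c' gets mixed up with key1 = 'a' and key2 = 'b c'. Pasting with some other separator character
also has its shortcomings. I'd like to see how the merge function is written; the answer should be in there.
p.s. I just took a look. It is very complicated and hard to follow. If you figure it out, please explain it to everyone. It seems to use cbind.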
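
The collision described above, and one way to avoid it by passing both key columns to merge, can be sketched as follows. The data frames df1 and df2 and their columns are invented for illustration.

```r
key1 <- c("a b", "a")
key2 <- c("c", "b c")

# Pasting the keys with a space makes two different pairs look identical.
pasted <- paste(key1, key2)
pasted

# Keeping the keys as separate columns avoids the ambiguity.
df1 <- data.frame(key1 = key1, key2 = key2, x = 1:2)
df2 <- data.frame(key1 = key1, key2 = key2, y = c(10, 20))
merged <- merge(df1, df2, by = c("key1", "key2"))
merged
```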

a********a
Posts: 346

From thread: Statistics board - help in R

I have the following program:
beta1=2
beta2=8
#for (i in 1:2){
obs=2
x1=matrix(NA,obs,1)
x2=matrix(NA,obs,1)
for (g in 1:obs){
x1[g,]=runif(1,1,2)
x2[g,]=rnorm(1,1)
}
data=cbind(x1,x2)
data
#d=rbind(data[i])
#}
I ran the program twice and got data like the following each time:
> data
         [,1]      [,2]
[1,] 1.452696 0.2718456
[2,] 1.514587 1.2971293
> data
         [,1]       [,2]
[1,] 1.172726 -0.9674824
[2,] 1.159079  0.5483838
how can I combine these data by row, i.e. I want to get d

s*****n
Posts: 2174

From thread: Statistics board - help in R

beta1=2
beta2=8
result <- NULL
for (i in 1:2){
obs=2
x1=matrix(NA,obs,1)
x2=matrix(NA,obs,1)
for (g in 1:obs){
x1[g,]=runif(1,1,2)
x2[g,]=rnorm(1,1)
}
data=cbind(x1,x2)
result <- rbind(result, data)
}
result

S******y
Posts: 1123

From thread: Statistics board - How to do Naive Bayes in R?

I am wondering if anybody here has a simple example in R for Naive
Bayes.
For example, I can do k-means clustering on the "iris" data -
data(iris)
cl <- kmeans(iris[,1:4], 3)
cl$cluster
cbind(1:150,iris$Species)
===========
But how to do Naive Bayes classification in the same "iris" data?
Many thanks!
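
One possible answer, as a minimal sketch in base R: a Gaussian naive Bayes classifier written out by hand for the iris data. It assumes each feature is normally distributed within each class and evaluates on the training data only. The e1071 package offers a packaged naiveBayes() function as an alternative; the code below does not use it.

```r
data(iris)
X <- as.matrix(iris[, 1:4])
y <- iris$Species
classes <- levels(y)

# Class priors, and per-class mean and sd of every feature (4 x 3 matrices)
prior <- setNames(as.numeric(table(y)) / length(y), classes)
mu  <- sapply(classes, function(k) colMeans(X[y == k, , drop = FALSE]))
sdv <- sapply(classes, function(k) apply(X[y == k, , drop = FALSE], 2, sd))

# Log posterior (up to a constant): log prior + sum of per-feature log densities
log_post <- sapply(classes, function(k) {
  dens <- dnorm(X,
                mean = rep(mu[, k], each = nrow(X)),
                sd   = rep(sdv[, k], each = nrow(X)),
                log  = TRUE)
  log(prior[[k]]) + rowSums(matrix(dens, nrow = nrow(X)))
})

# Predict the class with the largest log posterior
pred <- factor(classes[max.col(log_post)], levels = classes)
accuracy <- mean(pred == y)
table(pred, y)
```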

g********r
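Posts: 8017

From thread: Statistics board - R memory urgent help

I misspoke earlier. It is memory.limit().
cbind is very inefficient: your matrix gets copied every time. In your case you should define the matrix that
holds the results first, and then simply assign into it row by row.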
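
A sketch of that advice for the situation described in this thread (15 iterations, iteration i producing i rows of 8 columns). The est matrix is a stand-in filled with the iteration number, since the real estimation code is not shown.

```r
n_iter <- 15
n_col  <- 8

# Allocate all (1 + 2 + ... + 15) rows once.
result <- matrix(NA_real_, nrow = sum(seq_len(n_iter)), ncol = n_col)

row_end <- 0
for (i in seq_len(n_iter)) {
  # Hypothetical stand-in for the i x 8 block of estimates
  est <- matrix(i, nrow = i, ncol = n_col)

  rows <- (row_end + 1):(row_end + i)
  result[rows, ] <- est      # fill the block in place, no rbind
  row_end <- row_end + i
}

dim(result)
```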


g********r
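Posts: 8017

From thread: Statistics board - R memory urgent help

Also, what you need is rbind. If you used cbind by mistake, parts of the matrix may get copied and glued
together multiple times. Then the memory use really would be large.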

a********a
Posts: 346

From thread: Statistics board - R memory urgent help

Thanks again, goldmember.
Sorry, I used rbind, not cbind, to combine all the matrices.
Here is the situation: inside each sample, I have a loop for i in 1 to 15.
In each iteration of the loop, I get estimates like the following (here I
only list 4 columns instead of 8; they are fake numbers):
sample category beta beta sigma sigma
1      1        2.0  1.2  2.0   0.8
1      2        1.0  0.8  1.2   0.1
1      2        1.5  0.7  1.3   0.8
1      3        2.0  0.4  1.5   0.2
1

g*******r
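Posts: 270

From thread: Statistics board - Question: why do the reciprocal and solve differ? (about R)

x<-cbind(c(1, 1, 1),c( 4, 2, -1))
To compute (x'x)^(-1), I used two methods, a and b:
a, (t(x)%*%x)^(-1)
b, solve(t(x)%*%x)
But I got two different results. Why is that?
Thanks!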
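
The reason is that ^ in R works element by element, so method a takes the reciprocal of every entry, while solve() computes the matrix inverse. A small sketch using the same x:

```r
x <- cbind(c(1, 1, 1), c(4, 2, -1))
xtx <- t(x) %*% x            # 2 x 2 matrix: rows (3, 5) and (5, 21)

elementwise <- xtx^(-1)      # same as 1 / xtx, entry by entry
inverse     <- solve(xtx)    # true matrix inverse

# Only the true inverse multiplies back to the identity matrix.
inverse %*% xtx
elementwise %*% xtx
```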

o******6
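Posts: 538

From thread: Statistics board - [Collection] Help: an R newbie asks a rather silly question

☆─────────────────────────────────────☆
sweetandlow (Pepper) wrote on (Wed Mar 18 23:30:41 2009):
I've just started using R and don't understand much yet, so I can only copy examples by rote. Please don't laugh at me. Could you take a look at this? It's
rather urgent.
Today I ran into a problem. I think all the answers should be obtainable in one go, but I really don't know how to do it,
so I have to change one number each time and print X again, which is really tedious. Please teach me how to do it.
...
fm <- glm(cbind(Mim, Total-Mim) ~ Age+ I(Age^2)+I(Age^3), mim, family=binomial)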
tfct <- function(x) predict(fm, newdata=data.frame(Age=x)) - zxx
zxx <- log(0.1/(1-0.1))
uniroot(tfct, range(mim$Age))$root ->X1
X1
zxx <- log(0.2/(1-0.2))
uniroot(tfct, range(mim$Age))
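
One way to get all the answers in a single pass is to wrap the root search in a function of the target probability and apply it over a vector of probabilities. This is a sketch only: the mim data set from the post is not available, so the data below are simulated, and find_age is a made-up helper name.

```r
# Simulated stand-in for the poster's mim data (assumption, not the real data)
Age   <- 9:18
Total <- rep(100, length(Age))
Mim   <- round(Total * plogis(0.8 * (Age - 13.5)))
mim   <- data.frame(Age, Total, Mim)

fm <- glm(cbind(Mim, Total - Mim) ~ Age + I(Age^2) + I(Age^3),
          data = mim, family = binomial)

# Age at which the fitted probability equals p (hypothetical helper)
find_age <- function(p) {
  target <- log(p / (1 - p))   # logit of the target probability
  f <- function(x) predict(fm, newdata = data.frame(Age = x)) - target
  uniroot(f, range(mim$Age))$root
}

probs <- c(0.1, 0.2, 0.5)
ages  <- sapply(probs, find_age)
cbind(probs, ages)
```

uniroot needs the function to change sign over the search interval, so target probabilities outside the range of fitted values would raise an error.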

f******9
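Posts: 267

From thread: Statistics board - A question about R

I have two tables. How can I combine them? I used cbind, but that simply puts the second table
after the first one. I want to combine them column by column, i.e. the first column of the second
table goes right after the first column of the first table, the second column of the second table
goes right after the second column of the first table, and so on.
How can I do this in R? Thanks!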

D******n
Posts: 2836

From thread: Statistics board - A question about R

k=cbind(a,b)
k=k[,order(names(k))]

s*******a
Posts: 705

From thread: Statistics board - A question about R

cbind(A,B[,names(A)])[,rep(1:ncol(A),each=2)+rep(c(0,ncol(A)),ncol(A))]

i********f
Posts: 206

From thread: Statistics board - A question about R

You can try this.
Suppose your data.frames are tmp1 and tmp2
tmp <- cbind(tmp1,tmp2)
tmp[,2*(1:dim(tmp1)[2])-1] <- tmp1
tmp[,2*(1:dim(tmp1)[2])] <- tmp2
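
Another way to express the interleaving is to build the column order explicitly and index once. A minimal sketch with two small invented data frames that share the same headers:

```r
a <- data.frame(x = 1:2, y = 3:4)
b <- data.frame(x = 5:6, y = 7:8)

k <- cbind(a, b)

# rbind stacks the two index vectors; as.vector reads them column by column,
# which alternates them: 1, 3, 2, 4
idx <- as.vector(rbind(seq_len(ncol(a)), ncol(a) + seq_len(ncol(b))))

interleaved <- k[, idx]
interleaved
```

Because both tables use the same headers, the result has repeated column names, which data frame indexing may rename to keep them unique.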

D******n
Posts: 2836

From thread: Statistics board - [R] Keeping certain values in a matrix

r=which(A>5,arr.ind=T);
result<-cbind(dimnames(A)[[1]][r[,1]],dimnames(A)[[2]][r[,2]])

a***r
Posts: 420

From thread: Statistics board - How to locate the position of a variable with R

Just to get the ball rolling:
#original matrix: A
#new matrix: X
#number of obs:n
#number of var:nvar
> A
  var1 var2 var3 var4 var5
1    1    2    3    4    5
2    6    7    8    9   10
3   11   12   13   14   15
4   16   17   18   19   20
> n=4
> nvar=5
> value <-as.vector(t(A))
> varname <-rep(colnames(A),n)
> ID <-sort(rep((1:n),nvar))
> X <-data.frame(cbind(ID,varname,value))
  ID  var value
1  1 var1     1
2  1 var2     2
3  1 var3     3
4  1 var4     4
5  1 var5     5
6  2 var1     6
7  2 var2     7
8  2 var3

p********a
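Posts: 5352

From thread: Statistics board - [Collection] A question about R

☆─────────────────────────────────────☆
fang0219 (miracle) wrote on h:
I have two tables. How can I combine them? I used cbind, but that simply puts the second table
after the first one. I want to combine them column by column, i.e. the first column of the second
table goes right after the first column of the first table, the second column of the second table
goes right after the second column of the first table, and so on.
How can I do this in R? Thanks!
☆─────────────────────────────────────☆
fang0219 (miracle) wrote on (Thu Mar 18 21:55:21 2010, US Eastern):
By the way, the corresponding columns of these two tables have the same headers. thx!
☆─────────────────────────────────────☆
oloolo (似人非兽) wrote on (Thu Mar 18 21:56:14 2010, US Eastern):
rb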

D******n
Posts: 2836

From thread: Statistics board - R:matrix

cbind(which(X>0.5,arr.ind=T),X[X>0.5])

S******y
Posts: 1123

From thread: Statistics board - How large a matrix can existing software invert?

#try this Moore-Penrose pseudoinverse function in R
#I wrote it based on Torsten Hothorn's matlab code
> mpinv
function (X)
{
Eps <- 100 * .Machine$double.eps
s <- svd(X)
d <- s$d
m <- length(d)
if (!(is.vector(d)))
return(t(s$v %*% (1/d) %*% t(s$u)))
d <- d[d > Eps]
notnull <- length(d)
if (notnull == 1) {
inv <- 1/d
}
else {
inv <- solve(diag(d))
}
if (notnull != m) {
inv <- cbind(inv, matrix(0, nrow = notnull, nc

f***a
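Posts: 329

From thread: Statistics board - Asking the experts: how to replace columns of one table with columns of another?

Save the original data set containing missing values as "dat_missing.txt", and save the completion tables as
"cp_a.txt", "cp_b.txt", "cp_c.txt" (don't forget that the first line of each file is the column names). Then run
the R code below and you are done. The output goes to "out.txt".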
dat.m <- read.table("dat_missing.txt",header=T)
out <- matrix(0,nrow(dat.m),ncol(dat.m)-1)
for(lt in letters[1:3])
{
indx <- dat.m[,1]==lt
r <- dat.m[indx,-1]
cp <- read.table(paste("cp_",lt,".txt",sep=""),header=T)[,-1]
dat <- cbind(r,cp)
n <- ncol(r); na <- ncol(cp);
res <- matrix(as.numeric(apply(dat,1,function(t)
{
rr <- t[1:n];

s*r
Posts: 2757

From thread: Statistics board - What is wrong with this matrix derivation

No, that's not right; test it yourself
x1 <- c(1,1,0,0,0,0)
x2 <- c(0,0,1,1,0,0)
x3 <- c(0,0,0,0,1,1)
y <- c(rnorm(n=2, mean=0), rnorm(n=2,mean=2), rnorm(n=2, mean=4))
x <- cbind(rep(1, times=6), x1,x2,x3)
xxt <- x %*% t(x)
xtx <- t(x) %*% x
e.xtx <- eigen(xtx)
e.xxt <- eigen(xxt)
V <- e.xtx$'vectors'; round(V, digits=2)
U <- e.xxt$'vectors'; round(U, digits=2)
s.v <-e.xtx$'values' # eigenvalues of x'x (squared singular values of x)
# all these matrix have rank = 3;
d <- sqrt(diag(s.v[c(1,2,3)]))
U3 <- U[,c(1:3)]
V3 <- V[,c(1:3)]
round(U3 %*% t(U3), ...

w******a
Posts: 25

From thread: Statistics board - imputation question?thanks

Here is an R example to impute one missing value in each record. Half of the code builds the sample data, so you probably only need the second half, but including it here helps you understand what is going on:
The data will look like
col1 col2
x
x x
x
x x
x x
...
library(Rlab)
alp = 1
Prob_R1 = 0.5
Prob_R0 = 1 - Prob_R1
len_Y1 = 200
K_delta = 2
Y1 = rnorm(len_Y1,mean=0,sd=1)
R1 = rbinom(n=len_Y1, size=1, prob=Prob_R1)
Y2 = rnorm(n=len_Y1,...

w******a
Posts: 25

From thread: Statistics board - imputation question?thanks

Here is an R example to impute one or two missing data in each record:
The data will look like
col1 col2 col3
x
x x x
x x
x x
x x x
x
x x x
...
library(Rlab)
alp = 1
K_delta = 2
len_Y1 = 200
#Sample setting:
#Measurement   N_patient   Percent
#  1           12          0.18
#  1 2         4           0.05...

R******o
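Posts: 83

From thread: Statistics board - A question about R data types

You can use table to build an n-way table and then use rbind, cbind, and the like to assemble the table you want.
After some polishing, save it as a tex file and LaTeX will produce a very nice table. Or save it as a txt file, have SAS read it in,
and print it out from there.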

f***a
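Posts: 329

From thread: Statistics board - [Come in and discuss] for loop in R

As usual, let me ramble a bit first, :-)
In R, if you can avoid a for loop you should; try to handle everything in a vectorized way.
For operations on the rows or cols of a matrix/data.frame, use apply (btw, same for array);
for operations on the elements of a list, data.frame (essentially it is a list), or vector, use
lapply, sapply;
for operations grouped by id, use tapply.
Here are my questions.
1)
# Way I:
for(i in 1:n){
  res[i] <- myfunction(a[i], b[i], c[i])
}
# Way II:
res <- apply(cbind(a,b,c), 1, function(t)
  myfunction(t[1], t[2], t[3])
)
Are these two approaches equivalent, or is Way II better?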
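
A third option for question 1 is mapply, which walks several vectors in parallel without first binding them into a matrix. The sketch below uses a made-up myfunction (the real one is not shown in the post) and checks that all three approaches agree.

```r
# Hypothetical stand-in for the poster's myfunction
myfunction <- function(a, b, c) a + b * c

a <- 1:5
b <- 6:10
c <- 11:15
n <- length(a)

# Way I: explicit loop over a preallocated result
res_loop <- numeric(n)
for (i in 1:n) {
  res_loop[i] <- myfunction(a[i], b[i], c[i])
}

# Way II: apply over the rows of cbind(a, b, c)
res_apply <- apply(cbind(a, b, c), 1, function(t) myfunction(t[1], t[2], t[3]))

# Way III: mapply passes the i-th elements of a, b, c together
res_mapply <- mapply(myfunction, a, b, c)

as.numeric(res_mapply)
```

When myfunction is itself vectorized, as this stand-in is, calling myfunction(a, b, c) directly is simpler still.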
2)
# Way I:
for(i in 1:n){
input <- i
...... # some heavy calculation
res[i] <- output
}
...

d*********k
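Posts: 1239

From thread: Statistics board - What to do when vectors have different lengths in rbind or cbind? Help needed

Thanks a lot~
But what if I have many vectors? Counting the elements of each one by hand would be quite a hassle~
Is there a
more general method?
Thanks~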

d******e
Posts: 7844

From thread: Statistics board - What to do when vectors have different lengths in rbind or cbind? Help needed

Use list.


d*********k
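Posts: 1239

From thread: Statistics board - What to do when vectors have different lengths in rbind or cbind? Help needed

I'd like to use a list too, but I'm not very familiar with it.
Could you give more detail, for example with this simple case? Thanks!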

b*******g
Posts: 513

From thread: Statistics board - What to do when vectors have different lengths in rbind or cbind? Help needed

Here is an example:
> a<-c(1,2,3)
> b<-c(1,2,3,4)
> c<-list(a,b)
> c
[[1]]
[1] 1 2 3
[[2]]
[1] 1 2 3 4
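
Once the vectors are in a list, they can be turned into an NA-padded matrix without counting elements by hand. This sketch relies on the fact that indexing a vector past its end returns NA; the list names a and b are added here so the columns get labels.

```r
lst <- list(a = c(1, 2, 3), b = c(1, 2, 3, 4))

# Length of the longest vector in the list
n <- max(lengths(lst))

# Index every vector with 1..n; positions past the end come back as NA
m <- sapply(lst, `[`, seq_len(n))
m
```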

c**i
Posts: 234

From thread: Statistics board - How to give a name to a matrix's rownames? R question, please help!

create a vector, cbind(vector, matrix)?

n*********e
Posts: 318

From thread: Statistics board - How can I do this in R?

Thank you both for replying!
That was a great help to me.
-------------------------------------
I find that I can also do -
----------------------------------------------
> tp<-tapply(t$product_id, t$customer_id, function(x) length(unique(x)))
> data.frame(cbind(names(tp),tp))
         V1 tp
11111 11111  3
22222 22222  1
33333 33333  1
44444 44444  1
--------------------------------------------------
Summary:
"sapply" and "tapply" can both return either a vector or a list, depending
upon...
