t*****w Posts: 254 | 1 When I interviewed for jobs, they always tested my SAS skills. However, I use R
all the time. To help you prepare, read through my R code and see how much of it
you can understand.
%in%
?keyword
a <- matrix(0, nrow=3, ncol=3, byrow=TRUE)
a1 <- a1/(t(a1)%*%spooled%*%a1)^.5 # standardization in discriminant analysis
a1<- a>=2; a[a1]
abline(h = -1:5, v = -2:3, col = "lightgray", lty=3)
abline(h=0, v=0, col = "gray60")
abs(r2[i])>r0
aggregate(iris[,1:4], list(iris$Species), mean)
AND: &; OR: |; NOT: !
anova(lm(data1[,3]~data1[,1... Read full post |
|
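A few of the idioms in the list above can be tried on built-in data. This is only a sketch: the vectors `x`, `hit`, `big` and `means` are names made up for illustration and are not from the original post.

```r
# %in% tests membership element by element
x <- c("a", "b", "c")
hit <- x %in% c("b", "c", "d")

# Logical indexing, as in `a1 <- a >= 2; a[a1]`
# (a matrix is indexed in column-major order)
a <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
a1 <- a >= 2
big <- a[a1]

# Per-group means of the four numeric iris columns
means <- aggregate(iris[, 1:4], list(Species = iris$Species), mean)
print(means)
```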
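q**j Posts: 10612 | 2 I read your article. A one-liner like
r('r_mean <- colMeans(r_data)')
is fine. But if my R code has many lines, it will look very awkward. And what if
colMeans were an R function I wrote myself?
I used to have a way of interfacing Matlab with R as well, but I eventually gave it
up. Instead I took this route: output a .csv, read it into R, then write another
.csv as the result. I don't know whether this will hurt speed. If the data has
already been mostly processed in Python, it probably isn't much of a problem. |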
|
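The .csv round trip described in post 2 might look like this on the R side. It is a minimal sketch under stated assumptions: the file names, the temporary directory, and `my_col_means` (standing in for "an R function I wrote myself") are all hypothetical, and the input file that Python would normally produce is faked here so the script runs on its own.

```r
# Hypothetical user-defined function standing in for colMeans
my_col_means <- function(df) {
  vapply(df, mean, numeric(1), na.rm = TRUE)
}

# Hypothetical paths; in practice Python would write in_path
in_path  <- file.path(tempdir(), "r_data.csv")
out_path <- file.path(tempdir(), "r_mean.csv")

# Stand-in for the file produced by the Python step
write.csv(data.frame(x = c(1, 2, 3), y = c(10, 20, 30)),
          in_path, row.names = FALSE)

# The R step: read, compute, write the result back out
r_data <- read.csv(in_path)
r_mean <- my_col_means(r_data)
write.csv(data.frame(column = names(r_mean), mean = unname(r_mean)),
          out_path, row.names = FALSE)
```

Kept in its own file, a script like this can be run with Rscript or source(), so a multi-line R program does not have to be embedded line by line in the calling code.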
c*2 Posts: 24 | 3 Even simpler: colMeans(A) |
|
f***a Posts: 329 | 4 It seems the actual looping in "lapply" is done internally in C code, while
"apply" isn't really faster than writing a loop. Is the main advantage of "apply"
that it simplifies the code?
colMeans()/rowSums() and vectorized functions are faster than a loop, though.
Anyway, I think that for algorithms involving heavy computation, C/C++ should
handle the computing part. I strongly recommend {Rcpp}, which provides a much
better API than the original one in R.
(My previous questions... Read full post |
|
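The three approaches compared in post 4 can be checked against each other on a small matrix. This sketch only shows that they agree; the matrix `m` and the variable names are made up for illustration, and no timings are claimed.

```r
set.seed(1)
m <- matrix(rnorm(20), nrow = 5, ncol = 4)

# 1. Explicit loop over columns
loop_means <- numeric(ncol(m))
for (j in seq_len(ncol(m))) {
  loop_means[j] <- mean(m[, j])
}

# 2. apply() over the column margin
apply_means <- apply(m, 2, mean)

# 3. Dedicated vectorized function
fast_means <- colMeans(m)

# All three should agree up to floating-point tolerance
print(all.equal(loop_means, apply_means))
print(all.equal(apply_means, fast_means))
```

To compare speed on your own data, each variant can be wrapped in system.time().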
q**j Posts: 10612 | 5 Thanks a lot for the help! Is there any way to use the colMeans
function to do the same? |
|
k*******a Posts: 772 | 6 c <- by(a,b,colMeans)
c <- do.call("rbind", c) |
|
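The two lines in post 6 can be tried on the built-in iris data. This is a sketch assuming `a` is a data frame of numeric columns and `b` is a grouping factor, which the post does not state; the result is named `res` here rather than `c` to avoid shadowing the base function c().

```r
a <- iris[, 1:4]     # numeric columns (assumed shape of `a`)
b <- iris$Species    # grouping factor (assumed shape of `b`)

# by() applies colMeans to each group's rows, giving one vector per group
res <- by(a, b, colMeans)

# rbind the per-group vectors into one matrix: one row per group
res <- do.call("rbind", res)
print(res)
```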
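r****5 Posts: 618 | 7 Thank you very much. I used
Mean<-round(colMeans(subhw),2)
Median<-round(apply(subhw,2,median),2)
...
and then data.frame(Mean, Median, ...), which looks very tedious; yours is much simpler.
What if I want to add an ID and a name, especially the ID, taken from the original
column number? For example 1, 2, 3 (where 1, 2, 3 are the column positions): how can
I do that? And if I want to add one more column that checks for missing data, how do
I add it to yours?
Here you list min and max in the order that summary gives them. If I want to insert a
column such as sdev in the middle, as below, it seems I can no longer use t(apply...
Too many questions; I just want to understand this properly.
ID name Mean sdev Min. Max. missing
:1 mpg 20.0900 0.4 10.400 33.900 1
:2 cyl 6.1880 0.2 4.000 8.000 2
:3 di... Read full post |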
|
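One way to get a table with the layout asked for in post 7 is to build each column separately, so a column such as sdev can go anywhere. This is a sketch, not the method from the reply being quoted (which is not shown here). It assumes the data resemble the built-in mtcars, whose mpg and cyl values match the Mean, Min. and Max. figures in the post; `dat` and `summ` are made-up names. mtcars has no missing values, so the missing column is all zeros in this example.

```r
dat <- mtcars[, c("mpg", "cyl", "disp")]

summ <- data.frame(
  ID      = seq_along(dat),                        # original column position
  name    = names(dat),                            # original column name
  Mean    = round(colMeans(dat, na.rm = TRUE), 2),
  sdev    = round(sapply(dat, sd, na.rm = TRUE), 2),
  Min.    = sapply(dat, min, na.rm = TRUE),
  Max.    = sapply(dat, max, na.rm = TRUE),
  missing = colSums(is.na(dat)),                   # count of NA per column
  row.names = NULL
)
print(summ)
```

Because every statistic is its own argument to data.frame(), reordering or inserting a column is a matter of moving or adding one line.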