See all walks of life through MITBBS (买买提)

topics

All topics - Topic: unlisted
s*****n
Posts: 2174
1
From thread: Statistics board - How to read a text file in R
data <- read.csv("yourfile.txt", header = FALSE)
unlist(strsplit(as.character(data$V1), split=""))
q**j
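Posts: 10612
2
I tried again today and got a little closer.
y = unlist(lapply(Data$Date, seq,length=12))
z = matrix(y,nrow=8640,ncol=12,byrow=T)
This generates a matrix like that.
But there are a few new problems.
1. lapply won't let me use an argument like by="1 month", so I get 12 consecutive days instead of months.
2. The resulting z is numeric rather than dates. I checked, and the numbers are correctly the number of days since 1970-01-01. How do I convert a numeric matrix like this into a date matrix?
One last small question: what is the difference between as.date and as.Date? I read through the manual and still have no idea.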
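A minimal sketch of one way around both problems, assuming `dates` stands in for `Data$Date` (a hypothetical two-element vector here). Extra arguments such as `by = "1 month"` can be handed to `seq()` inside an anonymous function, and because a matrix cannot carry the Date class, the dates are formatted as strings before being arranged into a matrix. On the last question: `as.Date()` is the base R function, while `as.date()` is not in base R and, as far as I know, comes from the add-on `date` package.

```r
# Hypothetical stand-in for Data$Date
dates <- as.Date(c("2010-01-15", "2011-03-01"))

# For each start date, generate 12 monthly dates and format them as strings.
# sapply() returns a 12 x n character matrix, so transpose to get n x 12.
month_matrix <- t(sapply(seq_along(dates), function(i) {
  format(seq(dates[i], by = "1 month", length.out = 12))
}))

# Individual cells convert back to Date when needed
as.Date(month_matrix[1, 2])
```

Storing strings keeps the matrix readable; if date arithmetic is needed later, a list of Date vectors may be a better fit than a matrix.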
y****2
Posts: 34
3
From thread: Statistics board - R question
I am sorry, in the "select" term, better to use: -unlist(list2). Good luck!
s*****n
Posts: 2174
4
Of course tapply() is not the only way to do it. Read the help file of tapply(). If it is still unclear to you, you should probably write a loop to achieve your goal.
I can give you code, but I do not suggest you use it if you do not understand it:
## Assume your data frame is called data
temp <- tapply(
data$Revenue,
paste(data$Sales_person,
as.Date(data$DATE_time),
sep = " "),
sum)
result <- data.frame(
matrix(unlist(strsplit(names(temp), split = " ")),
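The snippet above is cut off, so here is a separate minimal sketch of the same goal (total revenue per salesperson per day) using `aggregate()` instead of `tapply()`. The column names come from the post; the data values are made up for illustration.

```r
# Hypothetical data using the column names from the post
data <- data.frame(
  Sales_person = c("Ann", "Ann", "Bob", "Ann"),
  DATE_time    = c("2012-01-01 09:00:00", "2012-01-01 15:30:00",
                   "2012-01-01 10:00:00", "2012-01-02 11:00:00"),
  Revenue      = c(100, 50, 70, 30),
  stringsAsFactors = FALSE
)

# Drop the time-of-day part, then sum revenue per person per day
data$Day <- as.Date(substr(data$DATE_time, 1, 10))
result <- aggregate(Revenue ~ Sales_person + Day, data = data, FUN = sum)
result
```

`aggregate()` returns a data frame directly, which avoids splitting the pasted group names back apart.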
q**j
Posts: 10612
5
From thread: Statistics board - R question: how to turn a data frame into a single column of numbers
Good. Thanks a lot. I used array(unlist()), which seems rather clumsy.
b*****n
Posts: 685
6
From thread: Statistics board - R question: how to turn a data frame into a single column of numbers
unlist should be enough?
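A minimal sketch of what that looks like, assuming a small all-numeric data frame (`df` is made up here). `unlist()` concatenates the columns one after another.

```r
# Hypothetical data frame with two integer columns
df <- data.frame(a = 1:2, b = 3:4)

# Columns are concatenated in order; drop the generated names
v <- unlist(df, use.names = FALSE)
v
```

If the columns have mixed types, `unlist()` coerces everything to a common type, so this is best suited to all-numeric data.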
g********r
Posts: 8017
7
From thread: Statistics board - How to subset a character string in R
unlist()[2]
g********r
Posts: 8017
8
From thread: Statistics board - How to subset a character string in R
matrix(unlist(),nrow=4)[2,]

4X100
h******e
Posts: 6
9
From thread: Statistics board - How to subset a character string in R
unlist(lapply(split, function(x) {x[2]}))
d*******1
Posts: 854
10
From thread: Statistics board - How to subset a character string in R
You need to transpose for it to work:
split<- strsplit(test$Experiment, '_')
msplit<- t(matrix(unlist(split),nrow=4))
test$time<- msplit[,3]
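An alternative sketch that skips the matrix and transpose step by pulling the third field straight out of each split result. The `test` data frame and its four-field `Experiment` strings are hypothetical.

```r
# Hypothetical data: four "_"-separated fields per string
test <- data.frame(
  Experiment = c("ctrl_rep1_0h_A", "drug_rep2_24h_B"),
  stringsAsFactors = FALSE
)

# Split each string, then take the third piece of every element
parts <- strsplit(test$Experiment, "_", fixed = TRUE)
test$time <- vapply(parts, function(p) p[3], character(1))
test
```

This also tolerates strings with differing numbers of fields, where the `matrix()` approach would misalign.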
s*****n
Posts: 2174
11
These apply functions have fairly high overhead, so for a loop with a simple body they are not necessarily faster than a for loop, and are often slower. For example:
> system.time(for (i in 1:100000) {1+1})
[1] 0.11 0.00 0.11 NA NA
> system.time(lapply(1:100000, function(i) {1+1}))
[1] 0.18 0.00 0.19 NA NA
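If you use apply merely as a replacement for a loop, there may be little benefit. Most apply calls are used for some direct computation, where they are very convenient.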
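As a small illustration of that kind of direct computation, here is a sketch that computes per-group means two ways, with `split()` plus `lapply()` and with `tapply()`. The data is made up and much smaller than anything worth timing.

```r
set.seed(1)
# Hypothetical data: 100 groups of 10 values each
data <- data.frame(
  id    = rep(1:100, each = 10),
  value = rnorm(100 * 10)
)

# Group means via split() + lapply()
means_lapply <- unlist(lapply(split(data$value, data$id), mean))

# The same computation via tapply()
means_tapply <- tapply(data$value, data$id, mean)

all.equal(as.numeric(means_lapply), as.numeric(means_tapply))
```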
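Among the apply functions, lapply is the most basic; sapply, tapply and apply are essentially wrappers around lapply. Most of the time lapply is slightly faster, but the others often look more concise. For example: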
> data <- data.frame(
+ id = rep(1:1000, each = 1000),
+ value = rnorm(1000 * 1000)
+ )
>
> system.time(unlist(lapply(spl
f***a
Posts: 329
12
From thread: Statistics board - R help request
a <- rep(1,15)
b <- rep(1:3,each=5)
tt <- c( "1,1,1,1,1,1,1,1,1,1,1,1,1,1,1"
,"1,1,1,1,1,2,2,2,2,2,3,3,3,3,3")
ind.a <- which(apply(data.frame(tt),1,function(t)
sum(as.numeric(unlist(strsplit(t,",",fixed=T)))-a)
)==0)
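tt is the vector you want to check, and ind.a is the result (index of elements which can be replaced by a). Likewise, to check b, replace a with b in the checking function.
I hope I understood what you meant, haha~
This assumes every element has length 15; if not, adding a length check would be more efficient.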
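One caveat with the sum-of-differences test is that differences can cancel out. A stricter sketch, under the same setup, compares element by element and includes the length check mentioned above. The helper name `matches` and the third test string are my additions.

```r
a  <- rep(1, 15)
tt <- c("1,1,1,1,1,1,1,1,1,1,1,1,1,1,1",
        "1,1,1,1,1,2,2,2,2,2,3,3,3,3,3",
        # 15 values whose differences from a cancel out (0 and 2, then ones)
        paste(c(0, 2, rep(1, 13)), collapse = ","))

# TRUE only if the parsed string has the same length and values as target
matches <- function(s, target) {
  x <- as.numeric(strsplit(s, ",", fixed = TRUE)[[1]])
  length(x) == length(target) && all(x == target)
}

ind.a <- which(vapply(tt, matches, logical(1), target = a, USE.NAMES = FALSE))
ind.a
```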
f***a
Posts: 329
13
From thread: Statistics board - A simple R question
xx <- as.numeric(unlist(strsplit(x,"_")))
xx[!is.na(xx)]
f***a
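Posts: 329
14
From thread: Statistics board - An interview question
I'm back, I'm back.
Thinking it over again, this is really just adding a constraint to a set of iid variables. Sampling from it doesn't seem hard.
Take the simplest case, n=2, m=1, as an example: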
Without constraint, outputs space is {(0,0),(0,1),(1,0),(1,1)}.
The corresponding probability space is {(1-p1)*(1-p2), ..., p1*p2}.
With constraint, outputs space is O={(0,1),(1,0)}.
The corresponding probability space is P={(1-p1)*(p2), p1*(1-p2)}.
Under the constraint, standardize the probability space into
P.std={P1/(P1+P2),P2/(P1+P2)}.
Then under the constraint, output (0,1) has the probability P1/(P1+P2) to be
sam...
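The post is cut off here. The following is only a sketch of my reading of the idea for small n: enumerate every 0/1 outcome, keep those whose sum equals m, renormalise their probabilities, and sample from what remains. The function name and the values of p are hypothetical, and the enumeration grows as 2^n, so this is not meant for large n.

```r
# Enumerate all 0/1 outcomes of n independent Bernoulli(p[i]) draws,
# keep those whose sum equals m, and renormalise their probabilities.
constrained_outcomes <- function(p, m) {
  n <- length(p)
  grid <- as.matrix(expand.grid(rep(list(0:1), n)))
  keep <- grid[rowSums(grid) == m, , drop = FALSE]
  # Probability of each kept outcome before applying the constraint
  prob <- apply(keep, 1, function(x) prod(ifelse(x == 1, p, 1 - p)))
  list(outcomes = unname(keep), prob = prob / sum(prob))
}

p <- c(0.3, 0.6)   # hypothetical p1, p2
res <- constrained_outcomes(p, m = 1)

# Draw one outcome under the constraint
set.seed(1)
res$outcomes[sample(nrow(res$outcomes), 1, prob = res$prob), ]
```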
s*****n
Posts: 2174
16
From thread: Statistics board - R data.frame
I'll just give you a hint; of course you need to modify it to fit what you need. For example:
> data
V1
1 ABCDE
2 ABCDE
3 ABCDE
> t(sapply(1:dim(data)[1], function(i) unlist(strsplit(data$V1[i], split = ""))))
[,1] [,2] [,3] [,4] [,5]
[1,] "A" "B" "C" "D" "E"
[2,] "A" "B" "C" "D" "E"
[3,] "A" "B" "C" "D" "E"
s*****n
Posts: 2174
17
Just use seq.Date; why wouldn't that work?
At most you need a simple wrapper.
date.increment <- function(date.list, by){
  num.days <- unlist(lapply(seq_along(date.list),
    function(i) seq.Date(from = as.Date(date.list[i]), by = by, length.out = 2)[2]))
  return(as.Date(num.days, origin = "1970-01-01"))
}
date.increment(c("2010-10-01", "2010-10-10"), by = "1 month")
a********s
Posts: 188
18
Use "unlist" and then "matrix".
l*********s
Posts: 5409
19
From thread: Statistics board - A question about computing age in R
Say d <- "12/10/2001",
datastruct <- as.numeric( unlist( strsplit(d, "/")) )
datastruct is a tuple of (month, day, year). You should be able to figure out the rest on your own now.
baozi plz.
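For anyone who wants the rest spelled out, here is one possible sketch. It parses the string with `as.Date()` rather than splitting it by hand, then counts whole years up to a reference date. The function name `age_in_years` and the reference date are my own choices.

```r
d <- "12/10/2001"                         # month/day/year, as in the post
birth <- as.Date(d, format = "%m/%d/%Y")

# Whole years between birth and a reference date
age_in_years <- function(birth, on = Sys.Date()) {
  b <- as.POSIXlt(birth)
  o <- as.POSIXlt(on)
  age <- o$year - b$year
  # Subtract one year if the birthday has not yet occurred in the reference year
  had_birthday <- (o$mon > b$mon) | (o$mon == b$mon & o$mday >= b$mday)
  age - ifelse(had_birthday, 0, 1)
}

age_in_years(birth, on = as.Date("2012-12-09"))
```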
n*********e
Posts: 318
20
来自主题: Statistics版 - How can I do this in R?
I am trying to achieve this:
- for each customer, how many unique products that customer has ordered?
Here is data -
#----------------------
customer_id, product_id, date
11111,634578,11/12/2011
11111,987654,11/12/2011
11111,678978,11/12/2011
11111,678978,12/22/2011
22222,456789,12/24/2011
33333,678978,01/10/2012
33333,678978,01/15/2012
44444,987365,03/30/2012
Here is my R code -
#-------------------------------------------------------------------
t<-read.table('C:\user_item_dt.txt',sep=',',head...
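The poster's code is truncated, so this is an independent sketch of one way to get the count, using the sample rows from the post read from an inline string instead of the file. The variable names `orders` and `n_unique` are hypothetical.

```r
# Sample rows from the post, read from text rather than from the file
txt <- "customer_id,product_id,date
11111,634578,11/12/2011
11111,987654,11/12/2011
11111,678978,11/12/2011
11111,678978,12/22/2011
22222,456789,12/24/2011
33333,678978,01/10/2012
33333,678978,01/15/2012
44444,987365,03/30/2012"

orders <- read.csv(text = txt, strip.white = TRUE)

# Number of distinct products per customer
n_unique <- tapply(orders$product_id, orders$customer_id,
                   function(x) length(unique(x)))
n_unique
```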
c***z
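Posts: 6348
21
From thread: Statistics board - A question about R
Could you explain in more detail?
I did find a way, though:
first list the files and subdirectories in the directory; download the files directly, and call this same function on each subdirectory (recursion).
But the downloaded files have the wrong size. Could someone take a look?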
library("RCurl")
# ==============================================================================
# Function that downloads files from URL
# ==============================================================================
fdownload <- function(sourcelink) {
# sourcelink <- ftp.root # test, root level
# sourcelink <- dirs[1] # test, second level
targetlink <- paste(dropbox.root, substr(sourcelink, nchar(ftp.root)+...
c***z
Posts: 6348
22
From thread: Statistics board - A question about R
I got a working version now:
#=====================================================================
# Function that downloads files from URL
#=====================================================================
fdownload <- function(sourcelink) {
# sourcelink <- ftp.root # test, root level
# sourcelink <- dirs[1] # test, second level
targetlink <- paste(dropbox.root, substr(sourcelink, nchar(ftp.root)+1,
nchar(sourcelink)), sep = '')

# list of contents
filenames <- getURL(sourceli...
c*********t
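Posts: 340
23
From thread: Statistics board - Another R question
I can't think of a better way; I'm not very familiar with RCurl.
But here is a clumsy approach for the OP's reference.
Since the lines are fixed length, just find the position of the column you want :)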
> grep("M",unlist(strsplit(files[1],"")))
47
> substr(files,47,47+11)
[1] "Mar 26 16:16" "Mar 26 17:02" "Mar 28 10:05" "Mar 28 10:05" "Mar 28 10:05" "Mar 28 10:05" "Mar 28 10:05" "Mar 28 10:05" "Mar 28 10:05"
[10] "Mar 28 10:05"
c*****m
Posts: 4817
24
From thread: Statistics board - A simple R question
What do you want to get? A vector? With your example, the result of ranking y within each x value is just y, right?
unlist(tapply(y, x, rank))
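One thing to watch with `unlist(tapply(...))` is that the result comes back ordered by group rather than in the original row order. If the ranks need to line up with the original rows, `ave()` is one option. A small sketch with made-up `x` and `y`:

```r
# Hypothetical data: group labels and values
x <- c("b", "a", "b", "a")
y <- c(10, 5, 3, 7)

# Rank y within each level of x, keeping the original row order
r <- ave(y, x, FUN = rank)
r
```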
s*********e
Posts: 1051
25
From thread: Statistics board - A question about R's apply function
matrix(unlist(your_result), ncol = 3, dimnames = list(1:5, c('mean', 'sd', 'max')))
w*******9
Posts: 1433
26
From thread: Statistics board - A question about reading data in R
unlist() will do the job
p*****n
Posts: 265
27
From thread: Statistics board - A question about reading data in R
Thanks, the output is
[1] "data.frame"
Is that the problem? Anyway, it worked once I used unlist(), haha.
t*****w
Posts: 254
28
From thread: Statistics board - How should I prepare for an R interview?
When I had my job interviews, they always tested my SAS skills. However, I use R all the time. To help your preparation, read my R code to see how much of it you can understand.
%in%
?keyword
a<-matrix(0,nrow=3,ncol=3,byrow=T)
a1 <- a1/(t(a1)%*%spooled%*%a1)^.5 #standardization in discrim
a1<- a>=2; a[a1]
abline(h = -1:5, v = -2:3, col = "lightgray", lty=3)
abline(h=0, v=0, col = "gray60")
abs(r2[i])>r0
aggregate(iris[,1:4], list(iris$Species), mean)
AND: &; OR: |; NOT: !
anova(lm(data1[,3]~data1[,1...
d*******7
Posts: 118
29
s<-"123ABC45"
m<-unlist(strsplit(s,""))
paste(m[grep("[A-Z]",m)[1]:nchar(s)],collapse="")

123ABC45
f***8
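Posts: 571
30
From thread: DataSciences board - A question about R
"Pick any data point and find the 10 data points most 'similar' to it": this part is easy, apply will do it. For example: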
sort(apply(mtcars[-1, ], 1, function(x) cor(x, unlist(mtcars[1, ]))),
decreasing=TRUE)[1:10]
As for how to handle categorical data, you need to define your own cor function.
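What such a function looks like depends on the data. As one hedged possibility, here is a sketch using a simple matching score (the fraction of columns on which two rows agree) in place of `cor`. The data frame and the name `simple_match` are made up for illustration.

```r
# Hypothetical categorical data
df <- data.frame(
  colour = c("red", "red", "blue", "red"),
  size   = c("S", "M", "S", "S"),
  shape  = c("round", "round", "square", "square"),
  stringsAsFactors = FALSE
)

# Fraction of positions where two rows have the same category
simple_match <- function(x, y) mean(x == y)

# Compare every other row against row 1, most similar first
target <- unlist(df[1, ])
sims <- apply(df[-1, ], 1, function(x) simple_match(x, target))
sort(sims, decreasing = TRUE)
```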
p****r
Posts: 46
31
# create matrix from applist, then transpose it
# so the matrix is N rows * 10 columns
app <- t(data.frame(applist))
# Same for scorelist
score<- t(data.frame(scorelist))
# generate column sequence (1,11,2,12...10,20) so as to reorder them after cbind
cols <- rep(1:10,each=2)+rep(c(0,10),10)
# or you can do cols <- unlist(sapply(1:10,function(x) list(x,x+10)))
data <- cbind(app,score)
# reorder columns
data <- data[,cols]
# generate col_names: "applist1", "scorelist1", "applist2","scorelist2"......
Y****a
Posts: 243
32
my guess is:
tmpTestIn@data@data@Dim[2] = 50
topn = 1000
Q1: length(v2) = 50, length(v2[[1]]) = 1000
v2 <- unlist(v2)
Q2: assume now your create df returns a data.frame of dim 50,000 x 7
dfall <- do.call('rbind',lapply(range of i, function(i) create df))
Q3:
dfall$v8 <- F(v4, v5, v6)
dfall$v9 <- 0
for (u in userIDs) {
for (i in itemIDs) {
idx <- which((dfall$userID == u) & (dfall$itemID == i))
dfall$v9[idx] <- order(dfall$v8[idx])
}
}
l****r
Posts: 21884
33
From thread: _ChenChuSheng board - Watching Under the Hawthorn Tree (山楂树之恋) on YouTube
2:
http://www.youtube.com/watch?v=pGnkMMJ73gI
3.
http://www.youtube.com/watch?v=hso-2eHWFEg
4.
http://www.youtube.com/watch?v=2TN0R4iNilE
11.
http://www.youtube.com/watch?v=WGFKdMECGwo
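The rest have not been posted on Huaren either. As far as I can see these videos are all unlisted, meaning they cannot be found by search;
someone has to give you the link. Hopefully the original poster on Huaren will keep posting links.
Original Huaren post: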
http://www.huaren.us/dispbbs.asp?boardid=358&Id=811051
w*******y
Posts: 60932
34
Ebay is running a triple points offer for today 12/13 only. Additionally
there is a double points for gift cards through gift mall. For some reason
they are doing an unlisted combo on the gift cards giving you 4X Ebay points
(though it's not 6X points which would be 12%). It turns out to be 8% in
Ebay bucks.
Link:
http://stores.ebay.com/giftcardmall
I received $25 gift cards to Rock bottom with $2 Ebay Bucks for each one.
Combine with Discover Card 2% cashback for internet purchases through the
e...
D**p
Posts: 293
35
From thread: _Stockcafeteria board - US economy review and forecast: 18-point summary
nothing wrong being a frog.
As you correctly pointed out, it should be 19 points, rather than 18. There
are also some other unlisted points. But I believe those 18 points are the
most important ones. Without this 3.5% thing, there still might be a second
recession, but the chance is smaller. It might be carefully managed into an
L-shaped recovery. However, I do not see a real significant difference
between an L-shaped recovery and a small double-dip. In some people's mind,
an L-shaped recovery
l*********y
Posts: 3447
36
From thread: _Xiyu board - I once wanted to make a promo video for Xiao Yang
sorry, change to unlisted