s*****n 发帖数: 2174 | 1 data <- read.csv("yourfile.txt", header = F)
unlist(strsplit(as.character(data$V1), split="")) |
|
q**j 发帖数: 10612 | 2 I tried again today and got a little closer to success.
y = unlist(lapply(Data$Date, seq,length=12))
z = matrix(y,nrow=8640,ncol=12,byrow=T)
This generates the kind of matrix I want.
But there are a few new problems.
1. lapply will not let me pass an argument like by="1 month", so I end up with 12 consecutive days rather than months.
2. The resulting z is numeric rather than dates. I checked: the numbers are the correct day counts starting from 1970-01-01. How do I convert such a numeric matrix into a matrix of dates?
One last small question: what is the difference between as.date and as.Date? I read through the manual and still have no idea. |
|
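For the two questions above, a minimal sketch, with two made-up start dates standing in for Data$Date and its 8640 rows: seq.Date does accept by = "1 month" once the inputs are Date objects, and a numeric matrix of day counts can be turned back into dates with as.Date(..., origin = "1970-01-01"). On the last question, as.Date (capital D) is the base R function; as.date comes from the add-on "date" package and is not the same function.
dates <- as.Date(c("2010-01-15", "2010-02-15"))   # stand-ins for Data$Date
# seq() dispatches to seq.Date here, so by = "1 month" works
y <- unlist(lapply(dates, function(d) seq(d, by = "1 month", length.out = 12)))
z <- matrix(y, nrow = length(dates), ncol = 12, byrow = TRUE)
z[1, 1:3]                                # numeric: days since 1970-01-01
as.Date(z[1, ], origin = "1970-01-01")   # convert one row back into Date values
matrix() strips the Date class, so a common pattern is to keep the numeric matrix and convert rows or columns back on use.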
y****2 发帖数: 34 | 3 I am sorry; in the "select" term it is better to use -unlist(list2). Good luck! |
|
s*****n 发帖数: 2174 | 4 Of course tapply() is not the only way to do it. Read the help file of
tapply(). If it is still unclear to you, you probably should write a loop to achieve your goal.
I can give you code, but I do not suggest you use it if you do not understand it:
## Assume your data frame is called data
temp <- tapply(
data$Revenue,
paste(data$Sales_person,
as.Date(data$DATE_time),
sep = " "),
sum)
result <- data.frame(
matrix(unlist(strsplit(names(temp), split = " ")),
|
|
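The code above is cut off by the forum. A hedged completion of the same tapply() idea, assuming the truncated call was rebuilding a two-column person/date layout from the names of temp; the column names Revenue, Sales_person and DATE_time come from the post, everything after matrix() is guessed:
## Assume your data frame is called data
temp <- tapply(data$Revenue,
               paste(data$Sales_person, as.Date(data$DATE_time), sep = " "),
               sum)
## Split the "person date" names back apart and attach the summed revenue
result <- data.frame(matrix(unlist(strsplit(names(temp), split = " ")),
                            ncol = 2, byrow = TRUE),
                     revenue = as.vector(temp))
names(result)[1:2] <- c("sales_person", "date")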
q**j 发帖数: 10612 | 5 Good. Thanks. I had used array(unlist()), which feels rather clumsy. |
|
b*****n 发帖数: 685 | 6 unlist should be enough? |
|
|
g********r 发帖数: 8017 | 8 matrix(unlist(),nrow=4)[2,]
4X100 |
|
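A tiny illustration of the one-liner above, with a made-up list in place of the 4 x 100 data mentioned in the thread:
x <- list(c(1, 2, 3, 4), c(5, 6, 7, 8), c(9, 10, 11, 12))  # stand-in data
matrix(unlist(x), nrow = 4)[2, ]   # second element of every list entry: 2 6 10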
h******e 发帖数: 6 | 9 unlist(lapply(split, function(x) {x[2]})) |
|
d*******1 发帖数: 854 | 10 You need a transpose for that to work:
split<- strsplit(test$Experiment, '_')
msplit<- t(matrix(unlist(split),nrow=4))
test$time<- msplit[,3] |
|
s*****n 发帖数: 2174 | 11 The overhead of these apply functions is fairly high, so for loops with a simple body they are not necessarily faster than a for loop, and are often slower. For example:
> system.time(for (i in 1:100000) {1+1})
[1] 0.11 0.00 0.11 NA NA
> system.time(lapply(1:100000, function(i) {1+1}))
[1] 0.18 0.00 0.19 NA NA
If you only use apply to replace a loop, there may not be much point. Most of the time apply is used for some kind of direct computation, where it is very convenient.
Among the apply functions, lapply is the most basic one; sapply, tapply and apply are essentially wrappers around lapply. lapply is usually slightly faster, but the others often look cleaner. For example:
> data <- data.frame(
+ id = rep(1:1000, each = 1000),
+ value = rnorm(1000 * 1000)
+ )
>
> system.time(unlist(lapply(spl |
|
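The comparison above is cut off. A sketch of what the two timed calls presumably were, computing group means per id with split()/lapply() versus the tapply() wrapper:
data <- data.frame(id = rep(1:1000, each = 1000),
                   value = rnorm(1000 * 1000))
system.time(unlist(lapply(split(data$value, data$id), mean)))  # lapply on a pre-split list
system.time(tapply(data$value, data$id, mean))                 # same result, usually more readable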
f***a 发帖数: 329 | 12 a <- rep(1,15)
b <- rep(1:3,each=5)
tt <- c( "1,1,1,1,1,1,1,1,1,1,1,1,1,1,1"
,"1,1,1,1,1,2,2,2,2,2,3,3,3,3,3")
ind.a <- which(apply(data.frame(tt),1,function(t)
sum(abs(as.numeric(unlist(strsplit(t,",",fixed=T)))-a))  # abs() so positive and negative differences cannot cancel out
)==0)
tt is the vector you want to check and ind.a is the result (the indices of the elements that can be replaced by a). Likewise, to check against b, just substitute b for a in the checking function.
Hope I understood what you meant, hehe~
This assumes every element has length 15; if not, adding a length check makes it more robust and efficient, as in the sketch below. |
|
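A hedged variant of the check above that adds the length test mentioned at the end and compares element by element with all() instead of summing differences:
matches.a <- which(sapply(tt, function(t) {
    v <- as.numeric(unlist(strsplit(t, ",", fixed = TRUE)))
    length(v) == length(a) && all(v == a)   # check the length first, then exact equality
}))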
f***a 发帖数: 329 | 13 xx <- as.numeric(unlist(strsplit(x,"_")))
xx[!is.na(xx)] |
|
f***a 发帖数: 329 | 14 I'm back, I'm back.
I thought about it again: this is really just a bunch of iid variables with a constraint added on top. Sampling from it does not look hard.
Take the simplest case n=2, m=1 as an example:
Without the constraint, the outcome space is {(0,0),(0,1),(1,0),(1,1)}.
The corresponding probability space is {(1-p1)*(1-p2), ..., p1*p2}.
With the constraint, the outcome space is O={(0,1),(1,0)}.
The corresponding probability space is P={(1-p1)*(p2), p1*(1-p2)}.
Under the constraint, standardize the probability space into
P.std={P1/(P1+P2),P2/(P1+P2)}.
Then under the constraint, outcome (0,1) has probability P1/(P1+P2) to be
sam... [post truncated] |
|
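A minimal sketch of the n=2, m=1 case described above, with assumed values for the success probabilities p1 and p2; the general n, m case would enumerate the constrained outcome space or sample it some other way:
p1 <- 0.3; p2 <- 0.6                      # assumed Bernoulli success probabilities
P1 <- (1 - p1) * p2                       # unconditional probability of (0,1)
P2 <- p1 * (1 - p2)                       # unconditional probability of (1,0)
P.std <- c(P1, P2) / (P1 + P2)            # standardized probabilities under the constraint
O <- list(c(0, 1), c(1, 0))               # constrained outcome space
draw <- O[[sample(2, 1, prob = P.std)]]   # one draw satisfying sum(draw) == 1
draw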
s*****n 发帖数: 2174 | 16 I just give you a hint; of course you need to modify it to fit what you need. For example
> data
V1
1 ABCDE
2 ABCDE
3 ABCDE
> t(sapply(1:dim(data)[1], function(i) unlist(strsplit(as.character(data$V1[i]), split = ""))))
[,1] [,2] [,3] [,4] [,5]
[1,] "A" "B" "C" "D" "E"
[2,] "A" "B" "C" "D" "E"
[3,] "A" "B" "C" "D" "E" |
|
s*****n 发帖数: 2174 | 17 Just use seq.Date; why would it not work?
At most you need to write a small wrapper around it:
date.increment <- function(date.list, by){
    num.days <- unlist(lapply(1:length(date.list),
        function(i) seq.Date(from = as.Date(date.list[i]), by = by, length = 2)[2]))
    return(as.Date(num.days, origin = "1970-01-01"))
}
date.increment(c("2010-10-01", "2010-10-10"), by = "1 month") |
|
a********s 发帖数: 188 | 18 Use "unlist" and then "matrix". |
|
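A small illustration of that advice, using a made-up list of equal-length vectors since the actual data is not shown in the thread:
lst <- list(c(1, 2, 3), c(4, 5, 6), c(7, 8, 9))        # hypothetical input
matrix(unlist(lst), nrow = length(lst), byrow = TRUE)  # one row per list element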
l*********s 发帖数: 5409 | 19 say, d <- "12/10/2001",
datastruct <- as.numeric( unlist( strsplit(d, "/")) )
datastruct is a tuple of (month, day, year). You should be able to figure out
the rest on your own now.
baozi plz. |
|
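For completeness, a hedged alternative that skips the manual split: base R's as.Date can parse this layout directly with a format string, assuming the string really is month/day/year:
d <- "12/10/2001"
as.Date(d, format = "%m/%d/%Y")   # a proper Date object, printed as "2001-12-10"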
n*********e 发帖数: 318 | 20 I am trying to achieve this:
- for each customer, how many unique products that customer has ordered?
Here is data -
#----------------------
customer_id, product_id, date
11111,634578,11/12/2011
11111,987654,11/12/2011
11111,678978,11/12/2011
11111,678978,12/22/2011
22222,456789,12/24/2011
33333,678978,01/10/2012
33333,678978,01/15/2012
44444,987365,03/30/2012
Here is my R code -
#-------------------------------------------------------------------
t<-read.table('C:\user_item_dt.txt',sep=',',head... [post truncated] |
|
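The question's own code is cut off above. A hedged sketch of one way to get the count of distinct products per customer, assuming the data frame is called t and has the columns shown; the file location is illustrative:
t <- read.csv("user_item_dt.txt")   # assumed path; use forward slashes on Windows
# distinct products per customer, as a named vector
tapply(t$product_id, t$customer_id, function(x) length(unique(x)))
# or as a data frame
aggregate(product_id ~ customer_id, data = t, FUN = function(x) length(unique(x)))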
c***z 发帖数: 6348 | 21 Could you explain that in more detail?
I did find one approach myself:
first list the files and subdirectories in the directory, download the files directly, and call this function recursively on the subdirectories.
But the downloaded files come out with the wrong size. Could someone help take a look?
library("RCurl")
# ==============================================================================
# Function that downloads files from URL
# ==============================================================================
fdownload <- function(sourcelink) {
# sourcelink <- ftp.root # test, root level
# sourcelink <- dirs[1] # test, second level
targetlink <- paste(dropbox.root, substr(sourcelink, nchar(ftp.root)+... [post truncated] |
|
c***z 发帖数: 6348 | 22 I got a working version now:
#=====================================================================
# Function that downloads files from URL
#=====================================================================
fdownload <- function(sourcelink) {
# sourcelink <- ftp.root # test, root level
# sourcelink <- dirs[1] # test, second level
targetlink <- paste(dropbox.root, substr(sourcelink, nchar(ftp.root)+1,
nchar(sourcelink)), sep = '')
# list of contents
filenames <- getURL(sourceli... [post truncated] |
|
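A hedged sketch of the core of such a download helper, assuming the FTP server returns a plain name listing; it uses RCurl's getURL() with dirlistonly = TRUE plus base download.file(), and leaves out the recursion and the Dropbox paths from the posts above. mode = "wb" is worth noting, since text-mode downloads on Windows can alter file sizes, which may relate to the size mismatch described earlier.
library(RCurl)
# Download every file listed directly under an FTP directory (no recursion).
# ftp.dir and target.dir are hypothetical example values.
download_dir <- function(ftp.dir, target.dir) {
    listing <- getURL(ftp.dir, dirlistonly = TRUE)   # file names only, one per line
    files   <- unlist(strsplit(listing, "\r?\n"))
    files   <- files[files != ""]
    for (f in files) {
        download.file(paste0(ftp.dir, f),
                      destfile = file.path(target.dir, f),
                      mode = "wb")                   # binary mode, so sizes are preserved
    }
}
# download_dir("ftp://ftp.example.com/pub/", "~/Downloads/pub")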
c*********t 发帖数: 340 | 23 I cannot think of a better way; I am not very familiar with RCurl.
But here is a crude workaround for reference:
since the listing is fixed length, just locate the positions of the column you want :)
> grep("M",unlist(strsplit(files[1],"")))
47
> substr(files,47,47+11)
[1] "Mar 26 16:16" "Mar 26 17:02" "Mar 28 10:05" "Mar 28 10:05" "Mar 28 10:05" "Mar 28 10:05" "Mar 28 10:05" "Mar 28 10:05" "Mar 28 10:05"
[10] "Mar 28 10:05" |
|
c*****m 发帖数: 4817 | 24 What do you want to get, a vector? With your example, ranking y within each x value just gives you y back:
unlist(tapply(y, x, rank)) |
|
s*********e 发帖数: 1051 | 25 matrix(unlist(your_result), ncol = 3, dimnames = list(1:5, c('mean', 'sd', 'max'))) |
|
w*******9 发帖数: 1433 | 26 unlist() will do the job |
|
p*****n 发帖数: 265 | 27 Thanks. The output is
[1] "data.frame"
Is that the problem? Anyway, it worked once I used unlist(), heh. |
|
t*****w 发帖数: 254 | 28 When I had job interviews, they always tested my SAS skills. However, I use R
all the time. To help with your preparation, read my R code and see how much of it
you can understand.
%in%
?keyword
a<-matrix(0,nrow=3,ncol=3,byrow=T)
a1 <- a1/(t(a1)%*%spooled%*%a1)^.5 #standardization in discrim
a1<- a>=2; a[a1]
abline(h = -1:5, v = -2:3, col = "lightgray", lty=3)
abline(h=0, v=0, col = "gray60")
abs(r2[i])>r0
aggregate(iris[,1:4], list(iris$Species), mean)
AND: &; OR: |; NOT: !
anova(lm(data1[,3]~data1[,1... [post truncated] |
|
d*******7 发帖数: 118 | 29 s<-"123ABC45"
m<-unlist(strsplit(s,""))
paste(m[grep("[A-Z]",m)[1]:nchar(s)],collapse="")
[1] "ABC45" |
|
f***8 发帖数: 571 | 30 "Take any record and find the 10 records most 'similar' to it" is easy; apply will do it, for example:
sort(apply(mtcars[-1, ], 1, function(x) cor(x, unlist(mtcars[1, ]))),
decreasing=TRUE)[1:10]
As for how to handle categorical data, you need to define a cor function of your own. |
|
p****r 发帖数: 46 | 31 # create matrix from applist, then transpose it
# so the matrix is N rows * 10 columns
app <- t(data.frame(applist))
# Same for scorelist
score<- t(data.frame(scorelist))
# generate column sequence (1,11,2,12...10,20) so as to reorder them after cbind
cols <- rep(1:10,each=2)+rep(c(0,10),10)
# or you can do cols <- unlist(sapply(1:10,function(x) list(x,x+10)))
data <- cbind(app,score)
# reorder columns
data <- data[,cols]
# generate col_names: "applist1", "scorelist1", "applist2","scorelist2"... [post truncated] |
|
Y****a 发帖数: 243 | 32 my guess is:
tmpTestIn@data@data@Dim[2] = 50
topn = 1000
Q1: length(v2) = 50, length(v2[[1]]) = 1000
v2 <- unlist(v2)
Q2: assume now your create df returns a data.frame of dim 50,000 x 7
dfall <- do.call('rbind',lapply(range of i, function(i) create df)
Q3:
dfall$v8 <- F(v4, v5, v6)
dfall$v9 <- 0
for (u in userIDs) {
    for (i in itemIDs) {
        idx <- which((dfall$userID == u) & (dfall$itemID == i))
        dfall$v9[idx] <- order(dfall$v8[idx])
    }
} |
|
|
w*******y 发帖数: 60932 | 34 Ebay is running a triple-points offer for today, 12/13, only. Additionally,
there is a double-points offer for gift cards through the gift card mall. For some reason
they are doing an unlisted combo on the gift cards, giving you 4X Ebay points
(though that is not 6X points, which would be 12%). It turns out to be 8% in
Ebay bucks.
Link:
http://stores.ebay.com/giftcardmall
I received $25 gift cards to Rock bottom with $2 Ebay Bucks for each one.
Combine with Discover Card 2% cashback for internet purchases through the
e... [post truncated] |
|
D**p 发帖数: 293 | 35 nothing wrong with being a frog.
As you correctly pointed out, it should be 19 points rather than 18. There
are also some other unlisted points. But I believe those 18 points are the
most important ones. Without this 3.5% thing, there still might be a second
recession, but the chance is smaller. It might be carefully managed into an
L-shaped recovery. However, I do not see a real significant difference
between an L-shaped recovery and a small double-dip. In some people's minds,
an L-shaped recovery |
|
l*********y 发帖数: 3447 | 36 sorry, change to unlisted |
|