All topics - Topic: data1
t********m
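Posts: 939
1
[The following text is reposted from the DataSciences board]
From: tulipdream (xiaohuaidan), Board: DataSciences
Subject: An R question: how to rbind a series of data sets, e.g. data1, data2, ..., data1000
Posted at: BBS 未名空间站 (Thu Nov 13 17:35:36 2014, US Eastern)
Is there a simple way to do this, something like rbind(data1-data1000)? Any advice would be much appreciated, thanks.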
k*******a
Posts: 772
2
If data1-data1000 are in a list, just use the do.call function.
Otherwise you can try:
mycall = c("rbind", paste0("data", 1:1000))
mycall = lapply(mycall, as.name)
result = eval(as.call(mycall))
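The name-based lookup above can be mirrored in Python as a rough sketch. It assumes the tables are lists of row dicts stored under the keys "data1", "data2", ... in a dict; the helper name bind_rows_by_name and the sample rows are made up for illustration and are not from the thread.

```python
from itertools import chain

def bind_rows_by_name(namespace, prefix, count):
    """Stack the rows of namespace[prefix + "1"] .. namespace[prefix + str(count)]."""
    # Build each table name and look it up, like paste0() + as.name() in the R reply.
    tables = [namespace[f"{prefix}{i}"] for i in range(1, count + 1)]
    # Concatenate the row lists in order, the analogue of rbind over all tables.
    return list(chain.from_iterable(tables))

# Hypothetical sample tables (lists of row dicts).
tables_by_name = {
    "data1": [{"x1": 1, "x2": 2}, {"x1": 2, "x2": 3}],
    "data2": [{"x1": 5, "x2": 7}],
    "data3": [{"x1": 4, "x2": 6}],
}

combined = bind_rows_by_name(tables_by_name, "data", 3)
print(len(combined))
```

Keeping the tables in one container from the start (the "in a list" case in the reply) avoids the name lookup entirely.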
d********t
Posts: 837
3
Reduce(rbind, list(data1,data2,data3,data4))
example:
Reduce(rbind, list(data.frame(x1=c(1,2,3),x2=c(2,3,4)),
                   data.frame(x1=c(5,6,7),x2=c(7,5,4)),
                   data.frame(x1=c(5,4,3),x2=c(7,6,5))))
t********m
Posts: 939
4
Is there a simple way to do this, something like rbind(data1-data1000)? Any advice would be much appreciated, thanks.
t**********r
Posts: 182
5
Many thanks for your hint!! I made it. Here is the code.
proc sql;
create table data3 as
select data1.*, data2.rating, day1-day2 as diff
from data1, data2
where data1.var1=data2.var1 and data1.var2=data2.var2
and date1-date2>0
group by data1.var1, data1.var2, data1.date1
having diff=min(diff);
NOTE: The query requires remerging summary statistics back with the original
data.
NOTE: Table WORK.data3 created, with 48144 rows and 9
t**********r
Posts: 182
6
Has figured it out. Thanks.
===============
proc sql;
create table data3 as
select data1.*, data2.rating, day1-day2 as diff
from data1, data2
where data1.var1=data2.var1 and data1.var2=data2.var2
and date1-date2>0
group by data1.var1, data1.var2, data1.date1
having diff=min(diff);
NOTE: The query requires remerging summary statistics back with the original
data.
NOTE: Table WORK.data3 created, with 48144 rows and 9 columns.
quit;
NOTE: PROCEDURE SQL used (Tota
t**********r
Posts: 182
7
Want to merge two data sets using proc sql:
Data1:
var1 var2 date1
Data2:
var1 var2 date2 rating
(Note: var1 and var2 are the same variables in these two data sets)
Question:
Select rating in data2 to data1; meeting the following criteria:
1. date1 - date2 >0
2. date1 - date2 has the minimum value.
I wrote the following code; but it won't work:
proc sql;
create table data3 as
select data1.*, data2.rating, date1-date1 as diff
from data1, data2
where data1.var1=data2.var1 and data1.var2=data2.var2
p******p
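Posts: 13
8
From topic: Statistics board - How to write this in SAS proc sql
Tested and working. If you want to overwrite data1, just change new_data1 to data1, although overwriting the source dataset is a bad habit.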
data data1;
input id t1 t2;
datalines;
1 2 3
2 4 5
3 4 5
;
run;
data data2;
input id;
datalines;
1
2
;
run;
proc sql noprint;
create table new_data1 as
select data1.*,coalesce(flag,0) as flag from
data1 left join (select data2.*,1 as flag from data2)
on data1.id=data2.id
;
quit;
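For readers who do not use SAS, the flagging step can be sketched in Python. This is only an illustration of the idea (mark each data1 row with 1 if its id appears in data2, else 0); the function name add_flag is an assumption, and the rows reuse the sample data from the post.

```python
def add_flag(left_rows, right_rows, key="id"):
    """Return copies of left_rows with flag=1 when the key appears in right_rows, else 0."""
    # Collect the keys present on the right side once, for fast membership tests.
    right_keys = {row[key] for row in right_rows}
    # Copy every left row and attach the flag, like coalesce(flag, 0) after the left join.
    return [{**row, "flag": 1 if row[key] in right_keys else 0} for row in left_rows]

data1 = [
    {"id": 1, "t1": 2, "t2": 3},
    {"id": 2, "t1": 4, "t2": 5},
    {"id": 3, "t1": 4, "t2": 5},
]
data2 = [{"id": 1}, {"id": 2}]

new_data1 = add_flag(data1, data2)
for row in new_data1:
    print(row)
```

One difference worth noting: a left join repeats a data1 row once per matching data2 row when data2 has duplicate ids, while this membership test always keeps exactly one row per data1 row.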
t**********r
Posts: 182
9
Want to merge two data sets using proc sql:
Data1:
var1 var2 date1
Data2:
var1 var2 date2 rating
(Note: var1 and var2 are the same variables in these two data sets)
Question:
Select rating in data2 to data1; meeting the following criteria:
1. date1 - date2 >0
2. date1 - date2 has the minimum value.
I wrote the following code; but it won't work:
proc sql;
create table data3 as
select data1.*, data2.rating, date1-date1 as diff
from data1, data2
where data1.var1=data2.var1 a
y******0
Posts: 401
10
proc sql;
create table data3 as
select data1.var1,data1.var2, data2.rating, min(date1-date2) as diff
from data1, data2
where data1.var1=data2.var1
and data1.var2=data2.var2
and date1>date2
group by 1,2,3;
quit;
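The requirement in this thread (for each data1 row, take the rating from the matching data2 row whose date2 is before date1 and closest to it) can be written out step by step in Python. This is a sketch under assumptions: the function name latest_prior_rating and the sample rows are invented, and rows are plain dicts with datetime.date values.

```python
from datetime import date

def latest_prior_rating(data1, data2):
    """Attach to each data1 row the rating of the closest earlier data2 row
    that has the same (var1, var2)."""
    result = []
    for row in data1:
        # Criterion 1: same keys and date1 - date2 > 0.
        candidates = [
            other for other in data2
            if (other["var1"], other["var2"]) == (row["var1"], row["var2"])
            and other["date2"] < row["date1"]
        ]
        if not candidates:
            continue  # no earlier rating exists, so the row is dropped
        # Criterion 2: the smallest date1 - date2 is the latest date2.
        best = max(candidates, key=lambda other: other["date2"])
        result.append({**row, "rating": best["rating"],
                       "diff": (row["date1"] - best["date2"]).days})
    return result

# Hypothetical sample rows.
data1 = [
    {"var1": "a", "var2": "x", "date1": date(2014, 3, 10)},
    {"var1": "b", "var2": "y", "date1": date(2014, 1, 1)},
]
data2 = [
    {"var1": "a", "var2": "x", "date2": date(2014, 1, 5), "rating": "BB"},
    {"var1": "a", "var2": "x", "date2": date(2014, 3, 1), "rating": "A"},
    {"var1": "a", "var2": "x", "date2": date(2014, 4, 1), "rating": "C"},
    {"var1": "b", "var2": "y", "date2": date(2014, 2, 1), "rating": "D"},
]

data3 = latest_prior_rating(data1, data2)
print(data3)
```

If two data2 rows share the same closest date, this sketch keeps only the first one it finds, which may differ from what a SQL query returns.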
s*****n
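Posts: 2174
11
From topic: Statistics board - I "R"ed again today -- thoughts and questions.
1. names(data)[1] <- "newname" is enough. If you don't like using a numeric index, you can also write
names(data)[names(data)=="var1"] <- "newname" or
names(data) <- gsub("var1", "newname", names(data)); either works.
2. What you describe has a precondition: the BY variables must be the same. Consider a merge across data1, data2 and data3, where data1 and data2 are matched on var1 and var2, while data1 and data3 are matched on var3. In this kind of more complicated merge the BY variables differ between each pair of data sets, so it is hard to define one function that handles multiple data sets unless the function itself takes a great many parameters.
3. Apart from SAS, does any other language have this notion of "the most recent data" that you mention?
Is it the one most recently assigned (written), or the one most recently accessed (read)? For example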
data3 <- merge(data1, data2)
print(data2
k*****u
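Posts: 1688
12
From topic: Statistics board - How to force-merge two datasets?
Correct.
I just read about this online yesterday.
With union, if data1 has more variables than data2, all the variable names come from data1.
If data2 has more variables than data1, the leading columns use data1's variable names and the remaining ones use data2's.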
y*****w
Posts: 1350
13
It seems survreg() in R and PROC LIFEREG in SAS run the same type of
survival analysis. However, when I ran both of them on the same survival data set, I
got different results. Both were set to the exponential distribution, and the data are
right censored. See below. Could anybody tell me why the results are
different? Did I miss specifying any important parameters in R? Thanks!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The R code:
survFit <- survreg(Surv(time, event, type="right") ~...
s*****n
Posts: 2174
14
From topic: Statistics board - R: is apply faster than a for loop?
R was originally designed for running statistical work, with convenience as the main consideration, so much of its object handling is so flexible that it sacrifices performance. The simplest example is the data frame, which is actually a very slow structure. For data manipulation, a matrix is often much faster.
data1 <- matrix(rep(0, 1e6), ncol = 1000, nrow = 1000)
data2 <- data.frame(data1)
object.size(data1)
object.size(data2) # almost same size
# 0.42 seconds
for (i in 1:1000){
for (j in 1:1000){
data1[i,j]
}
}
# 33.2 seconds
for (i in 1:1000){
for (j in 1:1000){
data2[i,j]
}
}
That is almost a 100x difference in speed, purely from data frame vs matrix. Both examples contain a double
for loop. In fact fo...
s*******s
Posts: 1031
15
From topic: JobHunting board - My solutions to a few interview algorithm questions.
Following up on my interview write-up.
http://www.mitbbs.com/article_t/JobHunting/32517841.html
I have written up my algorithms for a few of the questions and am sharing them here. Corrections and criticism are welcome.
Many thanks!
1. Write a program that finds the lowest 1000 digits of 5^1234566789893943.
I used recursion plus array-based big-number multiplication.
// Calculate (m^n)%(10^k). Keep the k digits in an array.
// Note: the digits are stored in reverse order in the array
// Assume: m>0, n>0, k>0
// Need to check validity outside of this function.
// call calculate(5, 1234566789893943, 1000) to get result.
// Time complexity: O((log n) * k * k)
// Space complexity: O((log n) * k)
ve...
s*******s
Posts: 1031
16
From topic: JobHunting board - My solutions to a few interview algorithm questions.
Following up on my interview write-up.
http://www.mitbbs.com/article_t/JobHunting/32517841.html
I have written up my algorithms for a few of the questions and am sharing them here. Corrections and criticism are welcome.
Many thanks!
1. Write a program that finds the lowest 1000 digits of 5^1234566789893943.
I used recursion plus array-based big-number multiplication.
// Calculate (m^n)%(10^k). Keep the k digits in an array.
// Note: the digits are stored in reverse order in the array
// Assume: m>0, n>0, k>0
// Need to check validity outside of this function.
// call calculate(5, 1234566789893943, 1000) to get result.
// Time complexity: O((log n) * k * k)
// Space complexity: O((log n) * k)
ve...
y***n
Posts: 1594
17
The original question is here: http://www.mitbbs.com/article_t/JobHunting/32517841.html but I don't think it is quite right.
I really have no idea how to approach these problems.
// Calculate (m^n)%(10^k). Keep the k digits in an array.
// Note: the digits are stored in reverse order in the array
// Assume: m>0, n>0, k>0
// Need to check validity outside of this function.
// call calculate(5, 1234566789893943, 1000) to get result.
// Time complexity: O((log n) * k * k)
// Space complexity: O((log n) * k)
vector calculate(unsigned long m, unsigned long n, int k) {
if(...
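As a cross-check for the "lowest 1000 digits of 5^n" question, here is a short Python sketch. It is not the poster's array-based method: it relies on Python's arbitrary-precision integers and the built-in three-argument pow, which performs modular exponentiation. The helper name last_digits is made up for this example.

```python
def last_digits(m, n, k):
    """Return the lowest k decimal digits of m**n as a zero-padded string."""
    # pow(m, n, mod) reduces modulo 10**k at every squaring step,
    # so the full power m**n is never built.
    return str(pow(m, n, 10 ** k)).zfill(k)

digits = last_digits(5, 1234566789893943, 1000)
print(len(digits))
print(digits[-10:])
```

The zero padding matters because the remainder may have fewer than k digits when the k-th lowest digit of the power happens to be 0.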
m**********2
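Posts: 2252
18
From topic: Database board - Weekend baozi thank-you post and a question for the experts
I have a table like the one below and want a single select that returns the sum of Data1, Data2, ... for each distinct ID. How do I do that?
Original: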
ID Data1 Data2 Data3
1 3341 1926 336
1 972 562 130
1 1580 760 213
1 742 345 87
1 7776 4237 889
2 27129 15347 3002
2 1428 825 139
2 3211 1918 361
2 1114 598 114
2 1219 602 110
3 1238 684 127
After the select:
ID Data1 Data2 Data3
1 14411 7830 1655
2 34101 19290 3726
3 1238 684 127
Thanks!!!
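A per-ID sum like this is normally a GROUP BY query. The sketch below runs one against the rows from the post using Python's built-in sqlite3 module; the table name t is an assumption, and the exact syntax may need small changes for other database systems.

```python
import sqlite3

# Rows copied from the post: (ID, Data1, Data2, Data3).
rows = [
    (1, 3341, 1926, 336), (1, 972, 562, 130), (1, 1580, 760, 213),
    (1, 742, 345, 87), (1, 7776, 4237, 889),
    (2, 27129, 15347, 3002), (2, 1428, 825, 139), (2, 3211, 1918, 361),
    (2, 1114, 598, 114), (2, 1219, 602, 110),
    (3, 1238, 684, 127),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, Data1 INTEGER, Data2 INTEGER, Data3 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# One output row per distinct ID, with each data column summed within the group.
totals = conn.execute(
    "SELECT ID, SUM(Data1), SUM(Data2), SUM(Data3) FROM t GROUP BY ID ORDER BY ID"
).fetchall()
conn.close()

for row in totals:
    print(row)
```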
o*****l
Posts: 539
19
From topic: Linux board - linux mount question
A question for the experts, thanks!
I created a Linux EC2 instance with /dev/sda, /dev/sdb, /dev/sdc.
How can I tell whether /dev/sdb and /dev/sdc are mounted?
If not, how do I mount them?
I tried creating /mnt/data1 and then mounting, but it failed.
$ sudo mount -t ext4 /dev/xvdb /mnt/data1
mount: wrong fs type, bad option, bad superblock on /dev/xvdb,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
[ec2-user@ip-172-31-3-14 data1]$ dmesg | tail
EXT...
z****u
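Posts: 23
20
In MATLAB, how do I run a script that is not in the current working directory?
For example, say the working directory is "C:\Program\Files\MATLAB\R2007b\work", and under \work there are some folders, such as \work\data1, \work\data2 and so on, each containing some scripts, for example \work\data1\code1 and \work\data2\code2. Can I write a script in \work that runs \work\data1\code1 and \work\data2\code2?
Many thanks!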
s******e
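Posts: 2181
21
Thanks for clearing that up. My problem is this: below is a simplified C program that runs in the MATLAB environment.
I want to test whether my input data is read in correctly. One way is to assign the input argument to the output address and inspect the result, which looks correct. The other is to print the input data directly with printf, but the number printed is always 0, whether I use printf("input=%d\n", data2[0]); or printf("input=%d\n", *data2);. This is driving me crazy: what on earth is stored at the address data2?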
#include "mex.h"
#include "gpu/mxGPUArray.h"
#include "cuda.h"
void mexFunction(int nlhs, mxArray *plhs[],
int nrhs, mxArray const *prhs[])
{
double *data1, *data2;
int m,n;
m=mxGetM(prhs[0]);
n=mxGetN(prhs[0]);
plhs[0]=mxCreateDoubl...
c**********e
Posts: 2007
22
From topic: Statistics board - How to the macro regression with if?
Suppose that I have a data set data1, with numerical variables
x, y, z. I would like to do regression y=x if a macro
variable is "a" and do z=x y if the macro variable is not
"a". The following does not work well. How to do it?
%macro regre(var);
%if "&var."="a" %then %do;
proc reg data=data1;
model y=x;
run;
%end;
%else %do;
proc reg data=data1;
model z=x y;
run;
%end;
%mend;
%regre(a);
%regre(b);
s*****n
Posts: 2174
23
1. This is easy with is.element.
data1[!is.element(paste(data1$key1, data1$key2), paste(data2$key1, data2$key2)), ]
2. Just use tapply directly; there is no need for any new variable or index.
tapply(data$z, list(data$x, data$y), sum)
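The two one-liners above correspond to an anti-join on a composite key and a grouped sum over two factors. A rough Python sketch of both follows; the function names and sample rows are invented for illustration, and rows are assumed to be plain dicts.

```python
from collections import defaultdict

def rows_not_in(data1, data2, keys=("key1", "key2")):
    """Keep the data1 rows whose (key1, key2) pair does not occur in data2."""
    seen = {tuple(row[k] for k in keys) for row in data2}
    return [row for row in data1 if tuple(row[k] for k in keys) not in seen]

def cross_sum(rows, x="x", y="y", z="z"):
    """Sum z within every (x, y) combination that occurs in rows."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row[x], row[y])] += row[z]
    return dict(totals)

# Hypothetical sample rows.
data1 = [{"key1": "a", "key2": 1}, {"key1": "a", "key2": 2}, {"key1": "b", "key2": 1}]
data2 = [{"key1": "a", "key2": 1}, {"key1": "c", "key2": 9}]
unmatched = rows_not_in(data1, data2)

data = [
    {"x": "p", "y": "u", "z": 1},
    {"x": "p", "y": "u", "z": 2},
    {"x": "q", "y": "u", "z": 2},
    {"x": "p", "y": "v", "z": 1},
]
sums = cross_sum(data)
print(unmatched)
print(sums)
```

Comparing tuples rather than pasted strings avoids accidental matches when a key value itself contains the separator. Combinations that never occur are simply absent from the result here, whereas tapply reports them as NA.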
y****2
Posts: 34
24
From topic: Statistics board - Another R question - about matrix operations
data <- matrix(c(1 ,1 ,2, 2, 1, 3, 4, 2,1, 5, 6, 3,2, 7, 8, 3,2, 9, 10, 4),
ncol=4, byrow=T)
colnames(data) <- c("id", "x1", "x2", "e")
### step1:
data[,2:3] <- data[,2:3]*data[,4]
data1 <- data[,1:3]
### step2:
data2 <- aggregate(data1[,2:3], list(id=data1[,1]), sum)
### step3:
data3 <- split(data2[,2:3], f=list(data2[,1]))
data3 <- lapply(data3, as.vector, "numeric")
mprod <- function(x){x %*% t(x)}
data4 <- lapply(data3, mprod)
### step4:
data5 <- 0
for(i in 1:length(data4)){
data5 <- data5 +
c*****t
Posts: 1712
25
From topic: Statistics board - SAS code help
data x;
input x $ y $ z;
datalines;
a a 1
a a 2
b a 2
a c 1
;
run;
proc sort data=x; by x y;run;
proc means data=x noprint;
var z;
by x y;
output out=data1(drop=_type_ _freq_) sum=;
run;
proc sort data=data1; by y z;run;
data data2;set data1;by y; if last.y;run;
z***9
Posts: 1052
26
From topic: Statistics board - SAS Question, asking for help
I have a data1:
ID Description
1 aadd
2 adsd
3 asdd
....
Now I also have a data2:
ID Description
1 aaddd
2 adsdq
3 asddg
4 fdsfg
....
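I want to replace the Description in data1 with the Description in data2 wherever the IDs match.
The clumsy approach I thought of is to do a left join and create a new table. Is there a fancier way to directly
update the Description in data1 to the Description in data2?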
t**c
Posts: 539
27
From topic: Statistics board - SAS Question, asking for help
Using MODIFY statement.
Let data1 be the master data set and data2 be the transaction-data-set:
DATA data1;
Modify data1 data2;
By ID;
RUN;
t*****w
Posts: 254
28
From topic: Statistics board - How should I prepare for an R interview?
When I had my job interviews, they always tested my SAS skills. However, I use R
all the time. To help you prepare, read my R code below and see how much of it you
can understand.
%in%
?keyword
a<-matrix(0,nrow=3,ncol=3,byrow=T)
a1 <- a1/(t(a1)%*%spooled%*%a1)^.5 #standardization in discrim
a1<- a>=2; a[a1]
abline(h = -1:5, v = -2:3, col = "lightgray", lty=3)
abline(h=0, v=0, col = "gray60")
abs(r2[i])>r0
aggregate(iris[,1:4], list(iris$Species), mean)
AND: &; OR: |; NOT: !
anova(lm(data1[,3]~data1[,1...
r*******b
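Posts: 78
29
From topic: JobHunting board - Help with an interview question
I ran into a simple interview question that I don't quite understand.
Write a program.
Given an input file with the following content:
blabla
_a_ val1 data1
_a_ val2 data2
_a_ val3 data3
it should produce an output file containing:
val1, val2, val3
data1, data2, data3
You are expected to run it directly as
./program output
You may choose any language.
I have always written my code in C.
Could an expert advise: should I write it in Python and then use bash?
My idea is to use MATLAB and then convert it into a script; I am not sure whether that works.
Thanks for any advice!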
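The task above is small enough to sketch directly in Python. The sketch assumes that data lines start with the tag _a_ followed by exactly one value field and one data field separated by whitespace, and that all other lines are ignored; the function name transpose_tagged_lines is made up, and reading and writing the actual files is left out.

```python
def transpose_tagged_lines(lines, tag="_a_"):
    """Turn '_a_ value data' lines into two comma-separated output lines."""
    values, data = [], []
    for line in lines:
        fields = line.split()
        # Skip anything that is not a tagged three-field line (e.g. "blabla").
        if len(fields) >= 3 and fields[0] == tag:
            values.append(fields[1])
            data.append(fields[2])
    return [", ".join(values), ", ".join(data)]

sample = ["blabla", "_a_ val1 data1", "_a_ val2 data2", "_a_ val3 data3"]
output_lines = transpose_tagged_lines(sample)
print("\n".join(output_lines))
```

In a real program the lines would come from the input file (or standard input) and the two result lines would be written to the output file.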
c**y
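Posts: 419
30
From topic: Stock board - Long/short trading with MultiCharts
MultiCharts can reference data for two stocks, so it can implement simple long/short trading on a pair of stocks.
If the second instrument is a broad-market index ETF or a sector ETF, this becomes an alpha/beta separation trade.
To reference both, create an indicator formula spread=close of data1 - beta * close of data2 ; the spread can then be monitored on a chart.
Because of the following approximation: (1+x)/(1+y)=1+(x-y)
where x and y are the period return% of stocks 1 and 2 respectively,
when beta is taken as 1 we usually just use spread=close of data1 / close of data2, which is more convenient.
The chart below shows the ratio of the long/short pair AAPL (chart 1) vs. RSP (equal-weight S&P 500 index ETF, chart 2), in red in chart 3, and a 2% trailing stop, in yellow in chart 3.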
l*******c
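Posts: 523
31
Suppose there is a function of t:
V(t) = sqrt { (Va)^2 + [(Vb)^2 - (Va)^2]*[1-e^(kt/20)]/(1-e^T)};
Va is a known value;
Vb is also a known value;
k and T are constants.
Then k runs from 1 to 20. Is there a simple algorithm that reduces the amount of computation inside the for loop? Here is the code: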
if (Vb > Va)
deltaDAC = (Vb * Vb - Va * Va)/(1-exp(T));
else
deltaDAC = (Va * Va - Vb * Vb)/(1-exp(T));

if (Vb > Va)
{
for (k=1; k <= 20; k++)
{
Data1 = Va * Va;
Data2 = deltaDAC * (1-exp(k/20));
DAC_Set_Value = sqrt(Data1 + Data2);
DAC_Outp...
z***e
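Posts: 5393
32
I need to set a debug flag in production so that all the details can be logged.
The problem is that "all the details" means a lot of data and a lot of steps.
The simple approach would be:
void debug(boolean debugFlag, String message) {
if (debugFlag) {
log(message);
}
}
Then the callers end up with many calls like these:
debug(flag, "Data1: "+ ...+....+...);
debug(flag, "Data2: "+ ...+....+...);
debug(flag, "Data3: "+ ...+....+...);
But when the debug flag is not set, I don't know whether a string concat like "Data1: "+ ...+....+...
still gets executed. In C/C++ this could simply be inlined, but Java cannot inline.
I changed it to:
void debug(boolean debugFlag, String... message) {
...

and then concat the strings inside it. Is that any better?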
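As far as I know, Java evaluates argument expressions before the method is called, so the concatenation runs whether or not the flag is set; the usual way around that is to pass something that builds the message only on demand. The same idea is sketched here in Python rather than Java, with invented names (debug, build_message) and a list standing in for the real logger.

```python
logged = []        # stands in for the real log sink
build_calls = []   # records each time the message is actually built

def debug(debug_flag, make_message):
    """Log only when debug_flag is set; make_message is a zero-argument callable."""
    if debug_flag:
        logged.append(make_message())

def build_message():
    # The potentially expensive concatenation lives inside the callable.
    build_calls.append(1)
    return "Data1: " + ", ".join(str(i) for i in range(5))

debug(False, build_message)   # flag off: build_message is never called
debug(True, build_message)    # flag on: the message is built once and logged
print(len(build_calls), logged)
```

The trade-off is a little extra ceremony at every call site in exchange for doing no message-building work while the flag is off.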
l*******c
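Posts: 523
33
Suppose there is a function of t:
V(t) = sqrt { (Va)^2 + [(Vb)^2 - (Va)^2]*[1-e^(kt/20)]/(1-e^T)};
Va is a known value;
Vb is also a known value;
k and T are constants.
Then k runs from 1 to 20. Is there a simple algorithm that reduces the amount of computation inside the for loop? Here is the code: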
if (Vb > Va)
deltaDAC = (Vb * Vb - Va * Va)/(1-exp(T));
else
deltaDAC = (Va * Va - Vb * Vb)/(1-exp(T));

if (Vb > Va)
{
for (k=1; k <= 20; k++)
{
Data1 = Va * Va;
Data2 = deltaDAC * (1-exp(k/20));
DAC_Set_Value = sqrt(Data1 + Data2);
DAC_Outp...
V********n
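Posts: 3061
34
Then merge the following three steps into one. That avoids creating and addressing two intermediate variables, which helps reduce the time:
Data1 = Va * Va;
Data2 = deltaDAC * (1-exp(k/20));
DAC_Set_Value = sqrt(Data1 + Data2);
Change it to:
DAC_Set_Value = sqrt(Va * Va + deltaDAC * (1-exp(k/20)));
Honestly, programming these days rarely needs to worry about differences this small. If you really are squeezing time down to the
nanosecond level, doing this may help a little.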
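Beyond merging the three statements, the loop-invariant parts can be computed once outside the loop, and e^(k/20) can be built up by repeated multiplication instead of calling exp on every pass. The Python sketch below follows the formula in the question with t = 1 and is only an illustration: the function name dac_values and the sample numbers are assumptions. It uses floating-point division for k/20; in C, if k is an integer type, k/20 would truncate.

```python
import math

def dac_values(va, vb, t_const, steps=20):
    """Return sqrt(va^2 + delta * (1 - e^(k/steps))) for k = 1..steps."""
    va_sq = va * va                                    # loop invariant, computed once
    delta = (vb * vb - va_sq) / (1.0 - math.exp(t_const))
    ratio = math.exp(1.0 / steps)                      # e^(1/steps), the only exp call per run
    growth = 1.0
    values = []
    for _ in range(steps):
        growth *= ratio                                # now equals e^(k/steps)
        values.append(math.sqrt(va_sq + delta * (1.0 - growth)))
    return values

values = dac_values(1.0, 3.0, 1.0)
print(values[0], values[-1])
```

With only 20 points, another option is to precompute the 20 factors 1 - e^(k/20) into a table once, since they do not depend on Va or Vb.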

l*******c
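Posts: 523
35
Suppose there is a function of t:
V(t) = sqrt { (Va)^2 + [(Vb)^2 - (Va)^2]*[1-e^(kt/20)]/(1-e^T)};
Va is a known value;
Vb is also a known value;
T is a constant.
Then k runs from 1 to 20. Is there a simple algorithm that reduces the amount of computation inside the for loop? Here is the code: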
if (Vb > Va)
deltaDAC = (Vb * Vb - Va * Va)/(1-exp(T));
else
deltaDAC = (Va * Va - Vb * Vb)/(1-exp(T));

if (Vb > Va)
{
for (k=1; k <= 20; k++)
{
Data1 = Va * Va;
Data2 = deltaDAC * (1-exp(k/20));
DAC_Set_Value = sqrt(Data1 + Data2);
DAC_Output(...
q**j
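Posts: 10612
36
From topic: Statistics board - I "R"ed again today -- thoughts and questions.
1. The problem is not that it can't be done, but that it is inconvenient. names(data.frame) has to change all the names at once. What if there are 20 variables? Most people would find that too much trouble. If there were something like names(data.frame$var1) = "newname", how nice that would be.
2. SAS has the in= option, which solves a lot of problems. in= is 0 or 1, which gives exactly 2^n combinations. SAS is also fairly humble: it brought in sql as well, which is very convenient for users. Why doesn't R consider sql compatibility?
3. R could easily do this: if attach(data1) is called, data1 is the default. If there is no attach(), by default attach the most recently used data set. It is a pretty simple thing.
Also, may I ask whether tapply can analyze several columns at the same time?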
state <- c("tas", "sa", "qld", "nsw", "nsw", "nt", "wa", "wa",
"qld", "vic", "nsw", "vic", "qld", "qld", "sa", "tas",
"sa", "nt", "wa", "vic", "qld", "nsw", "nsw", "wa",
"sa", "act", "ns
g*******y
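Posts: 380
37
From topic: Statistics board - A SAS application question
Honestly, I don't understand what you are saying.
First, what do you mean by the retained values? Do you mean the num1 and num2 of the group in data1 whose count is the largest?
Second, I don't see why you are considering such complicated methods. Judging from your sample, each group is simply repeated according to count: whatever count is, that is how many repeated observations the group has. If you only want to use the num1 and num2 from the group with the largest count to filter the observations in data2, I think the simplest method is:
proc sort data=data1; by descending count;
Then use the values from the largest-count group, say x and y, to drop the observations you don't want.
data temp; set data2; if num1=x and num2=y;
If this is not what you want, then maybe I have misunderstood your question.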
a****m
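Posts: 693
38
Thanks. The main issue is that there are too many rows. The first 62 numbers (row wise) form one group, and permuting the matrix would mix up all the
rows. Also, when the txt was converted to csv, the blanks were turned into 0, and genelowvalfilter doesn't seem to be able to
remove them, so I had to handle it this way:
Thanks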
[n,p]=size(data);
for i=1:2000
gene(i,:)=reshape(data(((i-1)*16+1):16*i,1:4),1,64);
end
data1=gene(:,1:47);
data2=gene(:,49:63);
data=[data1,data2];
This filter has no effect.
[mask,genes] = genelowvalfilter(gene,'absval',1);
i**f
Posts: 1195
39
No need to read the text, just look at the data.
Convert the format of data1 into that of data2, adding ADDL and II (the number of additional doses and the dosing interval).
data1
D Date Time Event Glucose
101 01Jan2000 7:30 Sampling .
101 01Jan2000 8:00 Dosing .
101 01Jan2000 8:30 Sampling 100
101 01Jan2000 9:30 Sampling 200
101 01Jan2000 12:30 Sampling 170
101 01Jan2000 18:30 Sampling 140
101 02Jan2000 7:30 Sampling 90
101 02Jan2000 8:00 Dosing .
101 03Jan2000 8:00 Dosing .
101 04Jan2000 8:00 Dosing .
101 05Jan2000 8:00 Dosing .
data2
l******1
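Posts: 86
40
From topic: Statistics board - Solving an equation in SAS: how to restrict the range of the root
I want to solve a cubic equation with SAS, but I need the root to lie between 0 and 1. How can I do that?
The code is below.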
data data1;
input p_ta;
datalines;
0.975
;
proc model data=DATA1;
eq.a=(1-ta)**3+3*ta*(1-ta)**5-p_ta;
solve ta /onepass solveprint;
run;
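One generic way to keep a root inside a given interval, independent of SAS, is to bracket it and bisect. The Python sketch below applies that to the equation in the post with p_ta = 0.975 on [0, 1]; it is not a PROC MODEL solution, and the function names are invented. Bisection needs the function to change sign over the interval, which holds here because the left side is positive at ta = 0 and negative at ta = 1.

```python
def f(ta, p_ta=0.975):
    """Left-hand side of the equation from the post, as a function of ta."""
    return (1 - ta) ** 3 + 3 * ta * (1 - ta) ** 5 - p_ta

def bisect_root(func, lo, hi, tol=1e-12, max_iter=200):
    """Find a root of func in [lo, hi] by bisection; func must change sign there."""
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi > 0:
        raise ValueError("func must change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_mid == 0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                  # the sign change is in the left half
        else:
            lo, f_lo = mid, f_mid     # the sign change is in the right half
    return (lo + hi) / 2

root = bisect_root(f, 0.0, 1.0)
print(root)
```

Bisection returns one root inside the bracket; if the equation had several roots in [0, 1], narrower brackets would be needed to separate them.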
s*****n
Posts: 2174
41
It is really just a loop with a conditional inside it. Implemented in R, it takes only a dozen or so lines.
data <- read.table(...)
result <- data.frame(try = 1:1000, output = NA, case = NA)
for (i in 1:1000){
data1 <- data[sample(100000, 10000), ]
data2 <- data[sample(100000, 10000), ]
if (mean(data1$var1) > 0){
fit1 <- lm(...)
result$output[i] <- functionA(data2, fit1$parameter_a)
result$case[i] <- "A"
} else {
fit2 <- glm(...)
result$output[i] <- functionB(data2, fit2$parameter_b)
result$case[i] <- "B"
}
}
hist(result$output[
w********5
Posts: 72
42
From topic: Statistics board - A question about SAS programming
This is my answer. My code is always very long and not efficient. Please
help simplify it.
data data1;
input var1;
cards;
5
6
;
run;
data data2;
input var2;
cards;
5
6
;
run;
data new;
infile datalines dlm=" ";
input name $ var $ ;
datalines;
data1 var1
data2 var2
data2 var2
data4 var4
;
run;
proc sql;
select name into:name1-:name&SYSMAXLONG
from new;
select var into:col1-:col&&SYSMAXLONG
from new;
quit;
%put _user_;
option mprint mlogic;
%macro mutiple;
%do i=1 %to &sqlobs;
proc so
s********l
Posts: 245
43
From topic: Statistics board - help need for SAS macro
The code I wrote is:
%macro data(num);
%do i=0 %to &num;
data est&num;
infile "path\data&num";
input a b c d;
run;
proc append base=data1 data=est&num;
run;
%mend;
%data(num=100);
With the above program, I only get data1 combined with
data100. What's wrong with my program? I really need your help! Many
thanks.
s********p
Posts: 637
44
She is pulling data from large tables; in most cases she creates a new
dataset containing the pulled variables for further analysis.
I am not sure whether temporary files are created when no new dataset is needed. I
don't use the system option (fullstimer), but I just tried the following,
proc sql;
select p1.id from
data1 p1
inner join
data2 p2 on
p1.id=p2.id
;
quit;
and check if temporary files generated and found "#tf0005.sas7butl" created and size changed.
ll /saswork/SAS_workDD3300003446
total 7000
-rw-rw-r-- ...
c*****m
Posts: 4817
45
From topic: Statistics board - R data.frame
if the width is fixed, then you can use read.fwf, for example
> data1 = read.fwf("M:/test.txt", widths=c(1,1,1,1,1))
> data1

V1 V2 V3 V4 V5
1 A B C D E
2 A B C D E
3 A B C D E

k*****u
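Posts: 1688
46
From topic: Statistics board - Borrowing this thread's popularity to ask two questions:
1: In SAS, I build a model on data1. How do I apply the fitted parameters to data2 for prediction? The crude approach I thought of is to write the
parameters down and then do it in a data step. What is the proper way to do this in practice?
2: In data1 the variable x takes the values CA, OR, WA, and I turned x into dummy variables. But in data2, x takes NV and TX in addition to those 3
values. In that case, should x still go into the model, or should it just be dropped? How is this
usually handled in practice?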
f******u
Posts: 250
47
From topic: Statistics board - A sincere request for help with a macro problem, very puzzling
you do not need a macro.
data data1(keep=x);
input variableName $14. variableLength : $12. price;
x=catx(' ', variableName,variableLength,price);
datalines;
FundingAmount 0-15000 0.217
ObjectAmount 15000-30000 0.318
FundingAmount 30000-60000 0.519
;
run;
data data2(keep=x);
input firstname : $7.lastname : $7. age;
x=catx(' ',firstname,lastname,age);
datalines;
Ashley Liu 17
brandon green 30
susan Chen 28
;
run;
data data3;
set data1 data2;
run;
data _null_;
fil...
A****1
Posts: 33
48
From topic: Statistics board - SAS experts, please help
If only a few outliers, you can use the annotate facility.
data my_labels;
retain xsys ysys '2' function 'label' position '1' style "'Arial/bo'"
color 'blue';
set data1;
if read>75 then do;
text='Value1'; x=write; y=read; output;
end;
run;
proc gplot data=data1;
plot read*write/annotate=my_labels;
run;
quit;
