k******u 发帖数: 250 | 1 数据为csv file
如下:
123,"Harold Wilson",Acct,01/15/1989,$78,123.
128,"Julia Child",Food,08/29/1988,$89,123
007,"James bond",Security,02/01/2000,$82,1000
828,"Roger Doger",Acct,08/15/1999,$39,100
900,"Earl Davenport",Food,09/09/1989,$45,399
906,"James Swindler",Acct,12/21/1978,$78,200
comma是delimiter.
写了程序如下,
data Employ;
infile 'C:employee.txt' dsd;
input ID : $3.
Name : $20.
Depart : $8.
DateHire : mmddyy10.
salary : dollar8.
;
run;
proc print data = employ;
format DateHire mmddyy10.
salary dollar8.
;
run;
但是读取时,最后一个salary是包含“,”,所以读的时候只读到前面2位就停了。这
种情况该如何处理呢? | d******9 发帖数: 404 | 2 It is very annoying to deal with flat files.
One solution is to read in all the other variables using your code EXCEPT
the Salary. And use _N_ to tag the number of observations.
Then use below codes to read in the Salary column: Read the whole record as
one variable, then extract the dollar amount.
data Salary;
length raw $200;
infile cards missover ;
input raw : & $1-200;
N=_N_;
I=find(Raw, '$');
S= substr(Raw, I);
Salary=input(S, dollar12.);
cards;
123,"Harold Wilson",Acct,01/15/1989,$78,123.
128,"Julia Child",Food,08/29/1988,$89,123
007,"James bond",Security,02/01/2000,$82,1000
828,"Roger Doger",Acct,08/15/1999,$39,100
900,"Earl Davenport",Food,09/09/1989,$45,399
906,"James Swindler",Acct,12/21/1978,$78,200
;
run;
Then merge the 2 datasets together by N ! | d******9 发帖数: 404 | 3 A better method:
You can read in the value of Salary into 2 separate vars, then concatenate
them together, and convert it into numeric by Input function.
data Employ;
infile cards dsd missover ;
input ID : $3.
Name : $20.
Depart : $8.
DateHire : mmddyy10.
S1 : $8. S2 : $8. ;
S=strip(S1) || strip(S2);
Salary=input(S, dollar12.);
cards;
123,"Harold Wilson",Acct,01/15/1989,$78,123.
128,"Julia Child",Food,08/29/1988,$89,123
007,"James bond",Security,02/01/2000,$82,1000
828,"Roger Doger",Acct,08/15/1999,$39,100
900,"Earl Davenport",Food,09/09/1989,$45,399
906,"James Swindler",Acct,12/21/1978,$78,200
;
run; | t*****w 发帖数: 254 | 4 the second method is very intuitive. Just add (drop = s1 s2 s) after data
employ. Output looks better. | d******9 发帖数: 404 | 5 Yes.
I did not drop the 3 intermediate vars, just want to show the process.
【在 t*****w 的大作中提到】 : the second method is very intuitive. Just add (drop = s1 s2 s) after data : employ. Output looks better.
| S*x 发帖数: 705 | 6 This is good assuming that no one's salary is greater than $1M
In the case of some salary columns containing 2 or more commans(yep, those 7
figures)
Your first method would work better
【在 d******9 的大作中提到】 : A better method: : You can read in the value of Salary into 2 separate vars, then concatenate : them together, and convert it into numeric by Input function. : data Employ; : infile cards dsd missover ; : input ID : $3. : Name : $20. : Depart : $8. : DateHire : mmddyy10. : S1 : $8. S2 : $8. ;
| d******9 发帖数: 404 | 7 To the point, tks.
7
【在 S*x 的大作中提到】 : This is good assuming that no one's salary is greater than $1M : In the case of some salary columns containing 2 or more commans(yep, those 7 : figures) : Your first method would work better
|
|