k******w 发帖数: 269 | 1 firm year
A 1990
A 1991
A 1993
b 1978
b 1979
b 1980
.. ...
I want to filter that each firm has at least consecutive 2 years, otherwise
just delete.
how to make it?
thanks |
R*********i 发帖数: 7643 | 2 Do you only want the firm names? It can be done in a few steps. I'm not a
good programmer to have all done in one step. :-)
proc sort data=test out=test1;
by firm year;
run;
*-- Method 1---*;
data test2;
set test1;
by firm year;
retain lastyr;
if first.firm then lastyr=year;
else do;
if lastyr+1=year then flag=1;
lastyr=year;
end;
run;
proc sort data=test2 out=tokeep (keep=firm) nodupkey;
by firm;
where flag;
run;
*-- Method 2---*;
proc transpose data = test1 out = ttes |
k******w 发帖数: 269 | 3 Thanks.It is useful
how about 3 or 4 consecutive year?
【在 R*********i 的大作中提到】 : Do you only want the firm names? It can be done in a few steps. I'm not a : good programmer to have all done in one step. :-) : proc sort data=test out=test1; : by firm year; : run; : *-- Method 1---*; : data test2; : set test1; : by firm year; : retain lastyr;
|
R*********i 发帖数: 7643 | 4 Similarly, in Method 2, modify the do loop for 3 years:
do i=1 to dim(year)-2;
if year[i]+2=year[i+1]+1=year[i+2] then flag=1;
end;
I'll leave the 4 years to yourself
【在 k******w 的大作中提到】 : Thanks.It is useful : how about 3 or 4 consecutive year?
|
w********5 发帖数: 72 | 5 very good.
【在 R*********i 的大作中提到】 : Similarly, in Method 2, modify the do loop for 3 years: : do i=1 to dim(year)-2; : if year[i]+2=year[i+1]+1=year[i+2] then flag=1; : end; : I'll leave the 4 years to yourself
|
S******y 发帖数: 1123 | 6 #Python 2.6
txt ='''
firm year
A 1990
A 1991
A 1993
b 1978
b 1979
b 1980
c 2001
c 2009
d 2004
d 2006
e 1999
f 1990
f 1997
f 1999'''
txt_lst = txt.split('\n')
txt_lst.remove('')
txt_lst.remove('firm year')
def checck_consecutive(ls):
length = len(ls)
for index, item in enumerate(ls):
if index > 0:
if abs(int(item) - int(ls[index-1]))==1:
return True
return False
ls = []
prev_firm_nm = ''
for line in txt_lst:
firm_nm, |
s*********e 发帖数: 1051 | 7 data one;
input firm $1. year;
datalines;
A 1990
A 1991
A 1993
b 1978
b 1979
b 1980
c 2001
c 2009
d 2004
d 2006
e 1999
f 1990
f 1997
f 1999
;
run;
proc sql;
create table
two as
select
*
from
one
where
firm in
(select
distinct a.firm
from
one as a, (select * from one group by firm having year > min(year)) as b
where
a.firm = b.firm and a.year + 1 = b.year
);
quit; |
S******y 发帖数: 1123 | 8 Nice SQL code, statcompute!
#Here is cleaner version of Python code using itertools library
#Python 2.6
import itertools
txt ='''
firm year
A 1990
A 1995
A 1999
b 1978
b 1979
b 1980
c 2001
c 2009
d 2004
d 2005
e 1999
f 1990
f 1997
f 1998'''
txt_lst = txt.split('\n')
txt_lst.remove('')
txt_lst.remove('firm year')
def checck_consecutive(ls):
length = len(ls)
if len(ls) < 2:
return False
for index, item in enumerate(ls):
if index > 0:
|
D******n 发帖数: 2836 | 9 one fits for all, try this one, you only need to change the %years macro
variable to account for whatever consecutive years u want to look at. |