D******n posts: 2836 | 1 I need a count of the missing values for each variable across the whole dataset.
The output should be a table:
_var_ _missfreq_
x 11
y 222
z 33
... ... | l*********s posts: 5409 | 2 A crude solution: transpose, then use the nmiss function. | R*********i posts: 7643 | 3 I am not sure what the question is.
PROC FREQ with the MISSING option on the TABLES statement, run for each variable,
will summarize the number of missing values in that variable. You can then select the
records holding the missing counts from the output datasets and combine them
to generate an output table like the one in your post. | D******n posts: 2836 | | s*****r posts: 790 | 5 Because the variables may be either numeric or non-numeric, consider
first creating auxiliary numeric variables with the missing function, then
running PROC MEANS on all the auxiliary variables.
You can write a macro to do it automatically, something like:
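%macro missing(indata, outdata);
  %local i _nvars _name_list;
  /* Collect the variable names of the input dataset. */
  proc contents data=&indata out=_tmp(keep=name) noprint;
  run;
  proc sql noprint;
    select name into :_name_list separated by ' '
    from _tmp;
  quit;
  %let _nvars = %sysfunc(countw(&_name_list, %str( )));
  /* One 0/1 indicator per variable; missing() accepts numeric and character. */
  data _t1;
    set &indata;
    %do i = 1 %to &_nvars;
      _aux&i = missing(%scan(&_name_list, &i, %str( )));
    %end;
    keep _aux1-_aux&_nvars;
  run;
  /* Summing the indicators gives the missing count for each variable. */
  proc means data=_t1 noprint;
    var _aux1-_aux&_nvars;
    output out=&outdata(drop=_type_ _freq_) sum=&_name_list;
  run;
%mend missing;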
| o****o posts: 8077 | 6 Use a hash table to store the missing counts for both numeric and character variables.
The hash key is the variable's name, and the hash data are the name and the count.
Set up array _n{*} _Numeric_;
array _c{*} _Character_;
Loop through both arrays, using vname to get the name of each array element
and missing() to check whether it is missing, regardless of whether the variable
is character or numeric. If this is the first time the variable is encountered, add it to
the hash; otherwise replace its entry.
Done.
The output format is better than that of PROC FREQ with nlevels.
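The hash approach described above could be sketched roughly as follows. This is only a sketch under assumptions: the dataset names have and miss_counts are hypothetical, the helper variables _var_, _missfreq_, _i and _last are assumed not to exist in the input, and the input is assumed to have at least one observation.

```sas
/* Sketch only: "have" and "miss_counts" are hypothetical dataset names. */
data _null_;
  if 0 then set have;            /* load variable definitions, read no data  */
  array _n{*} _numeric_;         /* declared before any helper variable, so  */
  array _c{*} _character_;       /* the arrays hold only the input variables */
  length _var_ $32 _missfreq_ 8;

  declare hash h(ordered: 'a');
  h.defineKey('_var_');
  h.defineData('_var_', '_missfreq_');
  h.defineDone();

  do until (_last);
    set have end=_last;
    do _i = 1 to dim(_n);
      _var_ = vname(_n[_i]);
      if h.find() ne 0 then _missfreq_ = 0;   /* first time this name is seen */
      _missfreq_ = _missfreq_ + missing(_n[_i]);
      h.replace();                            /* add or update the entry      */
    end;
    do _i = 1 to dim(_c);
      _var_ = vname(_c[_i]);
      if h.find() ne 0 then _missfreq_ = 0;
      _missfreq_ = _missfreq_ + missing(_c[_i]);
      h.replace();
    end;
  end;

  h.output(dataset: 'miss_counts');   /* one row per variable: _var_, _missfreq_ */
  stop;
run;
```

The arrays are declared right after the `if 0 then set` so that the helper variables created later do not end up inside `_numeric_` or `_character_`.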
| D******n posts: 2836 | 7 Yes, I remember your method; it works great for character variables.
I just figured out that for numeric variables, PROC MEANS plus PROC TRANSPOSE
can do the job, and it is much simpler.
For character variables, I am still wondering whether there is a simpler way
than yours... lol
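For the numeric case, the PROC MEANS plus PROC TRANSPOSE route might look like the sketch below. The dataset names have, _nmiss and miss_num are hypothetical, and the renames simply match the column names in the table requested in the first post.

```sas
/* Sketch only: "have", "_nmiss" and "miss_num" are hypothetical names. */
proc means data=have noprint;
  var _numeric_;
  /* nmiss= with no names keeps the original variable names */
  output out=_nmiss(drop=_type_ _freq_) nmiss=;
run;

/* Turn the single wide row into one row per variable. */
proc transpose data=_nmiss
               out=miss_num(rename=(_name_=_var_ col1=_missfreq_));
run;
```

This covers numeric variables only, since PROC MEANS does not accept character variables in the VAR statement.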
| o****o posts: 8077 | 8 Thinking outside the box, you can
first transpose, then use PROC FREQ:
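/* Collapse character values into two groups: missing vs. non-missing. */
proc format;
  value $charmiss
    ' '   = 'missing'
    other = 'Non-Miss'
  ;
run;

/* Test data: 20 rows, 5 character variables, roughly 20% blank. */
data test;
  array _c{*} $ c1-c5;
  do id = 1 to 20;
    do j = 1 to dim(_c);
      if ranuni(5555) < 0.2 then _c[j] = ' ';
      else _c[j] = 'A';
    end;
    output;
  end;
  drop j;
run;

/* View that supplies a row id for BY processing. */
data testv / view=testv;
  set test;
  id = _n_;
run;

/* One row per (id, variable), with the value in col1. */
proc transpose data=testv out=test_t;
  by id;
  var c1-c5;
run;

/* Count missing vs. non-missing values for each variable. */
proc freq data=test_t noprint;
  table _name_*col1 / missing out=_char;
  format col1 $charmiss.;
run;

/* Print the counts to the log. */
data _null_;
  set _char;
  put _all_;
run;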
| D******n posts: 2836 | 9 Nice solution!
But I wonder how efficient this is compared to the hash one. | o****o posts: 8077 | 10 The hash solution is the fastest and most flexible.