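R******d Posts: 1436 | 1 There are 2500 numbers, and each draw takes 7 distinct ones. After 25 draws, how many distinct numbers do we expect to have seen (the numbers within one draw are all different, but two separate draws may pick the same number)? What about 50 draws, or 200?
I have forgotten how to work out the formula. A hint on how to compute this in SAS (without sampling) would be ideal. Thanks.
| g******2 Posts: 234 | 2 http://math.stackexchange.com/questions/72223/finding-expected-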
so the formula should be 7*(1-a^k)/(1-a), where k is the number of draws and a=1-sum_{i=0}^{i=6}(1/(2500-i)). At 25, expectation is 169.2375, at 50, 327.0048, at 200, 1072.772.
| g******2 Posts: 234 | 3 correction: a should be 1-7/2500.
| R******d Posts: 1436 | 4 Thanks. These values look reasonable, but I have a question: here we take 7 out of 2500. When both numbers become very large, for example taking 7 million out of 10 million, this formula no longer seems to be usable.
| g******2 Posts: 234 | 5 why not? picking 7 million from 10 million is equivalent to picking 7 from 10 and setting the multiplier to 7 million. -> 7000000*(1-a^k)/(1-a) where a=0.3.
The following is the general formula:
m*(1-a^k)/(1-a), where you pick m unique values from n and repeat k times, and a=1-m/n.
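As a rough illustration of the general formula above, here is a minimal Python sketch (Python is an assumption here; the thread itself uses SAS). The function name `expected_distinct` and the argument checks are hypothetical and not from the thread. It evaluates m*(1-a^k)/(1-a) with a=1-m/n directly, without sampling.

```python
def expected_distinct(n: int, m: int, k: int) -> float:
    """Expected count of distinct values after k draws of m distinct values out of n.

    Evaluates the thread's formula m*(1-a^k)/(1-a) with a = 1 - m/n.
    The name and the argument checks are illustrative assumptions.
    """
    if not (0 < m <= n) or k < 0:
        raise ValueError("need 0 < m <= n and k >= 0")
    a = 1.0 - m / n  # probability that one given value is missed in a single draw
    return m * (1.0 - a ** k) / (1.0 - a)


if __name__ == "__main__":
    # The three cases asked about in the first post (n=2500, m=7).
    for k in (25, 50, 200):
        print(k, expected_distinct(2500, 7, k))
    # The 7-million-out-of-10-million case, where a = 0.3.
    print(expected_distinct(10_000_000, 7_000_000, 25))
```

Because a depends only on the ratio m/n, scaling m and n by the same factor scales the result by that factor, which is the point made about 7 out of 10 versus 7 million out of 10 million.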
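If you are talking about numerical stability, that's a different topic. There are lots of ways to improve numerical stability.
| R******d Posts: 1436 | 6 Yes, thank you very much, I understand it now. I wrote a script that does the sampling, and its results are very close to what the formula gives.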
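/* Simulate repeated simple random samples and count how many values get drawn */
%macro simu(number, ratio, runs);
  /* Population: one row per value 1..number */
  data data;
    do n=1 to &number.;
      output;
    end;
  run;
  /* Draw the requested number of replicate samples, each without replacement */
  proc surveyselect data=data out=sample method=srs samprate=&ratio. rep=&runs. noprint;
  run;
  /* Count how many replicates each value n appears in */
  proc summary data=sample nway noprint;
    class n;
    output out=counts;
  run;
  /* Values never drawn are absent from counts; give them a frequency of 0 */
  data result(drop=_type_);
    merge data counts;
    by n;
    if _freq_=. then _freq_=0;
  run;
  /* rate = share of values drawn at least once; mean = average times drawn */
  proc sql noprint;
    create table summary as
    select sum(_freq_^=0)/&number. as rate, mean(_freq_) as mean,
           &number. as number, &ratio. as ratio, &runs. as runs
    from result;
  quit;
%mend;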
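%simu(10000000, 0.7, 25);
| w*******9 Posts: 1433 | 7 A simple explanation: the expected value should be 2500*P(the first ball is selected at least once), where P(the first ball is never selected) = (1-7/2500)^25.
| c******d Posts: 5 | 8 The poster above is brilliant! I could only feel that I understood it by going the long way round:
We don't know which ones will be missed, but some always will be. For any particular one, the probability of not being selected in a single draw is (1-7/2500). The 25 draws are independent, so the probability of never being selected in 25 draws is P = (1-7/2500)^25. The probability of being selected at least once is then P' = 1-P, and the expected value over all 2500 is 2500*P' = 2500*[1-(1-7/2500)^25]. First post!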
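The per-number argument in the last two posts can be checked numerically. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, the use of `log1p`/`expm1` is just one possible way to limit rounding error when m/n is very small (it is not something the thread prescribes), and the simulation is a small-scale stand-in for the SAS macro above rather than a port of it.

```python
import math
import random


def expected_distinct_closed_form(n: int, m: int, k: int) -> float:
    """n * (1 - (1 - m/n)^k), the expectation derived in the posts above.

    Written with log1p/expm1 (an assumption, one of several options) so that
    1 - (1 - m/n)^k keeps its precision when m/n is tiny.
    """
    if m == n:
        # Every value is taken in every draw; log1p(-1) would be undefined.
        return float(n) if k > 0 else 0.0
    return -n * math.expm1(k * math.log1p(-m / n))


def simulated_distinct(n: int, m: int, k: int, trials: int, seed: int = 0) -> float:
    """Average number of distinct values seen over `trials` simulated experiments."""
    rng = random.Random(seed)
    total = 0
    for _trial in range(trials):
        seen = set()
        for _draw in range(k):
            # One draw: m distinct values out of 0..n-1, without replacement.
            seen.update(rng.sample(range(n), m))
        total += len(seen)
    return total / trials


if __name__ == "__main__":
    n, m, k = 2500, 7, 25
    print("closed form:", expected_distinct_closed_form(n, m, k))
    print("simulation :", simulated_distinct(n, m, k, trials=2000))
```

The simulated average should land close to the closed-form value, with the gap shrinking as `trials` grows.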