Statistics board - Help with a sample size question
q********n
Posts: 355
1
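Here is a typical sample size question I would like to discuss. We want to
assess the health of a certain type of power equipment across an entire province.
Say the province has 10,000 units and historical data show a pass rate of 80%.
To determine the minimum number of units that must be inspected to represent
the whole population, there is a classic formula (with finite population
correction):
n = z^2*p*(1-p) / (e^2 + z^2*p*(1-p)/N)
  = (1.96^2*0.8*0.2) / (0.02^2 + (1.96^2*0.8*0.2)/10000)
  = 1332 (rounded up)
That is, inspecting 1332 units gives a 95% confidence interval with a margin
of error of ±2% around the 80% pass rate.
The question now is whether it is necessary to sample separately within the
northern and southern regions, and whether it is necessary to sample
separately within each city. My feeling is that one should first use the
historical data to compute the pass rate for each region and each city. If
they match the overall pass rate, separate sampling is unnecessary. But is
there any statistical theory or method that covers this? Thanks!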
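The arithmetic in the formula above can be checked with a short script. This is only a minimal sketch: the function name `sample_size_fpc` and its parameter names are made up for illustration, and it assumes the result is rounded up to the next whole unit.

```python
import math

def sample_size_fpc(N, p, e, z=1.96):
    """Sample size for estimating a proportion, with finite population correction.

    N: population size, p: expected proportion,
    e: margin of error, z: normal quantile (1.96 for 95% confidence).
    """
    v = z**2 * p * (1 - p)          # z^2 * p * (1 - p)
    n = v / (e**2 + v / N)          # the formula from the post
    return math.ceil(n)             # assumption: round up to a whole unit

n = sample_size_fpc(N=10000, p=0.8, e=0.02)
print(n)  # 1332
```

Without the v/N term (a very large population) the same inputs give about 1537 units, so the correction for N = 10,000 saves roughly 200 inspections.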
g******2
Posts: 234
2
This is stratified sampling, a variance reduction technique. If the quantity
being measured is homogeneous across all regions, there is no need for
stratified sampling. If it is heterogeneous across regions but homogeneous
within each region, you gain efficiency by stratifying.
For example, suppose you have 2 regions with 5000 machines each, and the
defective rate is 0.1 in region 1 and 0.3 in region 2 (0.2 overall). In this
case, drawing n samples from each of region 1 and region 2 is more efficient
than drawing 2n samples at random from the whole population. The ratio of the
variances is ((0.9*0.1+0.7*0.3)/2) / (0.8*0.2) = 15/16, which means you need
1/16 fewer samples for the same precision.
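The 15/16 ratio can be reproduced numerically. The sketch below is illustrative only: the function names and the per-region sample size of 500 are assumptions, and the finite population correction is ignored for simplicity, as it is in the ratio above.

```python
def srs_variance(p, n):
    """Variance of a sample proportion under simple random sampling."""
    return p * (1 - p) / n

def stratified_variance(rates, weights, sizes):
    """Variance of the stratified estimator: sum of W_h^2 * p_h * (1 - p_h) / n_h."""
    return sum(w**2 * p * (1 - p) / m for p, w, m in zip(rates, weights, sizes))

rates = [0.1, 0.3]      # defective rate in region 1 and region 2
weights = [0.5, 0.5]    # each region holds half of the machines
n = 500                 # hypothetical sample size per region

p_all = sum(w * p for w, p in zip(weights, rates))   # overall rate, 0.2
ratio = stratified_variance(rates, weights, [n, n]) / srs_variance(p_all, 2 * n)
print(round(ratio, 4))  # 0.9375, i.e. 15/16
```

Setting both rates to the same value makes the ratio 1, which matches the point that stratifying gains nothing when the regions are homogeneous.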
q********n
Posts: 355
3
thanks
