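R********y Posts: 4018 | 1 [The following text is reposted from the SCU board]
From: RainingDay (山寨大王), Board: SCU
Subject: Help with a statistics question
Posted from: BBS 未名空间站 (Fri Apr 22 15:21:16 2011, US Eastern)
I have nothing to offer in return except baozi as thanks.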
Here is the problem:
I have a data set in which each entry has an age, activity, body type, etc.
I now need to decide on a two-tier activity profile. For example, "ten years and newer"
activity would be x hrs/yr and "ten years or older" activity would be y hrs/yr.
My question is: how do I decide the cut-point age?
==================
Sample data set:
entry   body type   activity (hrs/yr)   age
1       x           50                  40
2       y           40                  36
3       z           100                 30
.
.
.
nth
| R********y Posts: 4018 | 2 This is the scatter plot.
| R********y Posts: 4018 | 3 Oh, you mean the linear line?
The graph was generated at a very early stage of the project.
After that I did the geometric mean calculation: I took logs and then de-logged back to
see the distribution, so all the zero-activity data points had to be thrown out.
The STDEV and 95% confidence interval didn't come out well enough to use.
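For anyone following along, here is a minimal sketch of the log / de-log geometric mean described above. It shows why the zero-activity points drop out: log(0) is undefined. The function name `geometric_mean` and the trailing 0 in the data are assumptions for illustration. Only the first three values come from the sample rows.

```python
import math

def geometric_mean(values):
    """Geometric mean via logs: average the logs, then exponentiate back.

    Non-positive values are dropped because log() is undefined for them.
    Returns (geometric mean, number of dropped values).
    """
    positive = [v for v in values if v > 0]
    if not positive:
        raise ValueError("no positive values to average")
    mean_log = sum(math.log(v) for v in positive) / len(positive)
    return math.exp(mean_log), len(values) - len(positive)

# 50, 40, 100 are the sample rows; the 0 is a hypothetical zero-activity entry.
activity = [50, 40, 100, 0]
gm, dropped = geometric_mean(activity)
print(f"geometric mean = {gm:.2f} hrs/yr, dropped {dropped} zero entries")
```

Dropping the zeros shifts the estimate upward, which may be part of why the resulting interval was hard to use.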
That's why I came up with this two-tier approach, but my boss asked me why I picked
10 years instead of, say, 8.
I don't know how to weight each usable data point, because activity is
another variable in this case.
Ugh... headache.
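One possible way to justify a cut age (a sketch, not necessarily what the project should use) is to try every candidate age, split the entries into two tiers, and keep the cut that gives the smallest pooled within-tier sum of squared errors. That is the same criterion a one-split regression tree uses. The names `sse` and `best_cut_age`, the `min_group` parameter, the convention that "newer" means age < cut, and the data below are all assumptions for illustration. The numbers are made up, not from the real data set.

```python
def sse(values):
    """Sum of squared deviations from the arithmetic mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_cut_age(ages, activity, min_group=2):
    """Scan candidate cut ages and return (cut, total_sse, mean_newer, mean_older).

    Assumed convention: 'newer' tier is age < cut, 'older' tier is age >= cut.
    Cuts that leave fewer than min_group entries in a tier are skipped.
    Returns None if no cut satisfies min_group.
    """
    best = None
    for cut in sorted(set(ages)):
        newer = [y for a, y in zip(ages, activity) if a < cut]
        older = [y for a, y in zip(ages, activity) if a >= cut]
        if len(newer) < min_group or len(older) < min_group:
            continue
        total = sse(newer) + sse(older)
        if best is None or total < best[1]:
            best = (cut, total, sum(newer) / len(newer), sum(older) / len(older))
    return best

# Hypothetical data: age in years and activity in hrs/yr.
ages = [2, 4, 6, 8, 10, 12, 14, 16]
activity = [100, 95, 105, 98, 50, 45, 55, 48]

cut, total_sse, x_hrs, y_hrs = best_cut_age(ages, activity)
print(f"cut age = {cut}, newer tier = {x_hrs} hrs/yr, older tier = {y_hrs} hrs/yr")
```

Because this uses arithmetic means, zero-activity entries can stay in the data. Comparing the total SSE at a cut of 8 against a cut of 10 gives a concrete number to show when asked why one age was picked over the other.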