S******y Posts: 1123 | 1 I am using PROC LOGISTIC to model a binary outcome.
I have the observed Y (1 or 0) from the original data.
I also have a predicted probability for each observation (i.e. the predicted
probability of the event Y=1) from PROC LOGISTIC. Let us call it p_hat.
For example, I would have two columns:
Y p_hat
1 0.6
0 0.3
1 0.45
...
I would like to build a 2x2 classification table:
Y=1, Y=0 vs. Y_hat=1, Y_hat=0
to evaluate my classification accuracy.
The challenge is: how do I derive Y_hat from p_hat? what |
b******1 Posts: 367 | 2 There is no universal standard for setting the cutoff point. |
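To make the cutoff idea concrete, here is a minimal sketch in Python (not SAS) that turns p_hat into Y_hat at a chosen cutoff and counts the cells of the 2x2 table. It uses the three rows from the original post. The function names `classify` and `confusion_table`, the `>=` convention, and the two cutoffs are assumptions for illustration only.

```python
# Sketch: turn predicted probabilities into 0/1 calls at a chosen cutoff,
# then count the four cells of the 2x2 classification table.

def classify(p_hat, cutoff):
    """Y_hat = 1 when p_hat >= cutoff, else 0 (the >= convention is an assumption)."""
    return [1 if p >= cutoff else 0 for p in p_hat]

def confusion_table(y, y_hat):
    """Return counts keyed by (observed Y, predicted Y_hat)."""
    table = {(1, 1): 0, (1, 0): 0, (0, 1): 0, (0, 0): 0}
    for obs, pred in zip(y, y_hat):
        table[(obs, pred)] += 1
    return table

# The three rows shown in the original post.
y = [1, 0, 1]
p_hat = [0.6, 0.3, 0.45]

for cutoff in (0.5, 0.4):  # illustrative cutoffs, not recommendations
    table = confusion_table(y, classify(p_hat, cutoff))
    accuracy = (table[(1, 1)] + table[(0, 0)]) / len(y)
    print(cutoff, table, round(accuracy, 3))
```

With a cutoff of 0.5 the third row (Y=1, p_hat=0.45) is called 0, so accuracy is 2/3. With a cutoff of 0.4 all three rows are classified correctly. The table depends on the cutoff, which is why the choice has to come from the problem rather than from the model.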
o****o Posts: 8077 | 3 This really depends on your business context, which determines how the
threshold is defined.
For example, if you want to maximize the Class=1 outcomes captured in the top 30%, then you
can define whoever ranks in the top 30% by p_hat as Y_hat=1, ...
This is one reason I think stats jobs that are integrated with the business can't be outsourced. |
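The top-30% rule described above can be sketched as follows, again in Python rather than SAS. The helper name `flag_top_fraction`, the rounding of the group size, and the ten scores are all made up for illustration.

```python
def flag_top_fraction(p_hat, fraction):
    """Set Y_hat = 1 for the top `fraction` of observations ranked by p_hat."""
    n = len(p_hat)
    # Size of the flagged group; rounding to the nearest integer is an assumption.
    n_top = max(1, round(fraction * n))
    # Indices sorted from highest to lowest predicted probability.
    order = sorted(range(n), key=lambda i: p_hat[i], reverse=True)
    y_hat = [0] * n
    for i in order[:n_top]:
        y_hat[i] = 1
    return y_hat

# Made-up scores for ten observations (illustration only).
p_hat = [0.91, 0.15, 0.62, 0.48, 0.77, 0.05, 0.33, 0.58, 0.21, 0.69]
y_hat = flag_top_fraction(p_hat, 0.30)
print(y_hat)
```

Here the cutoff is implied by the ranking instead of being fixed in advance: the three highest scores (0.91, 0.77, 0.69) are flagged, so the effective threshold sits wherever the 30% boundary happens to fall in the data.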
S******y Posts: 1123 | 4 Thanks to both of you for the prompt help!
Happy Friday! |
S******y Posts: 1123 | 5 Do you also have to come up with a cutoff point for KNN, neural net,
RandomForest and rpart predictions?
Isn't it true that if you treat the binary response as a factor, you can get
predicted 0/1 without making up a cutoff point?
Thanks. |
o****o Posts: 8077 | 6 KNN uses some form of majority vote, right? So there is a natural step function.
RF also relies on a rule similar to majority vote, right?
[In reply to S******y, post 5 above]
|
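A small sketch of the majority-vote point, in Python with a toy one-dimensional KNN (this is not the kknn package, and the function names and data are made up): the 0/1 prediction is the share of neighbours voting 1, thresholded at 0.5. The cutoff is still there, it is just built in.

```python
def knn_vote_share(x_train, y_train, x_new, k):
    """Fraction of the k nearest training points (1-D distance) with label 1."""
    nearest = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x_new))[:k]
    return sum(y_train[i] for i in nearest) / k

def knn_predict(x_train, y_train, x_new, k, cutoff=0.5):
    """Majority vote is the same as thresholding the vote share at 0.5."""
    return 1 if knn_vote_share(x_train, y_train, x_new, k) > cutoff else 0

# Made-up 1-D training data (illustration only).
x_train = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
y_train = [0, 0, 1, 1, 1, 0]

share = knn_vote_share(x_train, y_train, 4.0, k=3)
print(share, knn_predict(x_train, y_train, 4.0, k=3))
print(knn_predict(x_train, y_train, 4.0, k=3, cutoff=0.8))
```

For the new point 4.0, two of the three nearest neighbours have label 1, so the vote share is 2/3 and the default prediction is 1. Raising the cutoff to 0.8 flips the prediction to 0, which shows that the vote share plays the same role as p_hat in the logistic case.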
D******n Posts: 2836 | 7 Check out ROC and AUC.
[In reply to S******y, post 1 above]
|
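For readers new to ROC and AUC, a minimal Python sketch: sweep the cutoff over every distinct p_hat, record the false positive rate and true positive rate at each, and integrate with the trapezoidal rule. The function names and the six data rows are made up for illustration (the first three rows reuse the values from the original post).

```python
def roc_points(y, p_hat):
    """(false positive rate, true positive rate) at each distinct cutoff."""
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    points = [(0.0, 0.0)]  # cutoff above every score: nothing is called 1
    for cutoff in sorted(set(p_hat), reverse=True):
        tp = sum(1 for obs, p in zip(y, p_hat) if p >= cutoff and obs == 1)
        fp = sum(1 for obs, p in zip(y, p_hat) if p >= cutoff and obs == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Made-up data (illustration only).
y = [1, 0, 1, 1, 0, 0]
p_hat = [0.6, 0.3, 0.45, 0.8, 0.5, 0.2]

points = roc_points(y, p_hat)
print(points)
print(auc(points))
```

The ROC curve shows the trade-off across all cutoffs at once, and AUC summarizes the ranking quality of p_hat without committing to any single cutoff. On this toy data 8 of the 9 (Y=1, Y=0) pairs are ranked correctly, so the AUC is 8/9.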
S******y Posts: 1123 | 8 Thank you both!
I tried
factor(out_come) ~ .
in both kknn and logistic, and it seems to work! |