f********e Posts: 34 | 1 You fit a binary logistic regression to classify Z=1 or Z=0. You get the
best-fit regression coefficients Beta_0, Beta_1, Beta_2, and Beta_3 such
that log( P(Z=1)/P(Z=0) )= Beta_0 + Beta_1*A + Beta_2*B + Beta_3*C and the
likelihood of the training set is maximized. A, B, and C are continuous
variables.
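For reference, here is how the model above turns A, B, and C into a predicted probability and class. This is only a minimal sketch: the coefficient values are made up for illustration, and the name `predict_proba` and the 0.5 cutoff are assumptions, not part of the question.

```python
import math

def predict_proba(a, b, c, beta):
    """Return P(Z=1) for inputs A, B, C under the log-odds model."""
    beta_0, beta_1, beta_2, beta_3 = beta
    log_odds = beta_0 + beta_1 * a + beta_2 * b + beta_3 * c
    # Solving log(p / (1 - p)) = log_odds for p gives the logistic function.
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients, for illustration only.
beta = (-1.0, 0.5, 2.0, -0.25)
p = predict_proba(a=2.0, b=0.5, c=2.0, beta=beta)

# Classify as Z=1 when the predicted probability reaches the assumed cutoff.
label = 1 if p >= 0.5 else 0
print(f"P(Z=1) = {p:.4f}, predicted Z = {label}")
```

Here the log odds work out to -1 + 1 + 1 - 0.5 = 0.5, so the predicted class is Z=1.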
Q1: Suppose you are training your logistic regression as described above.
You fit your logistic regression model on the training set and test it on
the same data set. Your accuracy rate is 98%. Yo | s*r Posts: 2757 | 2 When you see the word 'training', you know you are talking with an AI/data
mining guy. They like cross-validation and dividing the sample into a
training set and a held-out set (I forget the name: scoring or validation?).
Just read some introductory material on CART. | s*********e Posts: 1051 | 3 Lucky guy.
They are open-ended questions. | h***i Posts: 3844 | 4 Overfitting?
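To make the overfitting concern concrete: accuracy measured on the same data used for fitting tends to be optimistic, so it helps to compare it with accuracy on rows the model never saw. The sketch below is a rough illustration under stated assumptions: the data are synthetic, the 70/30 split is arbitrary, and `fit_logistic` is a hand-rolled gradient-ascent fit written for this example, not any particular library's routine.

```python
import math
import random

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def fit_logistic(rows, labels, lr=0.1, epochs=500):
    """Increase the training log-likelihood by plain gradient ascent."""
    beta = [0.0] * (len(rows[0]) + 1)  # beta[0] is the intercept
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for x, z in zip(rows, labels):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
            err = z - p  # gradient of the log-likelihood w.r.t. the log odds
            grad[0] += err
            for j, v in enumerate(x, start=1):
                grad[j] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def accuracy(beta, rows, labels):
    hits = 0
    for x, z in zip(rows, labels):
        p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
        hits += int((p >= 0.5) == (z == 1))
    return hits / len(rows)

# Synthetic data: three continuous inputs (A, B, C) and a binary Z.
rng = random.Random(0)
rows = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(100)]
labels = [int(rng.random() < sigmoid(2 * a - b + 0.5 * c)) for a, b, c in rows]

# Rows are already in random order, so a simple slice gives a random split.
split = 70
train_rows, test_rows = rows[:split], rows[split:]
train_labels, test_labels = labels[:split], labels[split:]

beta = fit_logistic(train_rows, train_labels)
train_acc = accuracy(beta, train_rows, train_labels)
test_acc = accuracy(beta, test_rows, test_labels)
print(f"train accuracy: {train_acc:.2f}, held-out accuracy: {test_acc:.2f}")
```

A large gap between the two numbers is the usual sign of overfitting; with real data, cross-validation gives a steadier estimate than a single split.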
| b*******r Posts: 152 | 5 1. Overfitting.
2. A decision tree, like CART. A neural network could be another thing to try, among many
others. | h***i Posts: 3844 | 6 The training accuracy is 98% and the sample size is at least 40, so the sample is not that small.
How was the testing data obtained, and how large is it?
| c*****l Posts: 135 | | y******0 Posts: 401 | 8 1. Overfitting.
2. For missing values in A or B, check the missing ratio first. If more than
50% of a variable is missing, maybe you should drop that variable. Otherwise,
create an indicator for the missing values and use it in the model as an
additional input variable. Impute the missing values using the mean, median,
regression, or multiple imputation, depending on the data structure.
It is hard to find the 'best' imputation method, so you have to try several. |
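The missing-value steps above (check the ratio, add an indicator, impute) can be sketched roughly as follows. This is one possible reading, not a definitive recipe: it uses the median as the fill value, marks missing entries with `None`, and the function name `impute_with_indicator` and its `max_missing_ratio` parameter are hypothetical names chosen for this example.

```python
from statistics import median

def impute_with_indicator(values, max_missing_ratio=0.5):
    """Median-impute one column (None = missing) and build a missing flag.

    Returns None when the missing ratio exceeds the threshold, signalling
    that the caller may prefer to drop the variable instead.
    """
    missing = [v is None for v in values]
    ratio = sum(missing) / len(values)
    if ratio > max_missing_ratio:
        return None
    # Fill value is computed from the observed entries only.
    fill = median(v for v in values if v is not None)
    imputed = [fill if v is None else v for v in values]
    indicator = [int(m) for m in missing]  # 1 where the value was missing
    return imputed, indicator

# Made-up column A with two missing entries out of six.
a = [1.0, None, 3.0, 5.0, None, 7.0]
a_imputed, a_missing = impute_with_indicator(a)
print(a_imputed)  # [1.0, 4.0, 3.0, 5.0, 4.0, 7.0]
print(a_missing)  # [0, 1, 0, 0, 1, 0]
```

Both `a_imputed` and `a_missing` would then go into the model as inputs, so the fit can pick up any signal carried by the missingness itself.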