G***G posts: 16778 | 1 When a support vector machine (SVM) is used for model building and prediction,
should the raw data be normalized or transformed first?
I have two columns of data that will be used for training:
age sex
65 1
77 0
88 0
23 1
Do I need to transform the data in some way before training the SVM? |
w**********y posts: 1691 | 2 I think so, although most packages might do this preprocessing automatically. You can
try it and see.
In theory, it seems the results should not differ if you don't use a
kernel (I'm not 100% sure). Even so, scaling can still help the
computation, which is why normalization is commonly recommended, even
for simple linear regression. |
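To make the linear regression remark concrete, here is a minimal sketch using only the Python standard library. It fits an ordinary least-squares line on the raw ages and again on z-scored ages, then compares the predictions. The `target` values and the helper `fit_line` are hypothetical and exist only for illustration. This covers the unregularized least-squares case only; a soft-margin SVM penalizes the size of the weights, so the same invariance should not be assumed there.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

ages = [65, 77, 88, 23]          # age column from the question
target = [1.2, 1.9, 2.1, 0.4]    # hypothetical response values, illustration only

# Fit on the raw ages.
a_raw, b_raw = fit_line(ages, target)
pred_raw = [a_raw + b_raw * x for x in ages]

# Fit on z-scored ages (subtract the mean, divide by the population std).
mu = mean(ages)
sigma = (sum((x - mu) ** 2 for x in ages) / len(ages)) ** 0.5
z = [(x - mu) / sigma for x in ages]
a_z, b_z = fit_line(z, target)
pred_z = [a_z + b_z * x for x in z]

# The coefficients change, but the fitted values agree up to rounding.
max_gap = max(abs(p - q) for p, q in zip(pred_raw, pred_z))
print("slope on raw ages:", b_raw)
print("slope on z-scores:", b_z)
print("largest prediction gap:", max_gap)
```

The slope is rescaled by the standard deviation while the fitted values stay the same, so for plain least squares the benefit of scaling is mostly numerical.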
c****y posts: 94 | 3 Yes, you should definitely normalize your data first. The
main reason is to keep some predictors from being effectively ignored when their
variance is much smaller than that of the other predictors. |
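A rough way to see the "ignored predictor" effect, assuming an RBF kernel of the form exp(-gamma * ||a - b||^2): the sketch below splits the squared distance between the first two rows of the question into its age and sex parts, before and after z-scoring the age column. The `GAMMA` value and the helper names are hypothetical, and leaving the sex column as 0/1 is a choice made for this illustration.

```python
import math
from statistics import mean, pstdev

# Rows from the question: (age, sex)
rows = [(65, 1), (77, 0), (88, 0), (23, 1)]

def squared_terms(a, b):
    """Per-feature contributions to the squared Euclidean distance."""
    return [(x - y) ** 2 for x, y in zip(a, b)]

def rbf_kernel(a, b, gamma):
    """RBF kernel value exp(-gamma * squared distance)."""
    return math.exp(-gamma * sum(squared_terms(a, b)))

# Z-score the age column only; sex stays as 0/1 in this sketch.
ages = [age for age, _ in rows]
mu, sigma = mean(ages), pstdev(ages)
scaled_rows = [((age - mu) / sigma, sex) for age, sex in rows]

raw_terms = squared_terms(rows[0], rows[1])            # [144, 1]
scaled_terms = squared_terms(scaled_rows[0], scaled_rows[1])

# Fraction of the squared distance that comes from age.
raw_age_share = raw_terms[0] / sum(raw_terms)
scaled_age_share = scaled_terms[0] / sum(scaled_terms)

GAMMA = 0.5  # hypothetical kernel width, chosen only for illustration
k_raw = rbf_kernel(rows[0], rows[1], GAMMA)
k_scaled = rbf_kernel(scaled_rows[0], scaled_rows[1], GAMMA)

print("age share of distance, raw:   ", raw_age_share)
print("age share of distance, scaled:", scaled_age_share)
print("kernel value, raw vs scaled:  ", k_raw, k_scaled)
```

On the raw data nearly all of the distance comes from age, so the sex column barely influences the kernel value; after rescaling, both columns contribute.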
G***G posts: 16778 | 4 After normalization, is a further transformation needed, such as a transformation to a
normal distribution?
(In reply to c****y, post 3.)
|
G***G posts: 16778 | 5 The data is already normalized.
What I want to ask is whether a transformation is also needed,
for example a transformation to a normal distribution.
(In reply to w**********y, post 2.)
|
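l*******m posts: 1096 | 6 I have been thinking about this question too. One thing is certain, though: if you are going to apply a distribution transformation,
it has to come before the normalization; otherwise it is wasted effort.
My current thinking is:
If you normalize by subtracting the mean and dividing by the std, it is probably better for the feature to be close to Gaussian.
If you use scaled normalization (normalized to [0, 1]), it is probably better for the feature to be close to
uniform.
My starting point is that any one normalization is optimal for only one kind of r.v., so the transformation and the
normalization should be considered together.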
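The ordering point can be sketched with the age column from the first post. The code below applies a transformation and then each of the two normalizations, and also tries the reverse order. The choice of `math.log1p` as the transformation and the helper names `zscore` and `minmax` are assumptions made for illustration; nobody in the thread proposed a specific transformation.

```python
import math
from statistics import mean, pstdev

ages = [65, 77, 88, 23]  # age column from the first post

def zscore(xs):
    """Subtract the mean and divide by the (population) std."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]

def minmax(xs):
    """Rescale linearly so the smallest value is 0 and the largest is 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

# Transform first, then normalize: the normalization's guarantees hold.
logged = [math.log1p(x) for x in ages]
z_after_log = zscore(logged)     # mean 0, std 1 (up to rounding)
mm_after_log = minmax(logged)    # spans exactly [0, 1]

# Normalize first, then transform: the [0, 1] range is lost again.
log_after_mm = [math.log1p(v) for v in minmax(ages)]

print("z-score after log: mean, std =", mean(z_after_log), pstdev(z_after_log))
print("min-max after log: range =", min(mm_after_log), max(mm_after_log))
print("log after min-max: range =", min(log_after_mm), max(log_after_mm))
```

In the reverse order the largest value becomes log(2) rather than 1, so the normalization would have to be repeated after the transformation.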
(In reply to G***G, post 5.)
|
l*******m posts: 1096 | 7 The sex feature is already binary, so you should not transform it any further.
(In reply to G***G, post 1.)
|