s********k posts: 6180 | 1 Say I have a sentence of 20 words, and word2vec maps each word into a 300-dimensional space. When I feed this into a neural network for training, should I concatenate everything into a single 1*6000 vector, or is a 20*300 matrix the better input? | m****o posts: 182 | 2 It depends on what type of network you want to use. For a CNN, feed in the matrix and convolve over it. For entity name resolution, the usual approach is to take a sliding window of context and train an ordinary MLP, in which case a flat single vector is fine. | m********5 posts: 17667 | 3 Normally you use 20*300 and feed it to a 1D conv. This means each data point lives in 300-D space, and the text is processed word by word.
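A minimal sketch of that setup with NumPy (random embeddings and a 3-word filter width are illustrative assumptions, not part of the original post):

```python
import numpy as np

rng = np.random.default_rng(0)

# 20 words, each mapped to a 300-dimensional word2vec embedding
sentence = rng.standard_normal((20, 300))

# One 1D-conv filter spanning 3 consecutive words at full embedding depth
kernel = rng.standard_normal((3, 300))

# Slide the filter along the word axis: one output per window of 3 words
feature_map = np.array([
    np.sum(sentence[i:i + 3] * kernel)
    for i in range(20 - 3 + 1)
])

print(feature_map.shape)  # (18,)
```

The filter always covers whole word vectors, so the convolution only moves across words, never across embedding dimensions.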
If you instead concatenate into a 6000-D vector, you are treating the text like a 20-gram and arbitrarily defining your 20-gram features as this 6000-D vector, which has appeared in plenty of the literature. You can also use other merging methods, such as averaging or max pooling. Either way, each group of 20 words becomes a single data point in the space.
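The merging options mentioned above can be sketched in a few lines; the shapes below are assumptions matching the example in the question:

```python
import numpy as np

rng = np.random.default_rng(1)
sentence = rng.standard_normal((20, 300))  # 20 words x 300 dims

# Three ways to turn 20 word vectors into one data point:
avg_vec = sentence.mean(axis=0)   # average pooling -> 300-D
max_vec = sentence.max(axis=0)    # max pooling     -> 300-D
flat_vec = sentence.reshape(-1)   # concatenation   -> 6000-D "20-gram" vector

print(avg_vec.shape, max_vec.shape, flat_vec.shape)
```

Averaging and max keep the data point in the original 300-D embedding space, while concatenation moves it into a 6000-D space whose coordinates depend on word order.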
What makes less sense to me is concatenating into a 6000-long 1D vector and then convolving over it. That treats all 300 embedding dimensions as interchangeable features, so the convolution slides across different features the same way it slides across different words. With a suitable stride and filter length it can still give you essentially the same result as 20*300 in some cases, but it is a very convoluted way to think about the problem.
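The "proper stride and filter length" equivalence claimed above can be checked directly: convolving the flattened 6000-D vector with a filter of length 3*300 = 900 and stride 300 reproduces the 3-word windows of the 20*300 case (the 3-word filter width is an assumed example):

```python
import numpy as np

rng = np.random.default_rng(2)
sentence = rng.standard_normal((20, 300))
kernel = rng.standard_normal((3, 300))

# Conv over the 20x300 matrix: each window covers 3 whole words
out_2d = np.array([np.sum(sentence[i:i + 3] * kernel) for i in range(18)])

# Same computation on the flattened 6000-D vector:
# filter length 900, stride 300, so each step jumps exactly one word
flat = sentence.reshape(-1)   # (6000,)
k = kernel.reshape(-1)        # (900,)
out_1d = np.array([np.dot(flat[i * 300: i * 300 + 900], k) for i in range(18)])

assert np.allclose(out_2d, out_1d)
```

With any stride that is not a multiple of 300, the flat filter would straddle word boundaries and mix embedding dimensions across words, which is where the two formulations diverge.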
【Quoting s********k's post】 : Say I have a sentence of 20 words, and word2vec maps each word into a 300-dimensional space. When I feed this into a neural network for training, should I concatenate everything into a single 1*6000 vector, or is a 20*300 matrix the better input?