b******y Posts: 9224 | 1 Seems like you need to write a simple program and use a stemmer. For
example, the Porter stemmer is well known. |
|
l***d Posts: 1798 | 2 FOR IMMEDIATE RELEASE
National Academy of Engineering Elects 66 Members and 10 Foreign Associates
WASHINGTON — The National Academy of Engineering (NAE) has elected 66 new
members and 10 foreign associates, announced NAE President Charles M. Vest
today. This brings the total U.S. membership to 2,254 and the number of
foreign associates to 206.
Election to the National Academy of Engineering is among the highest
professional distinctions accorded to an engineer. Academy membership honors th... Read full post |
|
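l*******s Posts: 1258 | 3 Here are some ideas.
1. Bag of Words is usually good enough, because the word distributions in book
reviews and novels differ quite a lot, for example in words that express liking
or disliking. You can also add other features, such as text length, whether an
author is given, whether there is a title, and any special punctuation. During
text preprocessing, do not apply a stemmer or lowercase everything, because that
throws away many morphological features; reviews will surely differ from
fiction text in these respects.
Then there is Naive Bayes. Multinomial NB should work better here than
Bernoulli NB: the texts are relatively long and the words used cover a large
share of the vocabulary, so the multinomial model fits better.
When building features, compare binary, count, and TF-IDF to see which works
best. In general, binary works better when texts are short, while TF-IDF tends
toward higher variance and lower bias.
You can also consider adding regularization (L1, L2 and so on) to guard against
bias and variance problems.
Answering the question this way basically covers all the angles.
2. I haven't done a real project on this, so I'm not sure.
3. Consider Ordinal reg... Read full post |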
|
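The Multinomial Naive Bayes suggestion in post 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: whitespace tokenization with no lowercasing or stemming (as the post recommends), add-one smoothing, and a made-up four-document training set. The class, function, and variable names are hypothetical and do not come from the thread.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace split only: no lowercasing and no stemming,
    # so case and word endings survive as features.
    return text.split()


class MultinomialNB:
    """Bag-of-words Multinomial Naive Bayes with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (assumed value)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, doc):
        n_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            denom = total + self.alpha * len(self.vocab)
            # log prior + sum of log likelihoods over token occurrences
            score = math.log(self.doc_counts[c] / n_docs)
            for tok in tokenize(doc):
                if tok in self.vocab:  # ignore unseen tokens
                    score += math.log((self.word_counts[c][tok] + self.alpha) / denom)
            if score > best_score:
                best, best_score = c, score
        return best


# Toy data, invented purely for illustration.
train_docs = [
    "I loved this book , great read !",
    "Boring plot , would not recommend .",
    "He walked slowly along the river at dawn .",
    "She opened the door and the rain came in .",
]
train_labels = ["review", "review", "fiction", "fiction"]

model = MultinomialNB().fit(train_docs, train_labels)
print(model.predict("I would recommend this great book"))
print(model.predict("the rain came at dawn"))
```

Swapping the per-occurrence counts for binary presence or TF-IDF weights, as the post suggests comparing, would only change how each token contributes to the score.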
l***y Posts: 791 | 4 reading, speech, writing, these are all tied together so practicing
one should improve others. as to speech practice, public speeches
are more or less rehearsed (and scripted) so i wouldn't dream of talking
like that, not if i can help it! to me practicing speech would mean casual
talks between friends, and about things I know or have thought about.
confidence is quite a key factor in a verbal performance.
one tends to stammer in a difficult audition no matter what language is
on for the occass |
|
L******r Posts: 199 | 5 I installed Lingua::Stem and found the results quite poor.
It can't even get the base form of properties right; it gave properti.
goes ==> goe. |
|
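The two outputs reported in post 5 are what plural-stripping rules of the kind in step 1a of Porter's algorithm produce. The sketch below implements only that one step; whether Lingua::Stem was applying exactly these rules in the poster's setup is an assumption, and the function name is made up for illustration.

```python
def porter_step_1a(word):
    """Apply only step 1a (plural suffixes) of Porter's algorithm."""
    if word.endswith("sses"):
        return word[:-2]  # SSES -> SS, e.g. caresses -> caress
    if word.endswith("ies"):
        return word[:-2]  # IES -> I, e.g. properties -> properti
    if word.endswith("ss"):
        return word       # SS -> SS, unchanged
    if word.endswith("s"):
        return word[:-1]  # S -> (nothing), e.g. goes -> goe
    return word


for w in ["properties", "goes", "caresses", "feet"]:
    print(w, "->", porter_step_1a(w))
```

A stemmer of this kind aims to map related word forms to the same string, not to produce dictionary words, so outputs like properti and goe are expected behavior. Recovering property or go calls for a lemmatizer instead.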
t*****g Posts: 1275 | 6 Try Martin Porter's? Though it doesn't seem to handle feet/foot. |
|
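Suffix-stripping rules cannot relate irregular pairs such as feet and foot, which is the limitation post 6 points at. One common workaround is to check a small exception table before falling back to suffix rules. The table and function names below are hypothetical, and the fallback rule is deliberately crude.

```python
# Hand-built exception table (hypothetical, for illustration only).
IRREGULAR = {
    "feet": "foot",
    "geese": "goose",
    "mice": "mouse",
}


def strip_plural(word):
    # Crude fallback: drop a final "s" unless the word ends in "ss".
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalize(word):
    # Irregular forms are looked up first; everything else uses the rule.
    if word in IRREGULAR:
        return IRREGULAR[word]
    return strip_plural(word)


for w in ["feet", "foot", "cats", "glass"]:
    print(w, "->", normalize(w))
```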
|
i**i Posts: 1500 | 8 Full-text search won't necessarily work well in this case.
Lucene's stemmers don't make sense here. |
|
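H****S Posts: 1359 | 9 Lucene basically treats a document as a bag of words, so if you want abc to be
at the very beginning of the document, your best bet is to use a term payload.
As for the second question, take a look at the Porter stemmer that Lucene provides. |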
|
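The point in post 9 is that a plain bag of words drops position, so "abc appears first" needs extra per-term data stored in the index. The pure-Python sketch below illustrates that idea with a positional inverted index. It is not Lucene's API, it does not model payloads themselves, and every name in it is made up for illustration.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} instead of just a set of doc ids."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def docs_containing(index, term):
    # What a pure bag-of-words view can answer.
    return sorted(index.get(term, {}))


def docs_starting_with(index, term):
    # Needs the stored positions: keep docs whose first token is the term.
    return sorted(
        doc_id
        for doc_id, positions in index.get(term, {}).items()
        if positions[0] == 0
    )


# Toy documents, invented for illustration.
docs = {"d1": "abc def ghi", "d2": "def abc", "d3": "abc"}
index = build_positional_index(docs)
print(docs_containing(index, "abc"))
print(docs_starting_with(index, "abc"))
```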
h**********r Posts: 671 | 10 Thanks!
I've read a lot of F. Arnold.
The structural side is more than I can handle~~
I also know a few people who worked at Stemmer's Maxygen in its early days~ |
|
q***l Posts: 177 | 11 National Academy of Engineering Elects 66 Members and 10 Foreign Associates
WASHINGTON — The National Academy of Engineering (NAE) has elected 66 new
members and 10 foreign associates, announced NAE President Charles M. Vest
today. This brings the total U.S. membership to 2,254 and the number of
foreign associates to 206.
Election to the National Academy of Engineering is among the highest
professional distinctions accorded to an engineer. Academy membership
honors those who have made outstand... Read full post |
|
g******6 Posts: 782 | 12 http://www.nae.edu/56154.aspx
National Academy of Engineering Elects 66 Members and 10 Foreign Associates
WASHINGTON — The National Academy of Engineering (NAE) has elected 66 new
members and 10 foreign associates, announced NAE President Charles M. Vest
today. This brings the total U.S. membership to 2,254 and the number of
foreign associates to 206.
Election to the National Academy of Engineering is among the highest
professional distinctions accorded to an engineer. Academy membership
honor... Read full post |
|