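s*******d Posts: 1027 | 1 If I want to scrape data from a web page and then, ideally, import it into Excel
(writing it to a txt file would also be fine), what language should I use? JavaScript?
I have only used C/C++, and I am not exactly proficient in it. Thanks a lot |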
w***y Posts: 78 | |
c********g Posts: 449 | 3 Java, C++, etc. BUT NOT JavaScript. |
o*****8 Posts: 192 | 4 Python has some nice open-source libraries for HTML parsing, like http://www.crummy.com/software/BeautifulSoup/ .
It is kind of cool. |
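To make the scrape-then-export idea concrete, here is a minimal sketch in Python. It is not BeautifulSoup itself: to stay runnable without installing anything, it swaps in the standard library's `html.parser`, which is also tolerant of sloppy markup. The names `TableExtractor`, `html_table_to_csv`, and `SAMPLE` are made up for illustration, and the sample HTML is invented data, not a real page.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row being built, or None outside a row
        self._cell = None   # text pieces of the open cell, or None

    def _close_cell(self):
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()   # a new <tr> implicitly ends the previous row
            self._row = []
        elif tag in ("td", "th"):
            self._close_cell()  # a new cell implicitly ends the previous cell
            if self._row is None:
                self._row = []
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def close(self):
        super().close()
        self._close_row()       # flush a row left open at end of input


def html_table_to_csv(html):
    """Return the table cells found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Invented sample with missing </td> and </tr> tags, as real pages often have.
SAMPLE = "<table><tr><td>alpha<td>1</tr><tr><td>beta<td>2</table>"

if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE), end="")
```

Excel opens CSV directly, so writing the returned text to a file opened with `open("out.csv", "w", newline="")` covers the "import into Excel" part. With BeautifulSoup installed, the parsing class could presumably be replaced by something like `BeautifulSoup(html, "html.parser").find_all("tr")`, but that variant is not shown or tested here.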
s*******d Posts: 1027 | |
b******n Posts: 592 | 6 Python is good. HTML scraping is a pain because pages normally don't follow
the rules; tags are missing in many cases. But if you want to fetch from a
quality website, BeautifulSoup is great. I developed the same kind of program for
my previous company, and it worked great.
Another thing: try to limit how often you access the website. Too many threads
may get your IP banned.
[Quoting o*****8's post] : Python has some nice open-source libraries for HTML parsing, like http://www.crummy.com/software/BeautifulSoup/ . : It is kind of cool.
|
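One way to act on the "limit your access" advice is to fetch pages one at a time with a fixed minimum gap between requests, instead of using many threads. The sketch below is only an illustration under assumptions: `Throttle` and `fetch_all` are hypothetical names, and the 2-second interval in the usage note is an arbitrary choice, not a value any particular site documents.

```python
import time
import urllib.request


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests
        self._clock = clock               # injectable so it can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous request

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)    # pause only for the missing time
                now = self._clock()
        self._last = now


def fetch_all(urls, throttle, timeout=10):
    """Fetch URLs sequentially (no threads), pausing between requests."""
    pages = {}
    for url in urls:
        throttle.wait()
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            pages[url] = resp.read()      # raw bytes; decode before parsing
    return pages
```

Usage would look like `fetch_all(list_of_urls, Throttle(2.0))`, where `list_of_urls` is whatever pages you need. A single sequential loop is slower than threads, but it keeps the request rate predictable, which is the point of the advice above.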