All topics - Topic: scrapes
i**i
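Posts: 1500
1
From thread: Programming board - Are there any convenient APIs or frameworks for web scraping?
https://github.com/cheeriojs/cheerio works very well.
http://phantomjs.org/ I haven't used it, but it's quite capable.
c********l
Posts: 8138
2
From thread: Programming board - Are there any convenient APIs or frameworks for web scraping?
Isn't Selenium's core just HtmlUnit?
c********l
Posts: 8138
3
From thread: Programming board - Are there any convenient APIs or frameworks for web scraping?
PhantomJS has a very good reputation.
w****k
Posts: 6244
4
From thread: Programming board - Are there any convenient APIs or frameworks for web scraping?
scrapy + beautifulsoup4 in python
t**r
Posts: 3428
5
From thread: Programming board - Are there any convenient APIs or frameworks for web scraping?
Nice, I was just about to look for one.
l****t
Posts: 228
6
From thread: Programming board - Are there any convenient APIs or frameworks for web scraping?
Yeah, for dynamic pages or pages with a lot of interaction, PhantomJS is good.
If it's just static pages, Python's BeautifulSoup will do the job.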
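For the static-page case mentioned above, a minimal sketch may help. It uses the standard library's html.parser as a stand-in for BeautifulSoup so that it runs without extra installs; the class name, function name and sample HTML are all made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a href=...> in a static page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return a list of (link text, href) tuples found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample page, not taken from any real site.
SAMPLE_HTML = """
<html><body>
  <h1>Threads</h1>
  <a href="/t/1">First thread</a>
  <p>No link here</p>
  <a href="/t/2">Second <b>thread</b></a>
</body></html>
"""

if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(href, text)
```

Pages that build their content with JavaScript would not work with this approach, which is where a headless browser comes in.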
c********g
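Posts: 1173
7
[The following text is reposted from the Apple board]
From: cosmorning (Sleeping pig), Board: Apple
Subject: If you can't get an iPhone 6/6+, look here
Posted at: BBS 未名空间站 (Mon Oct 27 12:17:54 2014, US Eastern)
I wrote an app that scrapes the inventory of nearby Apple Stores. As soon as stock shows up, it can send you a realtime
notification and an email. I used this tool myself to get two iPhone 6/6+ units.
I've just open-sourced the app; you can download it here, then build and install it yourself:
http://github.com/ychw/iPhone6Radar
Because the app scrapes Apple's web pages, it can't be put on the App Store. If you can't
build it yourself, you'll have to ask a friend for help. Alternatively, you can use istock.us, but it can't filter
results by distance.
H******7
Posts: 1728
8
Is it illegal to use someone else's data and combine it with your own service?
W***o
Posts: 6519
9
Google does this all the time, so what's the fuss?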
c******n
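Posts: 16666
13
From thread: Programming board - A beginner's question about web scraping
If there's no API, just mimic user behavior.
Use selenium or headless chrome.
Then you'll need to go through proxies, or spin up a bunch of machines on their AWS and rotate among them.
I'm sure Amazon has people in-house guarding against this.
They'll ban your IP or feed you fake data based on your access pattern.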
l****r
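Posts: 119
14
From thread: Programming board - How far am I from successfully switching to a coding career?
I switched to coding from EE and graduated last year. I couldn't find a software engineering job at a company, so I spent half a year as a (low-paid) postdoc
at a hospital, nominally researching machine learning. Some impressions:
I got access to real medical data. The database holds patient visit records, including medications and lab values, but the data is
messy and arcane. Without medical domain knowledge I didn't know what the drug names and lab indicators meant, and learning on the fly always felt
far from enough. So I did whatever the boss told me to do. I mainly used python, pandas and sklearn, with R for some
problems. The research questions felt fairly trivial, not really machine learning but simple arithmetic
analysis. The problems were not defined very clearly either.
I also did a few small NLP projects, clustering papers, which involved:
web scraping: pulling papers off the web with python, beautiful soup and asyncio
feature extraction: using the web API of the Medical Text Indexer (MTI) to find the keywords
of medical papers
machine learning algorithms: LDA and k-means, both called from the sklearn library
I got quite fluent with pandas, all kinds of groupby and apply. But looking at job descriptions, many require knowing
tensorf...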
s*******e
Posts: 23
15
From thread: Biology board - Re: who makes the Competent cell called
no,
at the same time, label different regions of a fresh LB plate.
soon after you mix your bacteria in TE, scrape your toothpicks on
the LB plate, so you will have a backup plate.
on a single LB plate, I can back up 8-16 colonies, so I put the
same number of drops of TE on parafilm.
h**********r
Posts: 671
16
From thread: Biology board - yeast miniprep
Same as above, here is another method.
Yeast “smash-and-grab” DNA prep:
• pipette 1-2 ml YPD onto transformation plate; scrape colonies off
with the end of a glass slide and pipette into a microfuge tube
• spin down 15 sec; pipette off sup; if cell pellet is more than 50-75
μl, remove excess and discard
• to cell pellet add 0.2 ml lysis buffer, 0.2 ml phenol/CHCl3 and 0.3
g 0.45-0.5 mm glass beads* (I use calibrated scoop made from cut microfuge
tube pierced with syringe needle) – seal carefully because be...
b******r
Posts: 111
17
Note: NHEs are membrane proteins that span the membrane 12 times. NHE5 is
localized to the plasma membrane; NHE6-9 reside on organellar membranes
(endosome, Golgi).
1. Transfect pcDNA NHE5-9 into 293 cells using Fugene 6 (ratio: 3 uL of Fugene per 2 ug of
DNA). After three days, collect the cells.
2. Wash the cells with cold PBS. Add 150 uL of lysis buffer (final concentration:
50 mM Tris pH 7.4, 150 mM NaCl, 1% Triton, 0.1% SDS, cocktail inhibitor)
into each well of a 6-well plate. Scrape the cells.
3. The lysate gets st...
m**********d
Posts: 137
18
IHC and IF show that protein A is highly expressed in the nuclei of tumor cells. Protein A's canonical function is to
bind and activate protein B. Some preliminary data suggest that protein A has a new function: binding protein C
to regulate another important biological process.
We now need evidence that A and C interact directly. Below is the CO-IP protocol using Dynal magnetic beads:
10μl Dynal beads coated with protein A + 10μl Dynal beads coated with
protein G per reaction; wash three times with ice-cold PBS plus 5% BSA.
Incubate with antibody against protein A (20μl, approximately 4μg) in 500μl
PBS plus 5% BSA, overnight.
2×10^6 cells/10cm plate, treatment (which induces A and C interaction while
...
n*******n
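Posts: 515
19
There are many experts on this board; if you know the answer, please share.
I do organic synthesis and also culture some cells, mainly cancer cell lines.
This time my boss assigned me a project that needs primary neuron cells. I've never done this before. I've tried a few
times, but whenever I harvest the neuron cells with trypsin or by scraping, either they won't come off or
60-70% of the cells are already dead...
Could anyone with experience give me some pointers? A reward (baozi) is on offer. Thanks.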
K****n
Posts: 5970
20
Yeah, this sounds like something you could do by scraping the linkedin api
d*******s
Posts: 24
21
This seems to be a problem everyone doing wound healing assays faces; I don't know whether it has been reported in the literature.
Here I came up with an idea to deal with this problem. I don't know if it will
work, so it's up to you:
Rather than scraping the cells with a pipette tip or another tool to
generate the gap (or wound), half a plate of cells can be removed. Then
growth of the cells into the empty area may represent cell
proliferation, while movement of the single edge may represent wound healing.
f******g
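Posts: 1003
22
From thread: Biology board - Thoughts prompted by cupping and gua sha
My throat has been sore lately. I pinched the skin there a few times and felt much better; this practice is popular where I'm from.
I believe there must be a biological principle behind it.
That made me think of cupping and gua sha, which work on much the same principle.
§§§§§§§§§§§§§§§§baidu§§§§§§§§§§§§§§§§§
Gua sha (skin scraping) is one of China's traditional natural therapies. Based on the cutaneous-region theory of Chinese medicine, it uses
ox horn, jade and similar tools to scrape the relevant areas of skin, with the aim of unblocking the meridians and promoting blood circulation to remove stasis. Gua sha can dilate
capillaries, increase sweat gland secretion and promote blood circulation, and has an immediate effect on wind-cold
bi syndromes caused by hypertension, heatstroke, muscle soreness and so on. Regular gua sha can regulate meridian qi, relieve fatigue and strengthen immune function.
§§§§§§§§§§§§§§§§baidu§§§§§§§§§§§§§§§§§
Cupping is also known as "fire cup qi" or "suction tube therapy", and in ancient times as the "horn method". It is a therapy that uses cups or jars as tools: heat
drives out the air inside to create negative pressure, so that the cup adheres to the skin and causes blood congestion. Ancient physicians used it to draw out blood and pus
when treating sores and abscesses; later its use was extended to internal diseases such as pulmonary tuberculosis and rheumatism. Since the founding of the People's Republic,
continual refinement of the methods has brought new developments in cupping therapy and further widened its range of treatment, making it an important therapy
within acupuncture and moxibustion. The book 《拔罐》 (Cupping), compiled by 王敬, is published by Beijing Science and Technology Press.
...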
s******y
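Posts: 28562
23
http://theconversation.com/nobel-laureate-weve-just-scraped-the
In a recent interview, 2007 Nobel laureate Martin Evans (awarded for his early work on embryonic stem cells and transgenic
mice) spoke bluntly about two recently retracted stem-cell papers in Nature (one on
physical pressure, the other on acid treatment) and expressed dissatisfaction with Nature's editorial process.
He also urged young scientists not to believe every claim made in a paper, but to look carefully at the data and
make their own independent judgment.
Q: Reprogramming has also been in news notoriously recently. Two Nature
papers that showed that differentiated cells can be reprogrammed by physical
pressure or acid treatment were retracted this week. What’s your take on
that?
A: I was surprised...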
r******k
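Posts: 446
24
One more question: when scraping directly into SDS loading buffer, is there no need to add protease
inhibitor and phospho-stop? Thanks
r******k
Posts: 446
25
I've recently been working on the phosphorylation of a protein. Because there is no specific antibody, I can only IP the total protein and then
probe with Pan-Tyr. But the phosphorylation signal never shows up. An expert on the board suggested scraping the cells into preheated (above 90 degrees?) SDS
loading buffer, boiling directly, and then diluting 10-20 fold for the IP. But there is a problem: a 10cm
dish needs at least 500ul of loading buffer to scrape, right? Then there's no way to dilute it?
Should I scrape with PBS first?? And this PBS should be ice-cold, right? And should it also contain protease inhibitor and
phospho-stop?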
g**a
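Posts: 2
26
From thread: CivilEngineering board - Question: examples of buildings modeled on bird nests
I'm writing something about bird-nest biomimetic architecture and need all kinds of examples, but I only know the Olympic main stadium and can't find
much online, so I'd like to ask those who study architecture; perhaps you know other examples of bird-nest buildings.
If you need inspiration, see the nest classification text and pictures below.
Bird nests generally fall into six types:
Cup: cup-shaped nest, what most nests you normally see look like (the Olympic main stadium takes this shape)
Platform: platform type, e.g. some large raptors
Scrape: just scratching out a spot on the ground, e.g. pheasants
Cavity: hole nest, like woodpeckers; or "Burrow", a hole in a cliff face, like swallows
Sphere: a hanging woven ball-shaped nest with one opening, e.g. weaver birds; or "Pendant", a woven nest hung from a long
stalk, often found in groups
Mound: mound nest; the parents dig a pit in the ground, fill it with leaves and the like, lay the eggs on top and bury them. The rotting leaves
generate heat that incubates the chicks automatically, which saves effort
I compiled the nest-type pictures from the wiki into a single image and uploaded it as an attachment. If you can't download it, see http://en.wikipedia.org/wiki/Bird_nest, although the page itself is a bit messy.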
S******y
Posts: 1123
27
From thread: Computation board - Python- scraping the "Computation" board
I have a Python script for scraping posts from the "Computation" board.
发帖数: 1014
l********a
Posts: 1154
29
From thread: Computation board - Python- scraping the "Computation" board
It can scrape any board, to any number of levels (following "next page").
Just change url and layer in the main function.
#!/usr/bin/env python
from urllib import urlopen
from BeautifulSoup import BeautifulSoup
import re

def fetchPage(page, urlBase):
    # get current page
    soup = BeautifulSoup(page)
    # get all titles and links
    dicList = {}
    for header in soup('strong'):
        links = header('a', 'news1')
        if not links:
            continue
        dicList[links[0].string] = links[0]['href']
    # display results
    for key,value in di
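The listing above is Python 2 with BeautifulSoup 3 and is cut off in the preview. As a rough Python 3 sketch of the same idea (collect title/link pairs from `<a class="news1">` inside `<strong>`), the following uses only the standard library. The names `TitleLinkParser` and `fetch_page`, the example base URL and the use of `urljoin` are assumptions for illustration, not the poster's code; unlike the original, it keeps every matching link in a `<strong>`, not just the first.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class TitleLinkParser(HTMLParser):
    """Collect {title: absolute URL} for <a class="news1"> nested in <strong>."""

    def __init__(self, url_base):
        super().__init__()
        self.url_base = url_base
        self.titles = {}
        self._strong_depth = 0  # how many <strong> tags are currently open
        self._href = None       # href of the matching <a> being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "strong":
            self._strong_depth += 1
        elif tag == "a" and self._strong_depth > 0:
            attrs = dict(attrs)
            classes = (attrs.get("class") or "").split()
            if "news1" in classes and attrs.get("href"):
                self._href = attrs["href"]
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            # Resolve relative links against the board's base URL.
            self.titles[title] = urljoin(self.url_base, self._href)
            self._href = None
        elif tag == "strong" and self._strong_depth > 0:
            self._strong_depth -= 1


def fetch_page(page, url_base):
    """Parse one page of HTML text and return its {title: link} mapping."""
    parser = TitleLinkParser(url_base)
    parser.feed(page)
    parser.close()
    return parser.titles
```

Following "next page" links for several layers would then be a loop that downloads each page and calls `fetch_page` on it; that part is left out because the page structure is not shown in the post.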
k**e
Posts: 2728
30
Why don't you add lysis buffer directly into the wells instead of rinsing the cells off the plate with PBS?
You can pre-rinse the cells with ice-cold PBS to make sure they are cold (on ice the whole time), aspirate off the PBS
completely, then add lysis buffer directly into the well, scrape everything off with cell scrapers,
and collect everything into tubes. To extract proteins, sit on ice for 10 min, then shake for 30 min before spinning
down for 10 min. The sup will be the ex
s*******d
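Posts: 1079
31
From thread: MedicalCareer board - FA 2011 vs 2010 comparison
Fellow member MILK found this detailed comparison of FA 2011 vs 2010. I find it very helpful, so I'm
posting it here for everyone's reference.
Original at http://forums.studentdoctor.net/showthread.php?t=785943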
Overall:
- FA 2011 corrects some of the 2010 errata
- FA 2011 has darker print and thicker pages
- FA 2011 increases the size of most figures/images (although only slightly)
- FA 2011 removes several images that are difficult to see/interpret
Behavioral Science:
p. 59-60 2010 (57-58 2011) -2011 reorganized a couple items into "Advance
directives" and got rid of "Good Samaritan Law....
L********r
Posts: 37
32
From thread: MedicalCareer board - CK NBME 2 Block 4 - Q 11
11. A 2-day-old newborn is brought to the physician because of a
generalized rash for 6 hours. The newborn is active, alert, and feeding
well. His temperature is 36.9 C (98.4 F). Examination shows a rash
consisting of numerous white and pale yellow papules with a large base of
macular erythema over the trunk and extremities. Wright's stain of
scrapings from the lesions shows eosinophils. Which of the following is
the most appropriate next step in management?
A) Reassurance
B) Topical corticoster...
b***k
Posts: 2673
33
From thread: Quant board - [Collection] two probability questions
☆─────────────────────────────────────☆
littleegg (爱吃茶叶蛋) wrote on (Sun Jun 8 14:39:07 2008):
(1) Suppose there are n people in an office. At Christmas, they have a random
gift exchange in which every name is written on a scrap of paper and mixed
around in a hat; then everyone draws a name at random to determine who they
are to get a gift for. What is the probability nobody draws their own name?
(2) What is the expected number of random numbers, uniformly distributed from
0 to 1, needed to the sum t
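Question (1) is the classic derangement problem: by inclusion-exclusion, the probability that nobody draws their own name is sum_{k=0}^{n} (-1)^k / k!, which tends to 1/e as n grows. A small sketch that evaluates this exactly (the function name is made up for illustration):

```python
from fractions import Fraction
from math import factorial


def derangement_probability(n):
    """Exact probability that a random permutation of n names has no fixed point.

    Uses the inclusion-exclusion formula sum_{k=0}^{n} (-1)^k / k!.
    """
    return sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 1))


if __name__ == "__main__":
    for n in (2, 3, 4, 10):
        p = derangement_probability(n)
        print(n, p, float(p))
```

Exact fractions are used so that small cases can be checked by hand; for n = 3, two of the six permutations are derangements, giving 1/3.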
m****0
Posts: 32
34
Earlier this fall, Steve Ferdman celebrated getting a job offer from Credit
Suisse in the usual Wall Street fashion. Over expensive oysters and dark rum
cocktails at a trendy Manhattan restaurant with his parents, he toasted
landing the full-time position after working six months as a consultant
without benefits.
A week later, Mr. Ferdman, 28, sat alone at the same place and ordered a gin
and tonic to lament getting laid off by the bank, for the second time since
2008. When he told the bartender abou...
S******y
Posts: 1123
35
From thread: Statistics board - Python - scraping the Statistics board - how to turn pages?
Lately I've grown more and more fond of the Statistics board; I find its posts very valuable!
I have a Python script for scraping posts from the Statistics board.
l*********s
Posts: 5409
36
From thread: Statistics board - Python - scraping the Statistics board - how to turn pages?
Google mechanize package.
S******y
Posts: 1123
37
From thread: Statistics board - Python - scraping the Statistics board - how to turn pages?
Thanks, littlebirds!
I did not know Python has this mechanize package too!
BTW, what if I would like to add a cool feature to my Python scripts?
For example,
if a new post is spotted here from Statistics board celebrities such as SongKun,
oloolo, Dashagen, PaperTigera, qqzj, tosi, sir, fanta... , it would send out an
email notice to a designated email address... 8-)
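The notification feature asked about above could be sketched roughly as follows. This only filters already-scraped posts by author and builds the email with the standard library; the post dictionary layout (`id`, `author`, `title`), the function names and the addresses are assumptions for illustration, and actually sending the message (for example with `smtplib`) is left out.

```python
from email.message import EmailMessage

# Authors to watch; the names come from the post above, the set is illustrative.
WATCHED = {"SongKun", "oloolo"}


def new_posts_by_watched(posts, seen_ids, watched=WATCHED):
    """Return posts by watched authors that have not been reported yet.

    `posts` is a list of dicts with "id", "author" and "title" keys
    (a hypothetical layout); `seen_ids` is a set updated in place.
    """
    hits = [p for p in posts
            if p["author"] in watched and p["id"] not in seen_ids]
    seen_ids.update(p["id"] for p in hits)
    return hits


def build_notice(hits, sender, recipient):
    """Build (but do not send) an email listing the new posts."""
    msg = EmailMessage()
    msg["Subject"] = "%d new post(s) from watched authors" % len(hits)
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join("%s: %s" % (p["author"], p["title"]) for p in hits))
    return msg
```

Keeping a `seen_ids` set means a script run on a schedule reports each post once rather than on every scan.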
o****o
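Posts: 8077
38
From thread: Statistics board - Python - scraping the Statistics board - how to turn pages?
This should be very similar to that script which sends an email notice and downloads the pictures whenever someone posts their own photos. Someone on the computer board
seems to have built one; go ask there.
s*r
Posts: 2757
39
From thread: Statistics board - Python - scraping the Statistics board - how to turn pages?
Wouldn't this have to scan every few minutes?
That seems fairly heavy on system resources.
I'm not sure whether it can run inside cterm.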
f***a
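Posts: 329
40
From thread: Statistics board - [R] how to scrape data from web pages
For example, how can I automatically connect to the URL below from R and pull out the data?
http://finance.yahoo.com/q/hp?s=IBM
I know how to do it in matlab, but I don't know whether it's also easy in R.
In matlab it mainly takes two functions: urlread for retrieving the webpage, and
regexp for extracting the date field.
Does anyone have experience doing this in R? Any pointers would be appreciated.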
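The two-step approach described above (retrieve the page, then pull out the date field with a regular expression) can be sketched in Python rather than R or matlab. The date pattern, the sample HTML and the prices in it are invented for illustration; the real page layout may differ, and `fetch` is not exercised here.

```python
import re
from urllib.request import urlopen

# Matches dates written like "Dec 7, 2010" (an assumed format).
DATE_RE = re.compile(r"\b[A-Z][a-z]{2} \d{1,2}, \d{4}\b")


def fetch(url):
    """Step 1: retrieve the raw HTML of a page as text."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def extract_dates(html):
    """Step 2: return every date-like string found in the HTML."""
    return [m.group(0) for m in DATE_RE.finditer(html)]


# Made-up table fragment standing in for a downloaded page.
SAMPLE = (
    '<tr><td class="d">Dec 7, 2010</td><td>144.02</td></tr>'
    '<tr><td class="d">Dec 6, 2010</td><td>144.99</td></tr>'
)

if __name__ == "__main__":
    print(extract_dates(SAMPLE))
```

Regular expressions are fragile against layout changes, so a proper HTML parser is usually the safer choice once more than one field is needed.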
S******y
Posts: 1123
41
From thread: Statistics board - [R] how to scrape data from web pages
It would be very easy in Python (if it is an option for you) -
http://www.goldb.org/ystockquote.html
import ystockquote
print(ystockquote.get_historical_prices('IBM', '20101101', '20101207'))
a****r
Posts: 1486
42
From thread: Statistics board - [R] how to scrape data from web pages
It's about the same in R.
Have a look at functions such as url and readLines.
S******y
Posts: 1123
43
Data Science Training
Classes will be conducted via Skype. You will see my screen throughout the
class.
==>Python for Data Scientist Class <==
http://www.eventbrite.com/e/python-for-data-scientist-tickets-2
You can choose Python I or Python II depending on your prior Python level:
Python I
- Installing Python
- Numbers and Expressions
- Variables
- Statements
- Modules
- Strings
- Lists and Tuples
- Dictionary
- Conditionals, Loops and other statements
- Hands-on coding (Lab: reading data and parse...
S******y
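Posts: 1123
44
Data Science Training
Taught in person by a senior data scientist working at a Silicon Valley high-tech company
Real-world examples and coding will be included.
The course emphasizes solid, clear concepts and strong practical value.
Classes will be conducted via Skype. You will see the instructor's screen during
the class.
Several students have already changed careers successfully: by taking the course, studying the materials carefully and practicing on industry examples, plus
networking, they found the Data Scientist jobs they wanted in San Francisco and the Bay Area.
==> Hadoop/Hive for Data Scientist Class <==
http://www.eventbrite.com/e/hadoophive-for-data-scientist-class
The Hadoop/Hive training course covers
- Installation
- Hadoop architecture and principles
- Hive syntax and examples
- Map/Reduce principles and examples
==>Python for Data Scienti...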
S******y
Posts: 1123
45
From thread: Statistics board - A suggestion for students graduating this year
We offer ongoing hands-on training courses in Python/R/Hadoop/Tableau, taught by data scientists and data engineers who have worked at
Silicon Valley tech companies for many years.
Classes will be conducted via Skype. You will see the instructor's screen
throughout the class.
==> Hadoop/Hive for Data Scientist Class <==
http://www.eventbrite.com/e/hadoophive-for-data-scientists-tick
The Hadoop/Hive training course covers
- Installation
- Hadoop architecture and principles
- Hive syntax and examples
- Map/Reduce principles and examples
==>Python for Data Scientist Class <==
http://www.eventbrite.com/e/python-for-data-scientist-tickets-2
You can choose Python I or Python II depending on your ...
p********y
Posts: 5141
46
[The following text is reposted from the Cycling board]
From: penguinfly (flying penguin), Board: Cycling
Subject: THE TOP 11 CYCLING TECHNIQUE TIPS by Scott Kasin
Posted at: BBS 未名空间站 (Thu Jun 7 18:04:00 2012, US Eastern)
http://www.outsideonline.com/blog/outdoor-adventure/celebrities
When I was 40, I had a heart attack. It came by surprise. I had been, and
still am, a dedicated athlete. Luckily, I survived, and, in 2008, I founded
MI:Aware--MI stands for "myocardial infarction"--to educate people about the
risk of heart attack, which can strike...
i*********5
Posts: 19210
47
[The following text is reposted from the Cycling board]
From: ironman2015 (1/2 ironman x3), Board: Cycling
Subject: Bicycling's 50 Golden Rules
Posted at: BBS 未名空间站 (Tue Oct 16 13:35:05 2012, US Eastern)
http://www.bicycling.com/training-nutrition/training-fitness/bi
Cyclists are innovators, constantly hunting for an edge. Over the last half-
century, we've tried thousands of methods to become stronger, faster, and
smarter on a bike—many of which have been discarded through the years.
These have endured.
1. To corner, enter wide and exit wide.
2. ...
A********a
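Posts: 1168
48
From thread: Animals board - Animals' bizarre ways of courting and mating
According to foreign media reports, humans choose all sorts of ways to express love for their partners, including traditional gifts such as red roses and heart-shaped
boxes of chocolates; some opt for a romantic dinner at an upscale restaurant. This of course takes
some effort, but compared with the courtship of some animals in nature, the human way is clearly much easier, and usually
much safer as well.
For most animals, courtship comes with risk. A male's flashy display, while drawing the attention of
females, may also attract nearby predators. Fierce fights often break out between males, and injury or even death
is not uncommon. In some species the females even feed on the males, forcing the latter to tread
carefully when courting.
The courtship and mating behaviors of many animals may look both bizarre and dangerous, but they work reasonably
well. Next, let's take stock of several unusual, even jaw-dropping, ways of courting in the animal kingdom.
Ultrasonic love songs
Male mice attract females with a distinctive high-pitched "singing" that can even reach the ultrasonic range. The
sound resembles a whistle and is produced by airflow feedback in the trachea and larynx, which is very different from the sounds mice usually make when
communicating. In a 2016 study published in Current Biology, scientists
filmed the larynx of vocalizing mice at high speed, up to 100,000 frames per second, and thereby revealed this mechanism.
Although this kind of love song is impr...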
c*******r
Posts: 1243
49
From thread: Archery board - Recurve at 20 yards, starting to get the feel for it
A water hole is a good starting point, but not the only one.
What I have learned is that deer do not necessarily rely on water holes for
water. The morning dew may be good enough at times.
Even if you have to stalk-hunt deer, I still do not recommend shooting at a deer
more than 50 yards away. Beyond 50 yards, you have less than a 50% chance of hitting a deer
and less than a 50% chance of retrieving it even if you hit it. Too many variables can
lead to failure.
Here are a few of my tips that may lead you to success:
- Scouting. Find ...
z*******n
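Posts: 1034
50
From thread: MobileDevelopment board - Being featured on Apple's App Store
The free-to-play model ties developers' hands
Published: 2014-05-22 17:11:19 Tags: paid model, free-to-play games, market, mobile platforms, players
Author: Barry Meade
In early March this year, Fireproof announced some good news: since their release, our mobile games The Room
and The Room 2 have sold 5.5 million copies.
At Fireproof we keep hearing that a mobile game must be casual and free to download, and that
games as a service will stay in front of players permanently. But our games are short, dark and
brutal, have no social or online elements, and contain no in-app purchases or ads, so that is a big problem.
the room (from d.cn)
We also lacked the money to pay for professional marketing or PR. But thanks to being featured on Apple's App Store,
our game, built on a budget of 70,000 pounds, achieved success beyond expectations.
Commenting on these results, I sent a tweet saying that perhaps the monetization battle among free-to-play mobile games
has made the development community more reliant on "data talk" while gradually overlooking the effect a great game
has on players. On developers blindly pursuing entertainment rather than...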