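This page is an excerpt and archive of the corresponding thread on MITBBS (未名空间). Posts from the past week show at most 50 characters; posts older than a week show 500 characters.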
Biology board - Why you should not do bioinformatics
c********e
Posts: 598
1
madhadron: A farewell to bioinformatics
I’m leaving bioinformatics to go work at a software company with more
technically ept people and for a lot more money. This seems like an
opportune time to set forth my accumulated wisdom and thoughts on
bioinformatics.
My attitude towards the subject after all my work in it can probably be best
summarized thus: “Fuck you, bioinformatics. Eat shit and die.”
Bioinformatics is an attempt to make molecular biology relevant to reality.
All the molecular biologists, devoid of skills beyond those of a laboratory
technician, cried out for the mathematicians and programmers to magically
extract science from their mountain of shitty results.
And so the programmers descended and built giant databases where huge
numbers of shitty results could be searched quickly. They wrote algorithms
to organize shitty results into trees and make pretty graphs of them, and
the molecular biologists carefully avoided telling the programmers the
actual quality of the results. When it became obvious to everyone involved
that a class of results was worthless, such as microarray data, there was a
rush of handwaving about “not really quantitative, but we can draw
qualitative conclusions” followed by a hasty switch to a new technique that
had not yet been proved worthless.
And the databases grew, and everyone annotated their data by searching the
databases, then submitted in turn. No one seems to have pointed out that
this makes your database a reflection of your database, not a reflection of
reality. Pull out an annotation in GenBank today and it’s not very long
odds that it’s completely wrong.
Compare this with the most important result obtained by sequencing to date:
Woese et al’s discovery of the archaea. (Did you think I was going to say
the human genome? Fuck off. That was a monument to the vanity of that god-
bobbering asshole Francis Collins, not a science project.) They didn’t
sequence whole genomes, or even whole genes. They sequenced a small region
of the 16S rRNA, and it was chosen after pilot experiments and careful
thought. The conclusions didn’t require giant computers, and they didn’t
require precise counting of the number of templates. They knew the
limitations of their tools.
Then came clinical identification, done in combination with other assays,
where a judicious bit of sequencing could resolve many ambiguities.
Similarly, small scale sequencing has been an incredible boon to
epidemiology. Indeed, its primary scientific use is in ecology. But how many
molecular biologists do you know who know anything about ecology? I can
count the ones I know on one hand.
And sequencing outside of ecology? Irene Pepperberg’s work with Alex the
parrot dwarfs the scientific contributions of all other sequencing to date
put together.
This all seems an inauspicious beginning for a field. Anything so worthless
should quickly shrivel up and die, right? Well, intentionally or not,
bioinformatics found a way to survive: obfuscation. By making the tools
unusable, by inventing file format after file format, by seeking out the
most brittle techniques and the slowest languages, by not publishing their
algorithms and making their results impossible to replicate, the field
managed to reduce its productivity by at least 90%, probably closer to 99%.
Thus the thread of failures can be stretched out from years to decades,
hidden by the cloak of incompetence.
And the rhetoric! The call for computational capacity, most of which is
wasted! There are only two computationally difficult problems in
bioinformatics, sequence alignment and phylogenetic tree construction. Most
people would spend a few minutes thinking about what was really important
before feeding data to an NP-complete algorithm. I ran a full set of
alignments using the exact algorithms, not heuristic
approximations, in a virtual machine on my underpowered laptop yesterday
afternoon, so we’re not talking about truly hard problems. But no, the
software is written to be inefficient, to use memory poorly, and the cry
goes up for bigger, faster machines! When the machines are procured, even
larger hunks of data are indiscriminately shoved through black box
implementations of algorithms in hopes that meaning will emerge on the far
side. It never does, but maybe with a bigger machine…
Fortunately for you, no one takes me seriously. The funding of molecular
biology and bioinformatics is safe, protected by a wall of inbreeding,
pointless jargon, and lies. So you all can rot in your computational shit
heap. I’m gone.
f*****h
Posts: 228
2
Really old article. It makes some sense, but I guess it depends on how you do
science...

[c********e wrote:]
: madhadron: A farewell to bioinformatics
: I’m leaving bioinformatics to go work at a software company with more
: technically ept people and for a lot more money. This seems like an
: opportune time to set forth my accumulated wisdom and thoughts on
: bioinformatics.
: My attitude towards the subject after all my work in it can probably be best
: summarized thus: “Fuck you, bioinformatics. Eat shit and die.”
: Bioinformatics is an attempt to make molecular biology relevant to reality.
: All the molecular biologists, devoid of skills beyond those of a laboratory
: technician, cried out for the mathematicians and programmers to magically

a***y
Posts: 19743
3
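There is indeed a lot of garbage biology data.
There is a lot of garbage software too: Microsoft's Windows, Oracle's Java, the software
from countless small software companies, all garbage. Bugs to fix every day, 0-days every day.
So this is not a problem with bioinformatics; it is a problem with humans. Nothing that
humans touch can escape these problems, which at root come from human limitations.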

[c********e wrote:]
: madhadron: A farewell to bioinformatics
: I’m leaving bioinformatics to go work at a software company with more
: technically ept people and for a lot more money. This seems like an
: opportune time to set forth my accumulated wisdom and thoughts on
: bioinformatics.
: My attitude towards the subject after all my work in it can probably be best
: summarized thus: “Fuck you, bioinformatics. Eat shit and die.”
: Bioinformatics is an attempt to make molecular biology relevant to reality.
: All the molecular biologists, devoid of skills beyond those of a laboratory
: technician, cried out for the mathematicians and programmers to magically

x******m
Posts: 736
4
The person who wrote this article is just a big loser.
l****m
Posts: 751
5
Re

[x******m wrote:]
: The person who wrote this article is just a big loser.