Programming board - Cloudera pitches Hadoop for everything. Really?
w**z
Posts: 8232
1
http://www.infoworld.com/t/hadoop/cloudera-pitches-hadoop-every
When you have a big enough hammer, everything begins to look like the same
kind of nail.
That's one of the potential problems with Hadoop 2.0, the greatly reworked
big data processing framework that's been at the center of a whole storm of
developer and end user interest. Cloudera in particular has plans to make it
into a hammer for all kinds of nails.
There's no question that Hadoop 2.0 is a major leap over its predecessor.
Instead of being a mere batch data processing framework for MapReduce jobs
(limited, boring), it's now turned into a general framework for deploying
applications across a multi-node system, with MapReduce just being one of
the many possible things that can be run across those nodes (flexible,
exciting).
Cloudera's clearly excited by the possibilities inherent in such an
arrangement. During a keynote presentation at the O'Reilly Strata-Hadoop
World conference in New York City this past Tuesday, the company described
an "enterprise data hub" powered by Hadoop, one where all manner of data
could be funneled in, processed in place, and extracted as needed.
Sounds great, but how feasible is it? Especially given Hadoop's status as
the shiny new big data toy on the block? Such a hub may be a long way off
for any company that's late to the big data party and has only just now
found a place for its multi-mega-terabyte data farms to live. Turning those
"silos" (as Cloudera refers to legacy data repositories, with a near-audible
sniff) into Hadoop installations isn't trivial.
The single biggest obstacle to making all that happen isn't Hadoop itself,
although that's still a fairly major obstacle. In talking with vendors and
users alike at Strata-Hadoop, it's clear Hadoop is still seen on all sides
as a bucket of parts that needs major lifting and welding to be fully useful.
The most fruitful uses of Hadoop have been through the third parties that
turn it into a ready-to-deploy product -- not just Cloudera or its
quasi-rival Hortonworks, but cloud providers like Microsoft (a major
Hortonworks partner), Amazon, SoftLayer, Rackspace, and just about every
other name-brand cloud outfit. And few of them yet offer the kinds of really
high-level abstraction we associate with powerful software tools, where the
likes of Puppet or Python scripting are options rather than requirements.
The sheer number of moving parts and pointy edges that pop up out of Hadoop,
even for smaller deployments, is still intimidating. A panel given by Dan
McLary (principal product manager, Oracle) about Oracle building Hadoop
appliances shed a lot of light on how much blood has to be shed, even by the
likes of Oracle, to make Hadoop into a deliverable product. McLary was
fairly sure that over time Hadoop's rough edges would get sanded down by
back-pressure from the community and vendors alike, but that time had definitely
not arrived yet.
But the single biggest obstacle remains moving apps into Hadoop. The new
infrastructure within Hadoop for applications, YARN, is far more open-ended
than before, but it isn't trivial to rewrite an application to run there.
It's not impossible there could be jury-rigs to accelerate that process --
e.g., some kind of virtualization wrapper that would allow apps to be
arbitrarily shoehorned into the framework -- but that's not trivial work
either.
Small wonder, then, that a great deal of work right now is being done to
make Hadoop play well with existing apps -- connectors, data funnels, and
the like. Very little of the discussion I encountered focused on moving
existing apps into Hadoop, although few disagreed that it would happen
eventually; most of it revolved around taking one's existing analytics and
connecting them to Hadoop. There are, I imagine, far more people who want to
do that than there are people who want to scrap everything and start over.
That said, the sheer level of bustle at the O'Reilly conference was a tipoff
as to how soon that might happen. By this time next year, when the
conference moves to the far-larger Javits Convention Center in Manhattan,
some of Cloudera's pronouncements may seem a little less wildly optimistic.
But until then, the trend right now is toward using Hadoop as a complement
to existing big-data systems, not as a forklift upgrade for them.
This story, "Cloudera pitches Hadoop for everything. Really?," was
originally published at InfoWorld.com. Get the first word on what the
important tech news really means with the InfoWorld Tech Watch blog. For the
latest developments in business technology news, follow InfoWorld.com on
Twitter.