JobHunting board - Inside Uber’s Engineering Struggles
c******n
Posts: 4965
1
Just leaked: Uber really is the poster child for "rough, fast, and fierce."
https://www.theinformation.com/inside-ubers-engineering-struggles?unlock=22c087&token=4fa8aaabbd314173f46a69e9c2b4db43226d3169
EXCLUSIVE
Inside Uber’s Engineering Struggles
Uber CTO Thuan Pham. Photo courtesy of Uber.
By Amir Efrati
Sep. 21, 2015
7:06 AM PDT
May 13 was one of Uber’s darkest days.
The computer system used by its business-operations employees ground to a
halt, for the whole day. The cause: a bug in one of Uber’s databases, which
set off a chain reaction that took down other systems and was made worse by
human errors in response to the cascading problems, according to internal
emails reviewed by The Information.
Uber’s chief technology officer, Thuan Pham, later wrote to his staff that
the mistake “reflects an amateurism with our overall engineering
organization, its culture, its processes, and its operation.”
The Takeaway
It may not be obvious to the outside world, but Uber’s technical
infrastructure has been hanging by a thread for years. Internal emails and
interviews with people who’ve worked on that system show that as CEO Travis
Kalanick and operations executives pressed engineers to build new features
for the business, it came at the expense of stability in the back-end
systems that power them. Now it’s up to the company’s inspirational CTO,
Thuan Pham, to professionalize a fast-growing engineering team that’s been
held back by what he calls “amateurism” under his watch.
The “massive outage,” which stretched to the following day but didn’t
affect users of the ride-sharing app, was symptomatic of an engineering
organization that had long struggled to keep Uber’s infrastructure together
as the business has grown. Uber is driven by its operations groups, which
include the people who oversee each of the cities where Uber operates, as
well as the people who monitor and analyze Uber’s performance in real time,
handle customer service for riders and deal with the needs of drivers.
Their needs for new functions, such as support for certain currencies or new
tools to help drivers, had made it harder for the engineering group to
focus on revamping and fortifying the systems that powered those functions.
Early on in the company’s life, CEO Travis Kalanick pushed engineers to
take risks in the interest of furthering the goals of the business, and he
believed in the common engineering phrase “Move fast and break things,”
say several people who've worked with him. In other words, new functionality
trumped reliability.
Mission: More Nines
So far, Uber has managed to overcome its technical challenges, attaining a
$50 billion private valuation as it has grown to dominate ride-sharing in
many places around the world. It now operates in more than 300 cities,
handling several million rides a day. But its focus on features over server
stability could threaten its ability to expand further, including into new
businesses like food delivery and new markets like China. There it has to
operate a duplicate of the systems it built for the rest of the world, so
that the government can have access to data on Chinese customers. From a
technical perspective, the company needs to remake itself in order to avoid
disasters that will hurt its revenue, brand and partners.
That challenge rests on the shoulders of Mr. Pham, a 47-year-old Vietnamese
immigrant who worked in ad tech and at VMware before joining Uber in 2013.
In an interview with The Information, Mr. Pham says the company’s goal is
to get to 99.99% reliability, or “four nines,” in the next 12 months. That
means about four minutes of down-time per month, or 50 minutes a year.
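The "four nines" arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name and the 365-day year are assumptions made here, not anything from Uber or the article.

```python
# Downtime allowed by an availability target (illustrative arithmetic only).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability: float, period_minutes: float) -> float:
    """Return the minutes of downtime permitted over the given period."""
    return (1.0 - availability) * period_minutes


# 99.99% availability, the "four nines" target quoted in the article.
per_year = downtime_budget_minutes(0.9999, MINUTES_PER_YEAR)
per_month = per_year / 12

print(f"four nines: {per_year:.1f} min/year, {per_month:.1f} min/month")
```

Under these assumptions the budget comes to roughly 52.6 minutes a year, or about 4.4 minutes a month, which is consistent with the article's rounded figures.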
Uber is a “utility,” says Ganesh Srinivasan, who reports to Mr. Pham, and
“We have to provide a highly reliable service. But it’s extremely hard.”
It doesn’t help that Uber is growing so fast that every three months, what
was peak traffic becomes its average traffic.
Unanimously described by colleagues as straight-talking, serious and humble,
the bespectacled Mr. Pham has had to take personal responsibility for the
engineering group’s failings, including the one from four months ago, in
front of the whole company. In an email to the company, Mr. Pham said then
that he had been “deathly afraid” that the consumer-facing app would also
go down for a prolonged period of time. (The disruption hurt internal
systems but didn’t affect riders.) “It is simply unacceptable for us to
make this type of mistake,” he said.
Even as he constantly plays defense, he’s invested in a better technical
foundation. Since he joined, Uber’s engineering staff has grown from 40
to about 1,200, or one quarter of the company’s workforce.
Risk-taker, Firefighter
He’s taken bold bets that sometimes backfire. For instance, unhappy with
its existing Web hosting provider, the engineering group set up servers in a
new data center last year, hours before Halloween, the second-busiest night
of the year for the Uber app. (New Year’s Eve is No. 1.) When
trick-or-treat traffic was dumped on the new servers, many different systems
failed, causing widespread outages and a “tense night” of fire-fighting at
Uber HQ, say people who are familiar with the incident. The company had to resort
to the old Web hosting provider.
He and others say that without taking such a risk, Uber would have been
woefully underprepared for when growth spiked several months later, on New
Year’s Eve.
“We failed early, we learned fast and, as a result, New Year’s Eve was
flawless, with around 1.7 million trips,” or nearly double the volume of
Halloween, Mr. Pham says.
Amid outages that required significant work overnight and his personal
supervision, Mr. Pham has been caught sleeping in the office a couple of
times, in one of the booths that have small beds behind the company’s
engineering “war room” or on a bean bag chair in a conference room. In
general, though, on weekdays he’s up at 6 a.m., drops his only child off at
school and takes a two-hour trip north from his home in east San Jose to
Uber in downtown San Francisco by train. Then he walks 20 minutes to his
office on Market Street. He typically stays until 6 p.m. or 7 p.m. before
heading home, and he often works after dinner. Because of his long daily
walks, he almost always wears sneakers.
Mr. Pham says he’s “very introverted,” but he’s also known to carry
around a DSLR camera at office parties and offsite meetings and snap
numerous pictures of colleagues, along with their dates.
From Refugee to MIT
In a video that’s available to Uber employees, Mr. Pham recounts his
inspirational life story. He immigrated to the U.S. in 1980, when he was 12.
In May 1979, several years after the Vietnam war ended, his mother took him
and his brother out of the country in the hopes of building a better life
for them, while his father, a former officer in the South Vietnamese army,
stayed behind because they couldn’t afford for the whole family to leave.
Mr. Pham didn’t see his father again for a decade.
The family spent almost a year bouncing between countries in southeast Asia,
including at refugee camps in Indonesia and Malaysia, before being allowed
into the U.S. Mr. Pham grew up in Rockville, Maryland, near Washington, with
his mother, brother and another immigrant family inside a crummy
two-bedroom apartment in a "bad part of town." His mother, who didn’t know
English at the time, had two jobs, including bookkeeping at a gas station
and bagging groceries. She was earning minimum wage, Mr. Pham says. A friend
of his at middle school had an IBM computer and Mr. Pham became acquainted
with programming. “I like things that are orderly,” he says.
He volunteered at his church, where he got to know a congregant who was a
director at the National Bureau of Standards, an arm of the federal
government. He then volunteered for that bureau and reprogrammed its
back-end systems.
“I always think of myself as an underdog, having come from nowhere,” Mr.
Pham says. His mom “gave up everything, her whole life,” so Mr. Pham felt
he “had to make something of myself.”
His government work and straight A’s in school helped him get accepted to
the Massachusetts Institute of Technology. After getting his bachelor’s and
master’s degrees in electrical engineering and computer science, Mr. Pham
moved west to work for an R&D arm of Hewlett Packard. Later, he joined
NetGravity, where he helped develop advertising technology in the mid- to
late-1990s. He stayed for three more years after the firm sold to a
competitor, DoubleClick.
John Danner, the founder of NetGravity, says in an interview that it was a
mistake not to promote Mr. Pham to be vice president of engineering at the
startup, where at one point he managed around 20 engineers. “He’s a
natural manager,” Mr. Danner says.
Like many engineers, Mr. Pham is a tinkerer and built things like his own
barbecue at home. He’s also rabidly curious. After Mr. Danner’s wife, then
a U.S. Supreme Court clerk, let Mr. Pham sit through a public oral argument
there, he kept researching the obscure case—which was related to gambling
in Mississippi—in order to understand how the court would weigh its
decision.
After a stint at a computer security startup, Mr. Pham spent nine years at
VMware, which helps companies make more efficient use of computer hardware
to run their applications. By the end, he oversaw hundreds of engineers and
helped run products including vCenter, which was the interface through which
customers interacted with other VMware services.
“It’s where all the money comes in, and it was a high stress job,” says
Steve Herrod, a former colleague. “His team was the beaten horse for every
single function that has to get out the door; it is a hard thing to do
there, and he did it well,” Mr. Herrod says.
Mr. Pham thus seemed prepared for the fast pace of Uber.
The Whole Stack
Bill Gurley, a venture capitalist who sits on Uber’s board of directors,
had been chasing Mr. Pham for 12 years after hearing about him from Mr.
Danner. Mr. Gurley says he had long wanted to place Mr. Pham in one of his
portfolio companies, but he couldn’t find a job that was enticing enough
for him, until Uber.
It’s hard to imagine Mr. Pham could have anticipated how tricky the role
would be.
Early on in the company’s life, CEO Travis Kalanick pushed engineers to
take risks in the interest of furthering the goals of the business, and he
believed in the common engineering phrase, “move fast and break things.”
From the beginning of Uber, Mr. Kalanick has refused to use a “public”
cloud provider like Amazon Web Services to host the app, in contrast with
other startups of its day, because he didn’t want to get “locked into” a
tech vendor and be dependent on it, according to Mr. Pham. Curtis Chambers,
who was Uber’s top engineering manager from 2010 until Mr. Pham joined,
says AWS is better suited for “volatile” traffic on an app’s systems,
whereas demand on Uber’s systems has been fairly predictable.
The decision not to use a public cloud meant Uber relied on smaller third
parties to manage servers for the company, and those firms weren’t always
able to handle Uber’s growth. (During some outages, Mr. Kalanick has gotten
mad enough to call out the main Uber infrastructure provider.) Under Mr.
Pham, the company later hired people to handle the servers and network
engineering, including buying and physically handling the machines that
power Uber, which are made by firms like Dell and Quanta Computer. Now the
company controls an entire technology “stack,” except for actually owning
a data center. It spends more than $10 million a year on data center-related
costs, estimates one person who is familiar with that unit at Uber.
Mr. Pham says Mr. Kalanick’s bet on owning the whole stack, despite the
hardships of gaining data center expertise, was “right” and has led to
reduced costs overall.
Get No Respect
The infrastructure that’s needed to run Uber (about 10,000 servers, give or
take) is minuscule compared to companies like Google and Facebook that
utilize millions or hundreds of thousands of machines. As a result,
engineering leaders at such companies often dismiss Uber’s challenges as
small.
But Uber doesn’t cache, or store, much information the way Google does with
search results or Facebook does with user-profile information. Uber is more
like a massive multiplayer online game (think “World of Warcraft”) but at
a bigger scale and with greater complexity because of mapping and other
calculations that need to be made instantaneously. And many of the features
are for employees or its contractor-drivers, meaning riders never see them.
For that reason, Uber engineers, including Mr. Pham, feel like their
infrastructure doesn’t get enough respect.
When he arrived at Uber, Mr. Pham says, it was obvious to him that the
infrastructure wasn’t prepared for growth. Uber has two main systems: one
that runs the dispatching of cars to customers and tracks both of them
throughout their trips, using software known as Node.js that’s built into
the app; and a back-end system that calculates fares, sends emails to
customers, and provides tools to Uber employees to analyze the business and
do customer service, all written in code known as Python.
The Node system, according to some engineers, has had issues because it
isn’t necessarily designed for large scale combined with heavy data
processing. The Python system has had its issues because it was a huge, “monolithic”
code base where it was difficult to pinpoint specific causes of technical
problems.
Making things worse, the engineers in charge of the two separate systems had
long been at odds, in part because of personality clashes and how each team
thought software should be written using their preferred language. Mr.
Pham initially let the bad blood fester, as it allowed each group to work
in isolation and get more done. A side benefit was that if one system had
troubles, it didn’t immediately bring down the other, including during the
May crisis. But overall, the friction became problematic because the teams
ended up working on similar tools for software development that were written
in different languages. This also meant there was a duplication of
resources, but Mr. Pham was willing to overlook that because there were
bigger problems to solve.
He has since tried to resolve that by assigning more engineers to design
software development “platforms” that any Uber engineer could use to make
new products, no matter which team they were on.
Red Bull and Features
Then there was pressure from the top. Mr. Kalanick wanted his engineers to
move quickly despite the risks of pushing faulty software code that would
break the system from time to time. Mr. Kalanick didn’t believe in having a
separate team for QA, or quality assurance, to make sure code changes
didn’t cause new problems. “Travis believed the quality of your code is your
responsibility,” says one person who’s worked with him. Mr. Pham says
he’s proud that the company doesn’t have any QA engineers, which can often
cause friction in an engineering organization and make it harder to get
things done.
Mr. Kalanick and global operations chief Ryan Graves often pushed the
engineers to work at the business side’s pace, requesting features to
support new currencies and languages and vehicle types. And they wanted it
done yesterday. Because of such requests, “a lot of things happened outside
the scope of the [engineering] roadmap; engineers would go drink Red Bull
and work over the weekend to get things done,” says one person who observed
the situation.
It was difficult for the engineers to explain to the business executives
that adding more engineers to a project wouldn’t necessarily speed it up;
new features just take time. As a result, a lot of “bad code” was pushed
into production, the person said. And some engineers have gotten “burned
out” and needed extended breaks to recover.
For Uber’s engineers, Mr. Pham became a critical first shield against
Messrs. Kalanick and Graves and other general managers of specific cities
that wanted new features built just for their market.
But the pressure to serve the business needs has continued, and Mr. Pham has
also struggled with getting past the legacy technical problems. The May
crisis, coming more than two years after Mr. Pham arrived at Uber, is a
prime example.
The Big Crash
As the company was preparing databases for its new system in China, an
engineer introduced a bug that caused a “rapid depletion” of space on the
company’s “master” database, Mr. Pham wrote in a post-mortem summary,
which he shared with the rest of the company. (Databases hold information
like riders’ payment credentials.)
The depletion of space on the master database triggered an alert to the
engineering team, but the alert came much too late; there were only two
hours before the database crashed. An on-call engineer who received the
alerts “ignored” them, Mr. Pham said. After the main database crashed, an
engineer was supposed to take an uncorrupted database and turn it into the
new “master.” That person instead made an error and corrupted the new
master database and allowed the rest of the system to replicate that bug
across numerous databases “like a cancer that metastasizes quickly
throughout the body.” That was the “fatal blow” that caused the prolonged
outage, Mr. Pham said.
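One common way to get earlier warning than the two hours described above is to alert on the projected time until a disk fills rather than on a fixed fullness threshold. The sketch below is a hypothetical illustration of that idea, not Uber's monitoring: the function names, the sample data and the 24-hour lead time are all assumptions.

```python
from typing import Optional, Sequence, Tuple


def hours_until_full(
    samples: Sequence[Tuple[float, float]], capacity_gb: float
) -> Optional[float]:
    """Estimate hours until the disk is full from (hour, used_gb) samples.

    Uses the average growth rate between the first and last sample.
    Returns None when usage is flat or shrinking, or the samples span no time.
    """
    (t0, used0), (t1, used1) = samples[0], samples[-1]
    if t1 <= t0:
        return None
    rate_gb_per_hour = (used1 - used0) / (t1 - t0)
    if rate_gb_per_hour <= 0:
        return None
    return (capacity_gb - used1) / rate_gb_per_hour


def should_page(
    samples: Sequence[Tuple[float, float]],
    capacity_gb: float,
    lead_time_hours: float = 24.0,  # hypothetical lead time
) -> bool:
    """Page the on-call engineer when the projected time to full is short."""
    remaining = hours_until_full(samples, capacity_gb)
    return remaining is not None and remaining <= lead_time_hours


# Hypothetical readings: a 1,000 GB volume growing by 40 GB an hour.
readings = [(0.0, 500.0), (1.0, 540.0), (2.0, 580.0)]
print(hours_until_full(readings, 1000.0), should_page(readings, 1000.0))
```

With these made-up numbers the projection pages while the volume is only 58% full, about ten hours ahead. A fixed 90%-full threshold would leave 2.5 hours at the same growth rate.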
It took more than 24 hours to repair the databases and get back to normal.
In the meantime, the company couldn’t on-board new drivers and customer
service was inoperable, among many other problems. In China, Baidu was
“extremely upset” because the special integration between its apps and
Uber's didn’t work, Mr. Pham said. The crash imposed a “huge amount of pain and
inconvenience for the rest of the business,” he said.
He later wrote to his engineers that “the first step to fixing any problem
is to acknowledge that we have a problem.” Uber has “many problems in
multiple levels of the org with respect to quality (code, testing,
monitoring, tooling, process, operation, etc.), including me for not having
pushed hard enough to establish the level of rigor and quality in the
tooling and operation of our services.”
To solve those problems, he promised to increase emergency-response training
(“just like airline pilots”) for all engineers and create a “site
reliability engineering” role within the group in order to operate Uber in
a more “professional manner.” The company also started a “zombie
apocalypse recovery toolkit,” which is a digital manual for how to fix
problems and keep things going when a specific system goes down.
He also continued a transition, which began before he arrived at Uber, to
what’s called “service oriented architecture.” SOA splits up all of a
company’s back end functions into isolated “micro services.” Doing so
allows one system to go down without impacting other systems. It’s still an
ongoing process, and Uber already has 450 micro services, Mr. Pham says.
You Don’t Know Git
Other big changes have been made. There’s now encrypted data being stored
on each driver’s phone (which are specially designed to run a driver’s
version of the Uber app) so that if a related back-end system goes down in
Uber’s servers, the phones can play a backup role. In addition, there’s a
new system called “uDestroy,” similar to Google’s “DiRT” and Netflix’s
“Chaos Monkey,” that deliberately and randomly causes system problems in
order to help the organization know how to deal with them.
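The article gives no detail on how “uDestroy” works. As a rough illustration of the general idea behind such tools, the sketch below wraps a dependency so that it fails at random, then checks that the caller degrades gracefully. Every name here (with_random_faults, lookup_fare, the 30% failure rate) is hypothetical.

```python
import random


class InjectedFault(Exception):
    """Raised in place of a real dependency failure."""


def with_random_faults(func, failure_rate, rng):
    """Wrap func so that a fraction of calls fail with InjectedFault."""

    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFault(f"injected failure in {func.__name__}")
        return func(*args, **kwargs)

    return wrapper


def lookup_fare(trip_id):
    """Stand-in for a call to a back-end service."""
    return {"trip_id": trip_id, "fare": 12.50}


def fare_with_fallback(fetch, trip_id):
    """Caller under test: it should degrade rather than crash."""
    try:
        return fetch(trip_id)
    except InjectedFault:
        return {"trip_id": trip_id, "fare": None, "degraded": True}


rng = random.Random(42)  # seeded so a drill can be replayed
flaky_lookup = with_random_faults(lookup_fare, failure_rate=0.3, rng=rng)
results = [fare_with_fallback(flaky_lookup, i) for i in range(1000)]
degraded = sum(1 for r in results if r.get("degraded"))
print(f"{degraded} of {len(results)} calls hit an injected fault")
```

Production tools of this kind typically act on whole machines or services rather than on single function calls. The aim is the same, though: to exercise the failure paths before a real outage does.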
Meanwhile, the quality of some new engineering recruits has fallen as hiring
has increased, and many hires are fresh graduates with no experience.
“People are graduating with CS degrees and don’t know how to use Git,”
which is a system for writing code with other engineers, says one person
familiar with Uber’s hiring. Uber says that less than 10 percent of its
hires over the past year were new college graduates, and the average amount
of experience for the other 90% has gone up significantly.
In response to the influx of graduates, the company has invested
considerably in training. The orientation program for new hires, known as
“Uberversity” for “nUbers,” has expanded for engineers, not only to
acquaint them with Uber’s back-end systems but also for general education.
There’s also a computer science curriculum, taught in part by contractors,
to help engineers write good Python code or to learn how to build iOS apps.
Mr. Pham says Uber cannot compete with Facebook and Google by throwing
$100,000 signing bonuses at new recruits. And Uber’s engineering culture has
never been as extravagant as some other fast-growing companies: The food
isn’t as varied, and there aren’t as many toys and other distractions
(drinking from beer taps known as “uBeer” and engaging in Nerf gun battles
aren’t allowed before 6 p.m.). But this year he’s been able to poach some
senior people like Joe Sullivan, who was Facebook’s chief security officer,
and AG Gangadhar, who was a cloud platform manager at Google.
Many of those who’ve worked with Mr. Pham at Uber have a difficult time
saying whether he’s done a great job or not.
“He didn’t get crushed. The wheels didn’t fall off the bus,” one
colleague notes. “That was an achievement.”
This article has been updated with a comment from Uber about the experience
level of its recent hires.
b**********5
Posts: 7881
2
Switch Python to Java and I'll go interview there again right away.
s*****m
Posts: 8094
3
I always thought the phrase was "fast, rough, and fierce."
Besides, this is from months ago. "Just" leaked, my foot.

[Quoting c******n's post]
: Just leaked: Uber really is the poster child for "rough, fast, and fierce."
: https://www.theinformation.com/inside-ubers-engineering-struggles?unlock=22c087&token=4fa8aaabbd314173f46a69e9c2b4db43226d3169

g*****g
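Posts: 34805
4
Pretty much every startup goes through this. Still, I think skipping the
cloud was really unwise. Does lock-in matter that much to a startup? Being
fast, rough, and fierce is what matters. If you have to buy a machine before
you can use it, how can you be fast? In the end they had no choice but to use
third-party hosting anyway, so they might as well have gone with AWS. Once
you're as big as Facebook or Google, you can hire a few top people to write
an OS from scratch if you like; a few cloud APIs hardly count as lock-in.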
s*****m
Posts: 8094
5
Interview or don't; suit yourself.

[Quoting b**********5's post]
: Switch Python to Java and I'll go interview there again right away.
w**z
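Posts: 8232
6
Using its own DC was a dumb decision.

[Quoting g*****g's post]
: Pretty much every startup goes through this. Still, I think skipping the
: cloud was really unwise. Does lock-in matter that much to a startup? Being
: fast, rough, and fierce is what matters. If you have to buy a machine before
: you can use it, how can you be fast? In the end they had no choice but to use
: third-party hosting anyway, so they might as well have gone with AWS. Once
: you're as big as Facebook or Google, you can hire a few top people to write
: an OS from scratch if you like; a few cloud APIs hardly count as lock-in.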

t*****d
Posts: 525
7
The last paragraph seems to hint that this guy is on his way out?

[Quoting c******n's post]
: Just leaked: Uber really is the poster child for "rough, fast, and fierce."
: https://www.theinformation.com/inside-ubers-engineering-struggles?unlock=22c087&token=4fa8aaabbd314173f46a69e9c2b4db43226d3169

w**z
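Posts: 8232
8
The one who ignored the alert and the one who let the corrupted DB get
replicated should have been fired.

[Quoting t*****d's post]
: The last paragraph seems to hint that this guy is on his way out?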

a****8
Posts: 2771
9
Incredible. No QA at all.