A****C Posts: 1 | | A****C Posts: 1 | 2 Top 10 rankings
Here’s a summary of the systems in the Top10:
Fugaku remains the No. 1 system. It has 7,630,848 cores which allowed it to
achieve an HPL benchmark score of 442 Pflop/s. This puts it 3x ahead of the
No. 2 system in the list.
Summit, an IBM-built system at the Oak Ridge National Laboratory (ORNL) in
Tennessee, USA, remains the fastest system in the U.S. and at the No. 2 spot
worldwide. It has a performance of 148.8 Pflop/s on the HPL benchmark,
which is used to rank the TOP500 list. Summit has 4,356 nodes, each housing
two Power9 CPUs with 22 cores each and six NVIDIA Tesla V100 GPUs, each with
80 streaming multiprocessors (SMs). The nodes are linked together with a
Mellanox dual-rail EDR InfiniBand network.
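
The per-node figures quoted above can be multiplied out as a quick sanity check. The following is only a rough sketch using the numbers as they appear in this summary; the constant and function names are made up for illustration, and the totals are not claimed to match the core count reported on the official list.

```python
# Figures copied from the summary text above (not independently verified).
FUGAKU_HPL_PFLOPS = 442.0
SUMMIT_HPL_PFLOPS = 148.8
SUMMIT_NODES = 4356
CPUS_PER_NODE = 2
CORES_PER_CPU = 22
GPUS_PER_NODE = 6
SMS_PER_GPU = 80


def summit_totals():
    """Multiply the per-node figures out to system-wide totals."""
    cpu_cores = SUMMIT_NODES * CPUS_PER_NODE * CORES_PER_CPU
    gpus = SUMMIT_NODES * GPUS_PER_NODE
    gpu_sms = gpus * SMS_PER_GPU
    return {"cpu_cores": cpu_cores, "gpus": gpus, "gpu_sms": gpu_sms}


def hpl_ratio(leader_pflops, runner_up_pflops):
    """How many times faster the leader's HPL score is than the runner-up's."""
    return leader_pflops / runner_up_pflops


if __name__ == "__main__":
    print(summit_totals())
    print(round(hpl_ratio(FUGAKU_HPL_PFLOPS, SUMMIT_HPL_PFLOPS), 2))
```

With these inputs the ratio comes out just under 3, which is consistent with the "3x ahead" wording once rounded.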
Sierra, a system at the Lawrence Livermore National Laboratory, CA, USA, is
at No. 3. Its architecture is very similar to that of the No. 2 system,
Summit. It is
built with 4,320 nodes with two Power9 CPUs and four NVIDIA Tesla V100 GPUs.
Sierra achieved 94.6 Pflop/s.
Sunway TaihuLight, a system developed by China’s National Research Center
of Parallel Computer Engineering & Technology (NRCPC) and installed at the
National Supercomputing Center in Wuxi, in China's Jiangsu province, is
listed at the No. 4 position with 93 Pflop/s.
Perlmutter at No. 5 was newly listed in the TOP10 last June. It is based
on the HPE Cray “Shasta” platform and is a heterogeneous system with AMD
EPYC based nodes and 1536 NVIDIA A100 accelerated nodes. Perlmutter improved
its performance to 70.9 Pflop/s.
Selene, now at No. 6, is an NVIDIA DGX A100 SuperPOD installed in-house at
NVIDIA in the USA. The system is based on an AMD EPYC processor with NVIDIA
A100 for acceleration and a Mellanox HDR InfiniBand as a network. It
achieved 63.4 Pflop/s.
Tianhe-2A (Milky Way-2A), a system developed by China’s National University
of Defense Technology (NUDT) and deployed at the National Supercomputer
Center in Guangzhou, China, is now listed as the No. 7 system with 61.4
Pflop/s.
A system called “JUWELS Booster Module” is No. 8. The BullSequana system
built by Atos is installed at the Forschungszentrum Juelich (FZJ) in
Germany. The system uses an AMD EPYC processor with NVIDIA A100 for
acceleration and a Mellanox HDR InfiniBand as a network, similar to the
Selene system.
This system is the most powerful system in Europe, with 44.1 Pflop/s.
HPC5 at No. 9 is a PowerEdge system built by Dell and installed by the
Italian company Eni S.p.A. It achieves a performance of 35.5 Pflop/s due to
using NVIDIA Tesla V100 as accelerators and a Mellanox HDR InfiniBand as the
network.
Voyager-EUS2, a Microsoft Azure system installed at Microsoft in the U.S.,
is the only new system in the TOP10. It achieved 30.05 Pflop/s and is listed
at No. 10. This architecture is based on an AMD EPYC processor with 48
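cores and 2.45GHz working together with an NVIDIA A100 GPU with 80 GB
memory and utilizing a Mellanox HDR InfiniBand for data transfer. | S***C Posts: 1 | 3 Our China, mindful that you asylum types have a hard enough time getting by, has two exascale supercomputers and no longer enters them in the rankings.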
Why Did China Keep Its Exascale Supercomputers Quiet?
Nicole Hemsoth
4 days ago
There are no greater bragging rights in supercomputing than those that come
with a top ten listing on the bi-annual list of the world’s most powerful
systems – the Top500. And there are no countries more inclined to throw
themselves (and billions) into that competition this decade than the U.S.
and China. | S***C Posts: 1 | 4 The Next Platform
Today, the latest results were announced (much more on those here) but
notably absent, aside from the expected first exascale machine in the U.S.,
“Frontier” at Oak Ridge National Laboratory, are China’s results, which
if published, would have shown two separate exascale-class machines.
This would have been a major mainstream news story had China decided to
publicize its results – and on several fronts.
The most obvious is being first to peak and sustained exascale with double-
precision floating point on the LINPACK benchmark (the metric by which
supercomputing performance is gauged). Further, this would have been
demonstrated on two separate systems with two separate homegrown processor
and accelerator architectures. Third, this would have meant several billions
in investments in supercomputing technology across two sites (hence serious
commitment from the Chinese government over the long haul).
All of this would have shown that despite its own billions in technology
investments in the last decade, the U.S. could not arrive first with
functional performance at exascale.
Yet China kept this quiet. Well, mostly.
Instead of the press-friendly, mainstream attention HPC gets twice each
year, they quietly discussed the systems in papers showing real-world
application
performance. And also, China made sure the word got out in other ways
beyond the Top500.
In late October, The Next Platform confirmed and reported that two separate
exascale supercomputers – the first with such capabilities in the world –
hit above both peak and sustained exascale performance according to LINPACK.
Since that time, many have wondered why China would choose not to publish
these results given the intensive, public rivalry to secure top system
status throughout the last decade.
When we first got word of benchmark results reaching exascale back in April
(the benchmark results came in in March, just before trade restrictions
cracked down on those exascale facilities and vendors, incidentally), the
first inklings came from a contact at a facility in China – one well known
to followers of the Top500. The conversation at that time was off record and
indicated displeasure that so much engineering work would not be recognized
globally, which means the decision to keep results quiet was made early, if
not in advance. It took another several months to get enough comprehensive
information for us to publish confirmation.
Ultimately, while China might have been able to knock the long-reigning #1
“Fugaku” powerhouse in Japan out of the running, that effect too might not
have the lasting impression China hoped for with these dual exascale
systems.
With Every Reason to Claim Bragging Rights …
All of this reminds us of all the many reasons China would have had to
publicize the results beyond the obvious – claiming the title on not just
one, but two, exascale machines. This would have made China the first in the
world to an HPC performance milestone that has been the subject of billions
of dollars of U.S. investments over the last several years.
A public announcement via the Top500 list in either its June edition or this
week would have also drawn attention to the significant material
investments China has made in homegrown semiconductor, networking, and
software technologies. Much more detail can be found by diving into the
Sunway and Phytium architectures and manufacturing backgrounds. And while
there are no “new” architectures with either exascale system, they do
represent a noteworthy scalability leap, in addition to noteworthy
performance in demanding HPC areas that also show the systems’ capability
to do mixed-precision (good for AI/ML) and tightly-coupled FP64-driven
traditional supercomputing.
Having an HPC complement to its existing large-scale compute infrastructure
among companies like Alibaba, Baidu, Tencent and others in China would be
another source of bragging rights. These companies are all pushing to build
their own native processors, accelerators, and software ecosystems. Having
the supercomputing/research side of native technologies would be further
signs of strength.
On that note, China would also be able to showcase systems that can handle
both general-purpose HPC as well as emerging AI. When results were released
for the quantum simulation work on the Sunway system, we believe China was
not just showing real-world, tightly coupled HPC performance, but also that
it could handle complex mixed-precision workloads, which are common in AI
(FP16, INT8, etc.). In short, it would be touting both AI and simulation
capabilities – a valuable aspect for all emerging large systems – and all
without the conventional Nvidia or AMD GPUs that U.S. and European systems
deploy for AI and low-precision capabilities.
And this may seem minor to those outside supercomputing – but think about
it: In addition to showing technological prowess and scalability of multiple
homegrown architectures, there is also the lost chance to show the hard
work of the teams in China, often more than a thousand people involved in
bringing an entire cutting-edge system to life (manufacturers, designers,
architects, programmers, sysadmins, etc.). That these HPC professionals did
not have a chance to celebrate such a milestone on the international stage
is a shame. Heated disputes between nations or not, let’s not forget these
are people – many of whom have spent careers working toward this coveted
goal. This does matter, even if the bigger international picture obscures it.
Competitive Strategy, Perception, and Of Course, Politics
While we have not confirmed a direct, single reason, we have gathered a
multitude of views over the last couple of weeks from national lab HPC leads
in the U.S., Japan, and Europe, all of whom agreed the lack of
publicization is unexpected and baffling but is, generally speaking, purely
political. However, given the nuanced views politically and technically, we
do have some ideas.
As mentioned above, there could simply be some strategic silence on China’s
part for competitive purposes. The Chinese government, which backed these
systems to the tune of billions of dollars (not just the design and build
but ongoing facilities and power), likely had the final say in the strategic
announcement (or lack thereof) of the machines.
What is most interesting is that instead of listing on the Top500, the teams
confirmed the systems’ existence through Gordon Bell Prize paper
submissions. For reference, this is the most coveted award in supercomputing
beyond top system status via the Top500. The submissions, for the Sunway
system in particular, established that the machines exist and are in
production, while also showcasing performance and scalability – albeit
with a cherry-picked set of applications.
That establishes that China was eager to show “real-world” production and
use of these systems over claiming the highly publicized top place on the
Top500 and crown for first to reach exascale. In short, they get the
recognition for technical merit without putting system specs out there for
LINPACK or the more real-world focused benchmarks in HPC like HPCG,
Graph500, or Green500.
Since China has built systems simply to game the Top500 in the past –
including a directly replicated AMD-ish looking system that was later
removed from the list – one might say these exascale machines are a game. But
not so, according to those sources we spoke with for the original story
close to the benchmark results. In that case, this is legitimate, the
machines are highly capable, and that means the trade war – likely a big
part of this story – is also at the heart of this lack of publicizing
important results.
The timing on the most recent U.S. restrictions to bar relationships with
the labs and vendors behind both exascale systems came in April, a month
after benchmarks were run on each system. It is unclear whether the decision
to withhold reporting on the achievement was due to waiting for the June
Top500 list or for other reasons, but those we spoke with suspect the real
delay was to keep from being knocked off the number one spot too quickly by
the U.S.
The “Frontier” machine in the U.S. was expected to appear on today’s
Top500 rankings at the top of the list, well above either of China’s
systems. If China listed in June or for today’s list, assuming “Frontier”
had taken the slot followed by “Aurora” at Argonne (with projected 2
exaflops peak) it would only hold top placement for a relatively short time.
That’s important considering the lifespan of these large machines (five
years on average) and the potential for new machines to further supplant
China, pushing its systems further down the list.
The semiconductor shortage was not expected to impact big systems as much as
it did and China likely did not see “Frontier” being off the November
list for that reason.
One of the opinions we gathered about why China chose silence stands out
as a bit “out there” on the surface but is worth repeating: if the U.S.
and Europe are hell-bent on rolling out several exascale-class systems in
the next three years, and China blew its budget on being first – and on two
systems to boot – it might be in its best interest to take its ball and go
home. In other words, if China “won’t play Top500” anymore, which has
long been a yardstick for national supercomputing competition, is that list
valuable any longer?
Put yet another way, by choosing to publish prize-geared papers using the
machines as a “soft announce” or running LINPACK and letting those results
“accidentally” slip without ever publishing, yes, China loses the big
press day of the top system, but only this last time. The list as a metric
is no longer international in the way it’s been for years. The tit-for-tat
of top systems has bounced between the U.S. and China for years.
It’s hard to claim dominance when your only real contender won’t come to
the plate.
While the Top500 has driven architectures in its decades, from around 2008,
it drove competition between the U.S. and China in particular – and with a
fierceness that has finally resulted in a flame-out, this time by choice.
What is clear is that China has set itself on its own nationalistic
technological path. There are problems with that, not the least of which is
a lack of fabs and semiconductor manufacturing prowess. All of that lies
beyond its borders – for now (she said ominously). With multiple
architectural options to go with, a strong hyperscale base within China to
trade hardware and software tooling with, and all the political reason to
stay this course for the long term, the news China didn’t make during this
Top500 list is much bigger than any announcement it might have made.
None of this bodes well for the future of the Top500 list, of course. While
its creators have been open about its shortcomings and have built companion
benchmarks like HPCG and HPC-AI, for instance, the double-precision floating
point metric is less important for bandwidth-limited real-world
applications. Even still, the announcement of each list has meant the world
pays attention to global supercomputing and that is a big deal – especially
for the national labs and organizations that rely on funding for the next
big machine. The international competition, especially between the U.S. and
China, has also highlighted the growing ambitions of both with HPC as a
touchstone topic.
We expect that the current TaihuLight and other Chinese systems on the list
will appear until they are decommissioned. And perhaps we won’t see any
other top ten-class machines from China for some time, perhaps years. Not
because it doesn’t have them, but because it will choose other paths to
publicizing.
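| A****C Posts: 1 | 5 Two years on, China's supercomputers have not been upgraded at all, which is very strange behavior.
Supercomputers are normally upgraded after they are built, and China's earlier supercomputers all were.
Ever since Obama embargoed supercomputer chips to China, its "homegrown chips" immediately stopped being upgraded.
China's "homegrown chips" are just sanded-down chips: take the foreigners' chips, grind off the logo, and stick on a Sunway/Tianhe
label. | S***C Posts: 1 | 6 Damn, I forgot that a Falun Gong type like you can't even read English.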
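: Two years on, China's supercomputers have not been upgraded at all, which is very strange behavior.
: Supercomputers are normally upgraded after they are built, and China's earlier supercomputers all were.
: Ever since Obama embargoed supercomputer chips to China, its "homegrown chips" immediately stopped being upgraded.
: China's "homegrown chips" are just sanded-down chips: take the foreigners' chips, grind off the logo, and stick on a Sunway/Tianhe
: label.
[Quoted from A****C's post]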
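| c****g Posts: 37081 | 7 The CCP bragging, the way it always has throughout its history!
老将倭杂疲沓魍鹜七尾腥胱泼熊舔仔 | P****R Posts: 22479 | 8
[Quoted from c****g's post] : The CCP bragging, the way it always has throughout its history! : 老将倭杂疲沓魍鹜七尾腥胱泼熊舔仔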