The HPC Asia 2009 keynote speech given by Dr. William T.C. Kramer introduces us to the NCSA, beginning with its history in the field of HPC and continuing with its latest development, the Blue Waters project. The talk also examines the concepts behind the open science-based projects and applications that are slated to run on Blue Waters. Dr. Kramer then concludes with his insight into the Petascale and Exascale era challenges of the future.
Part 5: Blue Waters
So, now I’d like to talk a little bit about the Blue Waters project. The NSF is sponsoring and funding Blue Waters. Blue Waters focuses on sustained performance in a way that few systems ever have before, and that is absolutely one of the reasons I was attracted to join the project. Sustained performance, by our definition, is the computer’s performance on a broad range of applications, that range of science applications that I showed you, because, again, we don’t know which ones are going to be deployed on the machine. So it has to have a balance that allows you to tune those applications in the appropriate way. But it is also measured on the applications that scientists use every day, that is, the applications and the system configuration as it operates day in and day out. So the concept of sustained performance is similar to the sustained system performance metric that we use at NERSC. It’s a composite metric. There’s no one code that can tell you what it is; rather, it’s a set of codes. It’s based on time to solution for those codes. It’s not based on a flop rate, even though it is often expressed in flop rates. It’s basically the time spent solving very large engineering and science problems, and that’s the metric that Blue Waters will be held to and will meet: sustained performance across a range of applications. It will probably be the first system able to do that, at least for open science.
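The idea of a composite metric built from time to solution can be sketched in a few lines. This is only an illustrative sketch, not the actual NERSC SSP formula: it assumes each benchmark code supplies a fixed reference operation count and a measured time to solution, derives a per-code rate, and aggregates with a geometric mean so that no single code dominates the score. The workload numbers below are entirely hypothetical.

```python
import math

def sustained_performance(apps):
    """Hypothetical composite sustained-performance figure.

    apps: list of (reference_ops, time_to_solution_seconds) pairs,
    one per benchmark code.  The per-code rate is ops/time; the
    composite is the geometric mean of those rates, so the result
    is expressed as a flop rate even though the inputs are times.
    """
    rates = [ops / seconds for ops, seconds in apps]
    # Geometric mean: exp of the mean of the logs.
    return math.exp(sum(math.log(r) for r in rates) / len(rates))

# Entirely hypothetical workload: (reference flops, measured seconds)
workload = [
    (4.0e15, 3600.0),   # e.g. a molecular dynamics run
    (9.0e15, 5400.0),   # e.g. a climate simulation
    (2.5e15, 1800.0),   # e.g. a sparse linear solver
]
print(f"composite sustained rate: {sustained_performance(workload):.3e} flop/s")
```

Note that speeding up only one code moves the composite far less than the same speedup applied across all codes, which is exactly the property a broad-range metric wants.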
So before we go into the details of Blue Waters, let me first mention the focus the NSF in the U.S. has on computing programs. You’ve heard of some of these machines coming out. About four years ago, the NSF initiated a high performance computing initiative that had three tracks in it. The first track was Track 3 systems, where leading and local university HPC centers have been raising the amount of computational capability they possess. This was partially self-funded and partially funded by small grants, so computational capability has been increasing since fiscal year 2007. Starting in fiscal year 2007, there was also a series of procurements, called Track 2, awarded to different organizations in the United States for different computational systems. The funding for these systems is set at a certain level, on the order of tens of millions of dollars; that is the hardware funding, with operational funding about equivalent to that. For the fourth generation of this, they have taken a slightly different approach to how it is awarded. One went to Texas, one is at the University of Tennessee and, as Jack mentioned, one is now in the process of being deployed at the Pittsburgh Supercomputing Center. By the way, this is an active competition.
The final track, Track 1, is what we’ll talk about with Blue Waters. That was again a competition for a number of different organizations to propose different types of computing architectures based on sustained performance, and it was awarded to the University of Illinois about eight months ago. So this arrangement now has a series of what you might call “mid-range” systems, all of which would be listed within the top 10 of the TOP500 list, feeding into a very large system which will be deployed in 2011. And that, in turn, will build up a computational science community that is able to scale its applications across this wide range in a very effective manner. And indeed, the NSF has a parallel activity, soon to be announced, for making allocation awards and some funding to science projects that would allow them to gradually scale their applications on these mid-range machines and then upwards to the full Petaflop of Blue Waters.
Here are the details of the three centers that have been awarded so far. As you see, there is a machine called “Ranger” at the Texas Advanced Computing Center. The vendor is Sun, it contains AMD CPUs, and it has 60,000 or so cores. The “Kraken” machine is a Cray machine at the University of Tennessee that has on the order of 100,000 cores. And you can see corresponding amounts of memory and storage for both of those machines. Pittsburgh is a work in progress, so we don’t have the details of that machine yet, but we know it’s Intel-based and the vendor is SGI. We think it’s about 100,000 cores. So let’s put all this in context: three relatively new machines, each with between 60,000 and 100,000 cores, all running on the order of 0.5 to 1 Petaflop of peak performance.
Let’s talk about Blue Waters first. What science will Blue Waters be able to do? I already told you that it needs to be able to do basically the entire range of science, but the NSF also has certain application areas that we focus on. One of those is molecular science and molecular dynamics. Another is climate and weather forecasting, which is increasingly important these days. Health and health services is another, with applications such as modeling the spread of a hypothetical disease. Another is earth science, which includes everything from earthquake prediction to flooding. Astronomy is another area of focus. These are examples of science and engineering disciplines that need this broad-based approach to HPC.
The selection criteria for Blue Waters started with maximum core performance: the desire to have, within a reasonable price range, a minimum number of cores with a given level of performance, to help people make use of the system. What this means is we were looking for things that are fast, but fast in a usable and measurable manner for applications. That includes a low latency, high bandwidth memory subsystem, one as close to balanced with the processor speed as you can get with the technology we have available, to enable memory intensive problems, for example sparse linear systems, to be solved more readily. It includes a low latency, high bandwidth communication subsystem that would facilitate scaling to large numbers of cores in a sustained manner. And also high bandwidth I/O subsystems. I/O is the least acknowledged but most critical part of these systems; moving I/O and data around the system is very critical. As a matter of fact, at the Exascale level, people are predicting that data movement is the thing that will dominate the cost factor, not the number of circuits or things like that. So this becomes much more important. And then there is being able to integrate the system, and being able to operate it for 4 to 5 years after it’s deployed in a very highly reliable manner with very high quality of service and support; that is another aspect of what the award was based on. There were four different systems proposed by four different organizations with four different vendors for this challenge.