The HPC Asia 2009 keynote speech given by Dr. William T.C. Kramer introduces us to the NCSA, beginning with its history in the field of HPC and continuing with its latest development, the Blue Waters project. The talk also examines the concepts behind the Open Science projects and applications that are slated to run on Blue Waters. Dr. Kramer then concludes with his insight into the challenges of the coming Petascale and Exascale eras.
Part 2: Characteristics of Open Science
Given that as an introduction to the NCSA, let me next talk about some of the characteristics of Open Science. Open Science systems are, in many ways, different than so many other technologies that you hear about in terms of their usage profile. And because of that, they end up meeting different types of requirements than some of the systems that you might see on the TOP500 list or some of the specialized systems that get talked about quite often.
One of the first characteristics of an Open Science system is its science profile, that is, the profile of the activities of those systems (unless they’re dedicated). What I’m talking about are the general-purpose systems such as those at Taiwan’s National Center for High Performance Computing (NCHC), at the National Energy Research Scientific Computing Center (NERSC), at NCSA, at Oak Ridge--the ones that have a “changing” workload. Here’s a graph from NERSC that shows the different types of disciplines that it has run over the last six years. It shows that, because of changing scientific needs as well as ongoing algorithmic changes, the workload that runs here is not consistent. It will change. As an example, you can see here that we have fusion energy, which is ramping down as materials and astrophysics ramp up. I’m sorry, astrophysics is the red line that was ramping down but began to ramp back up just this year. The yellow line represents Lattice Gauge Theory.
One of the characteristics of machines that need to support Open Science is that they are general enough that they can run different sets of applications simultaneously. That’s an important characteristic that makes these types of systems more of a challenge to effectively operate than some of the other systems that take only a subset, or one application area, or one or two codes, and work through that in order to optimize the machine for that purpose. We need flexible machines in Open Science--machines that can support a wide range of workloads. And those workloads change, in most cases, every year in terms of the allocation process, that is to say, who is qualified and how the allocation process awards time to different individuals.
Another aspect of Open Science is the fact that the algorithm space is also quite diverse. The rows on the chart here represent different science areas: nanoscience, chemistry, climate, combustion, astrophysics, etc. Most likely, some of the work that you do relates to one or more of these areas. Across the top here, we have the different types of algorithms or methodologies used in those fields of science. An X indicates that that algorithm type, that methodology, is actually used in a significant way in that science area.
The first thing we notice here on this chart is that the matrix is reasonably dense. So if you’re going to support all of Open Science or a large subset of Open Science, you need machines that are able to support different algorithmic methods. Now these methods may exist in one code. Many codes have two, three, or even four methodologies based on the approach that is being used in those codes. A code may have, for example, combinations of FFTs and dense linear algebra (DLA). Or, it may be that the science area has multiple codes. Most of these science areas have two or three major community codes in them now, and they may approach things quite differently in terms of the algorithmic space that they employ. And if you’re supporting a range of users, over time, you’ll see all these types of algorithms in your computing facility and on your computing machines.