2012 Bridging Big Data Infrastructure Workshop - Expedition on the Network Science Landscape 2012/12/10

From December 3rd to 6th, 2012, the National Center for High-Performance Computing (NCHC), under the National Applied Research Laboratories (NARL), hosted the “2012 Bridging Big Data Infrastructure Workshop - Expedition on the Network Science Landscape” at the NCHC's Taichung Branch and the Huisun Forest of National Chung Hsing University (NCHU). Researchers from around the world gathered to explore the issues surrounding the use of the dynamic, complex, and very large data sets found in environmental ecology and biology. The talks also covered the integration and sharing of such data.

The groups represented at the workshop included the Global Lake Ecological Observatory Network (GLEON), the Pacific Rim Applications and Grid Middleware Assembly (PRAGMA), and the National Ecological Observatory Network (NEON). Additional groups represented included the Data Observation Network for Earth (DataONE), Taiwan CODATA, the Taiwan Integrated Earth Observation System Forum (Taiwan TIEOS), and the Taiwan Biodiversity Information Facility (TaiBIF). Also present to discuss basic theories surrounding the application of big data were representatives from the Pacific Northwest National Laboratory (PNNL), Duke University Medical Center, the Artificial Intelligence Applications Institute (AIAI) at the University of Edinburgh, and the High Performance Computing Center Stuttgart (HLRS).

Left to right: Dr. Jee-Gong Chang, Deputy Director of the NCHC; Dr. Peter Arzberger, Chairman of PRAGMA; Prof. Paul Hanson, Chairman of GLEON; and Dr. Der-Tsai Lee, President of NCHU, delivering their opening speeches.

The first two days of the workshop were held at the NCHC's Taichung Branch, and the third and fourth days were held at NCHU's Huisun Forest. Addressing the current trends and challenges facing big data infrastructure, experts from Taiwan and the United States, as well as PRAGMA and GLEON's Asian and European scientists, presented their latest R&D achievements, future plans, and anticipated benefits. The presentations included a general introduction to the development of big data and scientific discoveries as well as an introduction to applications developed for the Environmental Observation Network.

The global platforms built by PRAGMA and GLEON were the focus of discussions on developing core technologies for big data infrastructure. The individual and shared needs of GLEON, biodiversity research, and disaster prevention were analyzed to identify the core technologies required. These technologies were then ranked by development priority so that they can serve as references for subsequent projects.

On the first day of the workshop, training courses on big data were offered. The courses were attended by researchers and students in HPC, networking, systems technologies, earth sciences, IT, electrical engineering, and ecological biology.

A workshop session and a break-out session in progress at Huisun Forest

The presenters shared various technological developments and scientific discoveries that serve to further expose Taiwan to leading international technologies and to the integration of application systems for big data. By hosting this workshop, the NCHC continues its ongoing collaborations with international organizations such as PRAGMA and GLEON. The NCHC will also strengthen its strategic link with the National Science Foundation (NSF) to advance the planning and future development of core technologies for big data. A total of 110 people from 36 academic organizations in 6 countries attended this event.

About Big Data
In the field of supercomputing, data-intensive computing is data-centric HPC. Since 2000, the deployment of fiber optic networks, the widespread use of large-scale data storage, and advances in computer hardware and software have all prompted the development of technologies such as collaboratories, metacomputing, and grids. These technologies allow multi-national and multi-organizational resources to be shared, integrated, and applied.

Bottlenecks in platform and application development have recently shifted to large data. Issues arising from large quantities of complicated, heterogeneous, and dynamic data are being discussed across all fields of science. With thousands of terabytes of unstructured, heterogeneous, and dynamic data to handle, computing has shifted from system/computer-centric to data-centric.

Very large amounts of data, also known as "big data," broadly impact national competitiveness. In April 2012, the White House used a special budget to begin research and development on big data at the nation's most vital R&D organizations. It tasked administrative departments with cooperating according to their respective characteristics. The National Science Foundation (NSF) and the National Institutes of Health (NIH) were made responsible for the development of core technologies and talent cultivation.

The NCHC is the nation's center for computing, storage, and networking resources and R&D. It keeps pace with the NSF's forward planning and has long worked with R&D organizations in the United States. At the same time, the NCHC continues to play important roles in large cooperative networks of international organizations such as the Pacific Rim Applications and Grid Middleware Assembly (PRAGMA) and the Global Lake Ecological Observatory Network (GLEON), with the goal of advancing the R&D of core technologies for big data.



