Tue, October 23, 2018

Big Data

From grid computing to cloud computing, the National Center for High-performance Computing (NCHC) has developed technologies that integrate high-bandwidth networks, high-performance computing (HPC), and highly efficient storage facilities to serve scientific research and applications in various fields. In recent years, the computing and monitoring data gathered by the NCHC in earth science, biomedicine, and disaster prevention have exceeded 200TB and are still growing. To use this big data effectively, the NCHC develops its own big data processing and analysis technologies and also uses existing ones such as Hadoop. The NCHC also applies its specialized knowledge of each field, so that the processed and analyzed data has greater value to society and to the environment.

Earth Science Observation Knowledgebase
  • Recent technological developments have led to the rapid accumulation of big data. In response to the challenge of integrating high-frequency, streaming, peta-scale data, the NCHC has developed the Earth Science Observation Knowledgebase, which stores space, marine, meteorological, hydrological, and seismic observation data.
  • Earth Science Observation Knowledgebase scale of data: At least 155TB of data accumulated annually; 103TB of data accumulated in the first four months.
  • Real-time high-frequency data: Able to process a minimum of 18,000 entries of data per second.
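To put the figures above in perspective, the following is a rough back-of-the-envelope sketch, not an NCHC tool. It assumes decimal terabytes (1TB = 10^12 bytes) and a 365-day year, and the function names are hypothetical. It converts the stated 18,000 entries per second into a daily count, the stated 155TB per year into an average byte rate, and the 103TB collected in four months into an annualized volume.

```python
# Back-of-the-envelope conversions for the figures quoted in the text.
# Assumptions (not from the source): 1 TB = 10**12 bytes, 365-day year.

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365
BYTES_PER_TB = 10**12


def entries_per_day(entries_per_second: int) -> int:
    """Number of entries handled in one day at a constant per-second rate."""
    return entries_per_second * SECONDS_PER_DAY


def average_bytes_per_second(tb_per_year: float) -> float:
    """Average sustained byte rate implied by a yearly data volume."""
    return tb_per_year * BYTES_PER_TB / (DAYS_PER_YEAR * SECONDS_PER_DAY)


def annualized_tb(tb_observed: float, months: int) -> float:
    """Linear extrapolation of an observed volume to twelve months."""
    return tb_observed * 12 / months


if __name__ == "__main__":
    print("Entries per day at 18,000/s:", entries_per_day(18_000))
    print("Average MB/s for 155TB/year:",
          round(average_bytes_per_second(155) / 10**6, 2))
    print("103TB in 4 months, annualized (TB):", annualized_tb(103, 4))
```

Under these assumptions, the four-month figure extrapolates to roughly twice the stated annual minimum, which is consistent with the wording "at least 155TB".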

FlyCircuit Database

  • The FlyCircuit Database currently contains about 30,000 high-resolution 3D neural images of the Drosophila (fruit fly) brain. These images are combined into a neural circuitry network that researchers can use as a blueprint to further explore how the fruit fly brain processes external sensory signals (e.g. how vision, hearing, and smell are transmitted to the central nervous system).
  • This technology will soon be applied to decoding the neural circuitry of the zebrafish brain, the mouse brain, and even the human brain. It may someday prove to be one of the keys to unlocking the secrets of the human brain.

Over 500 experts and scholars from more than 36 countries have applied for accounts that allow them access to the NCHC's FlyCircuit Database.

Website: http://www.flycircuit.tw

The Next-generation Sequencing Data Storage and Analysis Platform
The genome annotation database collects annotation information for more than 80 organisms, including humans. Genome decoding and sequencing help us conduct research in epigenetics and understand the mega-structure of chromosomes and the mechanisms that regulate its formation. They can also be used to locate protein binding sites and gene regulatory elements, and to understand the complete process of gene expression. The transcriptome sequencing database allows us not only to explore differences in gene expression profiles between organs but also to distinguish expressed splicing forms.

Currently, the data stored in the NCHC's Next-generation Sequencing Data Storage and Analysis Platform exceeds 160TB. More than 138 databases have been used or generated on this platform. The data includes outputs from domestic medical research on pathogenic bacteria, parasites, viruses, fungi, and cancerous tissues. It will benefit the domestic development of personalized medicine, for example in the rapid identification of diseases and the production of diagnostic reagents.

For application information, please visit: http://humem.nchc.org.tw/NGS/

The Application of Big Data Technology to Disaster Prevention
  • The NCHC has established a Disaster Management Information Platform (DMIP) that integrates satellite remote sensing and offshore environmental observation data from the National Applied Research Laboratories (NARLabs) with disaster prevention databases from the National Science & Technology Center for Disaster Reduction. It brings together environmental and disaster prevention data, simulation and analysis models, visualization and display systems, and management functions. The DMIP thus provides industry, government, and academia with the capabilities to bridge, process, and display this data and to develop relevant value-added applications. Hadoop and Storm are used in the DMIP for large data set management and real-time big data analytics.
  • The NCHC has developed large-scale monitoring and data recognition technologies that allow users to remotely collect, back up, and analyze large amounts of monitoring data using wireless transmission. These technologies can process and analyze data from at least 100,000 instruments and record and recognize images from at least 1,000 cameras in real time, allowing areas of potential flooding, for example around bridges, to be identified quickly. This monitoring technology can be applied to ecology and agriculture as well. For example, it can assist marine ecological environmental research and help analyze the best environment for plant growth in organic agriculture.

Contact Information

Karen Chang 886-3-5776085 ext.215 d00ycc00@narlabs.org.tw
Hsiao-Pei Tsai 886-3-5776085 ext.317 1203021@narlabs.org.tw
