When it officially came online at the San Diego Supercomputer Center (SDSC) in early January 2012, Gordon was instantly impressive. In one demonstration, it sustained more than 35 million input/output operations per second, a world record at the time.
Input/output operations per second are an important measure for data-intensive computing, indicating how quickly a storage system can move data between an information processing system, such as a computer, and the outside world. The figure reflects how fast a system can retrieve the randomly organized data common in large datasets and process it through data-mining applications.
The supercomputer’s record-breaking feat wasn’t a surprise; after all, Gordon is named after a comic strip superhero, Flash Gordon.
Gordon’s new and unique architecture employs massive amounts of the type of flash memory common in cell phones and laptops, hence its name. The system is used by scientists whose research requires mining, searching and/or creating large databases for immediate or later use, including mapping genomes for applications in personalized medicine and examining computer automation of stock trading by investment firms on Wall Street.
Commissioned by the National Science Foundation (NSF) in 2009 for $20 million, Gordon is part of NSF’s Extreme Science and Engineering Discovery Environment, or XSEDE program, a nationwide partnership comprising 16 high-performance computers and high-end visualization and data analysis resources.
“Gordon is a unique machine in NSF’s Advanced Cyberinfrastructure/XSEDE portfolio,” said Barry Schneider, NSF program director for advanced cyberinfrastructure. “It was designed to handle scientific problems involving the manipulation of very large data. It is differentiated from most other resources we support in having a large solid-state memory, 4 GB per core, and the capability of simulating a very large shared memory system with software.”
Last month, a team of researchers from SDSC in the United States and the Institut Pasteur in France reported in the journal Genes, Brain and Behavior that they used Gordon to devise a novel way to describe a time-dependent gene-expression process in the brain. The approach can be used to guide the development of treatments for mental disorders such as autism-spectrum disorders and schizophrenia.
The researchers identified the hierarchical tree of coherent gene groups and transcription-factor networks that determine the patterns of genes expressed during brain development. They found that some “master transcription factors” at the top level of the hierarchy regulated the expression of a significant number of gene groups.
The scientists’ findings can be used to select transcription factors that could be targeted in the treatment of specific mental disorders.
“We live in the unique time when huge amounts of data related to genes, DNA, RNA, proteins, and other biological objects have been extracted and stored,” said lead author Igor Tsigelny, a research scientist with SDSC as well as with UC San Diego’s Moores Cancer Center and its Department of Neurosciences.
“I can compare this time to a situation when the iron ore would be extracted from the soil and stored as piles on the ground. All we need is to transform the data to knowledge, as ore to steel. Only the supercomputers and people who know what to do with them will make such a transformation possible,” he said.
This research is one of a number of high-value projects being conducted at SDSC with Gordon.
National Science Foundation