LLNL computer scientist David Jefferson and physicist Peter Barnes see the world a little differently. They look at systems like the global economy, ecological environment, and computer networks as complex sequences of interconnected events. Using LLNL’s unique supercomputing resources, Jefferson and Barnes are able to simulate event-based systems with worldwide effects.
Experts in the field of parallel discrete event simulation (PDES), David and Peter are attempting to use high-performance computers to run global-scale models of irregular systems with behaviors that are inherently discontinuous and cannot be described by equations. Such systems include the Internet, vehicular traffic flow, financial markets, and the evolution of viruses. They are working toward running “planetary” scale simulations, which represent billions of interacting objects and are large enough to model systems proportional in size to the population of the entire Earth.
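The core idea behind discrete event simulation is that the model advances from one timestamped event to the next rather than stepping time forward in fixed increments. A minimal sequential sketch (illustrative only; the names and structure here are invented for this example, not taken from any LLNL or RPI code):

```python
import heapq

class DiscreteEventSimulator:
    """Minimal sequential discrete-event simulator: events are (timestamp,
    action) pairs processed strictly in time order, rather than advancing
    the clock by a fixed increment."""
    def __init__(self):
        self.clock = 0.0
        self._queue = []
        self._seq = 0  # tie-breaker so equal-time events pop deterministically

    def schedule(self, delay, action):
        heapq.heappush(self._queue, (self.clock + delay, self._seq, action))
        self._seq += 1

    def run(self, until):
        while self._queue and self._queue[0][0] <= until:
            self.clock, _, action = heapq.heappop(self._queue)
            action(self)

# Example: two interacting objects exchanging timestamped messages
log = []
def ping(sim):
    log.append(("ping", sim.clock))
    sim.schedule(1.5, pong)   # schedule a response 1.5 time units later
def pong(sim):
    log.append(("pong", sim.clock))

sim = DiscreteEventSimulator()
sim.schedule(0.5, ping)
sim.run(until=10.0)
# log → [("ping", 0.5), ("pong", 2.0)]
```

Parallel discrete event simulation distributes these objects and their event queues across many processors, which is where the synchronization challenges described below arise.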
One year ago, David and Peter performed a record-breaking parallel discrete event simulation using the PHOLD benchmark, smashing the benchmark’s previous speed records by a factor of 40. The run was also the fastest and most highly parallel PDES ever executed. For the scaling study, the standard PHOLD benchmark was configured with 251 million interacting entities and executed on 120 racks of LLNL’s Sequoia supercomputer using Rensselaer Polytechnic Institute’s (RPI’s) ROSS simulator.
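PHOLD is deliberately simple: each logical process (LP) that receives an event schedules one new event at a randomly chosen LP with a random future timestamp, so event traffic is irregular and hard to parallelize. A toy sequential version might look like this (parameter names and defaults are hypothetical, chosen only to illustrate the structure of the benchmark):

```python
import heapq
import random

def phold(num_lps=8, events_per_lp=2, end_time=50.0, lookahead=1.0, seed=1):
    """Toy sequential PHOLD: each logical process (LP) that consumes an
    event schedules one replacement event at a random LP at a random
    future time, so the total event population stays constant."""
    rng = random.Random(seed)
    pq, seq = [], 0
    for lp in range(num_lps):                 # seed the initial event population
        for _ in range(events_per_lp):
            heapq.heappush(pq, (rng.uniform(0, lookahead), seq, lp))
            seq += 1
    processed = 0
    while pq and pq[0][0] < end_time:
        now, _, lp = heapq.heappop(pq)
        processed += 1
        dest = rng.randrange(num_lps)          # forward to a random LP
        delay = lookahead + rng.expovariate(1.0)  # lookahead keeps time advancing
        heapq.heappush(pq, (now + delay, seq, dest))
        seq += 1
    return processed
```

The record-setting run effectively distributed millions of such LPs across Sequoia's cores, with each process handling its own slice of the event traffic.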
“We used 8 million parallel processes, simulating more than 250 million objects,” says David. “However, we had enough memory to have represented 100 times as many objects without diminishing computational performance.”
Peter adds that the study proved that the simulator works with extraordinary efficiency at extremely large scale. “We achieved superlinear speedup, meaning that if we doubled the number of cores used to run the model, we more than doubled the speed, at least up to the scale of Sequoia, which is quite unusual.”
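The arithmetic behind superlinear speedup is simple: parallel efficiency is speedup divided by the factor of additional cores, and superlinear scaling means that efficiency exceeds 1 (often because each core's share of the data shrinks enough to fit in cache). The timings below are invented purely to illustrate the calculation, not measurements from the Sequoia runs:

```python
# Hypothetical runtimes (cores -> seconds) showing superlinear scaling:
# doubling the core count more than halves the runtime.
timings = {1024: 100.0, 2048: 45.0, 4096: 20.0}
base_cores = 1024

for cores in sorted(timings):
    speedup = timings[base_cores] / timings[cores]   # relative to 1024 cores
    efficiency = speedup * base_cores / cores        # >1.0 means superlinear
    print(f"{cores} cores: speedup {speedup:.2f}x, efficiency {efficiency:.2f}")
```

Here going from 1,024 to 2,048 cores yields a 2.22x speedup (efficiency 1.11), and 4,096 cores yields 5.0x (efficiency 1.25) — the signature of superlinear scaling.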
David and Peter also founded the Extreme Scale Parallel Discrete Event Simulation (XPDES) consortium, which includes collaborators from RPI, the University of Illinois Urbana–Champaign (UIUC), and Georgia Tech, to advance PDES. Over the next three years they will lead an effort funded by the Laboratory Directed Research and Development (LDRD) Program, the Army Research Laboratory, and the HPC Modernization Program to further improve scalable PDES capabilities. They are coordinating development of software tools that will be applied to several modeling application areas, including network simulation and agent-based models.
One challenge to developing large-scale PDES simulators and models is that they must be “optimistic,” meaning they have the ability to speculatively execute events and then undo them, if necessary. As part of the LDRD work, Livermore programmers are developing Backstroke—an advanced application of the Laboratory’s ROSE compiler—that automatically generates reverse code for manually developed forward code. Additionally, the XPDES team will port RPI’s ROSS simulator engine to run on top of UIUC’s Charm++, an object-oriented asynchronous message-passing programming platform, to enable dynamic load balancing by migrating model objects among the nodes of a supercomputer at runtime. These new tools and others will be combined with improvements to an open-source network simulator for building and executing the largest network simulations ever attempted.
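The forward/reverse pairing that optimistic execution demands — and that a tool like Backstroke automates — can be sketched in a few lines. This is an illustration of the pattern only, not Backstroke-generated code; the class and method names are hypothetical:

```python
class Counter:
    """Sketch of paired forward/reverse event handlers for optimistic PDES:
    every state change made by the forward handler saves just enough
    information for the reverse handler to undo it exactly."""
    def __init__(self):
        self.count = 0
        self._history = []   # per-event saved state, consumed on rollback

    def forward(self, delta):
        """Process an event: apply the change and record how to undo it."""
        self._history.append(delta)
        self.count += delta

    def reverse(self):
        """Roll back the most recently processed event."""
        self.count -= self._history.pop()

c = Counter()
c.forward(3)   # event executed speculatively
c.forward(5)   # a second speculative event
c.reverse()    # the second event arrived out of order — undo it
# c.count → 3
```

In a real optimistic simulator, rollbacks cascade: undoing one event may also require canceling the messages it sent, which is why hand-writing reverse code is error-prone and automatic generation is so valuable.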
Ultimately, this work will provide the computing community with an open-source ecosystem of software platforms, tools, and applications that are interoperable, portable, and scalable to millions of processes.