Power-Aware Computing

Historically, supercomputer performance has been limited primarily by the number of available nodes, but future supercomputers will be designed with more nodes than can be simultaneously run at full power. As the limiting factor shifts from nodes to power, supercomputer utilization and performance will increasingly be measured in terms of power utilization, rather than node utilization.

Technical reports on exascale systems overwhelmingly identify power consumption as the single largest design challenge facing hardware and software developers. Current leading-edge machines such as IBM’s BlueGene/Q, which was developed in association with LLNL, were already built with energy consumption in mind, but simply “scaling up” today’s technology to exascale is not feasible. Today, power costs for the largest petaflop systems are in the range of $5–10M annually. An exascale system built with current technology would cost more than $2.5B per year to power and would draw over a gigawatt, more than many power plants currently produce. To keep the operating costs of an exascale system within a feasible range, a power target of 20 megawatts has been established.

Achieving the power target for exascale systems is a significant research challenge. A number of technical solutions for constraining power use while maintaining performance must be aggressively explored, not just in hardware, but also in the software and the massive applications that will run on these machines. For instance, data movement is projected to dominate the power budget of future systems, so application developers are being asked to modify their algorithms: instead of moving data to the processor, the computational work is moved to the data. Software-oriented power-aware computing solutions currently being investigated at LLNL include modeling application performance at various hierarchical levels for a given power range, enabling the run-time system to allocate more power to nodes on the critical path, and developing a power-aware resource manager.
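
To make the idea of steering power toward the critical path concrete, here is a minimal sketch in Python. It is an illustration only, not LLNL's run-time system or Adagio's algorithm. The function name `allocate_power`, the greedy policy, and all wattage figures are assumptions: every node receives a minimum power cap, and any spare budget is handed to the nodes with the most remaining work, which are treated as a stand-in for the critical path.

```python
def allocate_power(work, budget_w, min_w, max_w):
    """Return per-node power caps (watts) whose sum never exceeds budget_w.

    work     -- estimated remaining work per node (hypothetical units)
    budget_w -- total power available to the job
    min_w    -- lowest cap a node can run at
    max_w    -- highest cap a node can use
    """
    n = len(work)
    if budget_w < n * min_w:
        raise ValueError("budget cannot cover the minimum power of every node")

    # Every node starts at the minimum cap.
    caps = [min_w] * n
    spare = budget_w - n * min_w

    # Assumption: the nodes with the most remaining work lie on the critical
    # path, so they are topped up first (simple greedy policy).
    for i in sorted(range(n), key=lambda i: work[i], reverse=True):
        if spare <= 0:
            break
        boost = min(max_w - min_w, spare)
        caps[i] += boost
        spare -= boost

    return caps


if __name__ == "__main__":
    # Three nodes sharing a 200 W budget; node 1 has the most work left.
    print(allocate_power([10.0, 40.0, 25.0], budget_w=200, min_w=50, max_w=90))
```

With these made-up numbers, the node with the most work is raised to the 90 W maximum, the next node receives the remaining 10 W, and the least-loaded node stays at the 50 W minimum. A real run-time system would presumably base such decisions on measured slack and hardware power controls rather than a static work estimate.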

Power challenges are inevitably forcing supercomputing experts toward a new design paradigm.

Currently available software

Adagio - a power-aware runtime.