Department of Energy Secretary Rick Perry wrote last May that “the future is in supercomputers,” but until recently, only a handful of agencies had been able to tap into that kind of power. Traditionally, high performance computing (HPC, or supercomputing) has required significant capital investment — as much as $400 million to $600 million for large-scale supercomputing infrastructure and operating expenses. It has also called for small armies of scientists and engineers skilled in HPC application development. Precious few agencies had both the resources and the technical expertise.
But times have changed, according to Ian Lee, open source lead at Lawrence Livermore National Laboratory. “We’ve been doing open source on big Unix systems for more than 20 years. Back then, if we produced open source software for our supercomputers, we were the only ones who could use that software,” he said. “Now, the software can be ported out and mainstreamed, and it’s a lot easier to make use of supercomputing in other places.”
Open source and the hybrid cloud: de facto technologies for HPC
Together, open source and the hybrid cloud are bringing HPC within reach of organizations that lack the budgets and supercomputing expertise of larger agencies. Now, even small agencies can harness HPC to elastically process large amounts of information in short amounts of time. They can use the combined power of open source and hybrid cloud environments — which pair on-premises infrastructure with public cloud services — to open the door to new and exciting advances in data science.
This evolution started when Linux became the de facto operating system powering all of the TOP500 supercomputers. Meanwhile, organizations began to look beyond Unix machines and turn to the cloud. Today, open-source software is behind Summit and Sierra, the world’s two most powerful supercomputers, and community initiatives like OpenHPC exist to support open source’s contributions to supercomputing. HPC has become deeply intertwined with open-source technologies like OpenStack, which provides a flexible and highly scalable infrastructure for supercomputing.
Open-source technologies have also informed and been used in cloud technology stacks, allowing CPU resources to be scaled out. In fact, HPC workloads no longer need to run on bare-metal hardware. These workloads can often be deployed in the cloud using software containers that are easier to provision, access, orchestrate and scale as needed. By scaling out the compute assets available, agencies can dynamically increase resources as needed, which can save taxpayer dollars.
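As a hypothetical illustration of that elasticity, the sketch below uses the Kubernetes Python client to submit a containerized batch workload whose parallelism can be dialed up or down on demand. The image name, namespace and resource figures are placeholders for illustration only, not a reference to any agency’s actual workload.

```python
# Minimal sketch: submit a containerized batch workload to Kubernetes and
# scale it out by raising the job's parallelism. The image name, namespace
# and resource requests below are illustrative placeholders.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a cluster
batch = client.BatchV1Api()

job = client.V1Job(
    metadata=client.V1ObjectMeta(name="hpc-sim"),
    spec=client.V1JobSpec(
        parallelism=16,   # spin up 16 worker pods; dial up or down as demand changes
        completions=64,   # total work items to process
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="worker",
                        image="registry.example.gov/hpc/sim-worker:latest",  # placeholder
                        command=["python", "run_chunk.py"],                  # placeholder
                        resources=client.V1ResourceRequirements(
                            requests={"cpu": "4", "memory": "8Gi"},
                        ),
                    )
                ],
            )
        ),
    ),
)

batch.create_namespaced_job(namespace="default", body=job)
```

The same job definition can run against an on-premises cluster or a managed Kubernetes service in a public cloud, which is what makes the provision-orchestrate-scale pattern practical across a hybrid environment.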
Scalable open technologies have also flowed the other way, gaining traction and adoption within the established supercomputing facilities. Linux container technology, in particular, has seen interest and adoption by developers seeking to run software applications on large HPC systems.
We’re seeing this evolution take shape before our eyes and in different forms. For example, Kubernetes, an open-source container orchestration tool, supports high-performance graphics processing units (GPUs). Kubernetes enables containerized applications to be managed and scheduled with on-demand access to the same types of compute hardware routinely provided for traditional HPC workloads. These GPU resources are used extensively in machine learning and artificial intelligence. Open-source machine learning frameworks like TensorFlow enable high-performance numerical computation; they can be easily deployed across a variety of platforms, including GPUs, and are readily containerized.
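To make that combination concrete, here is a small, hedged sketch of GPU-accelerated numerical work as it might run inside a containerized TensorFlow workload. Kubernetes would grant the accelerator to the container (typically by setting an `nvidia.com/gpu` resource limit on the pod); the code inside simply asks TensorFlow what it was given. The matrix sizes and device names are illustrative assumptions.

```python
# Minimal sketch of GPU-accelerated numerical computation inside a
# containerized TensorFlow workload. Kubernetes schedules the GPU to the
# container; the code below just detects and uses whatever is available.
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print(f"GPUs visible to this container: {len(gpus)}")

# Place a large matrix multiplication on the first GPU if one was scheduled,
# otherwise fall back to CPU so the same container image runs anywhere.
device = "/GPU:0" if gpus else "/CPU:0"
with tf.device(device):
    a = tf.random.normal([4096, 4096])
    b = tf.random.normal([4096, 4096])
    c = tf.matmul(a, b)

print("Computed on", device, "- result norm:", float(tf.norm(c)))
```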
Open source has always been core to HPC efforts. Now, with the potential of hybrid cloud and Linux containers, the use case for HPC is wider than ever before.
Opening new possibilities
Consider the possibilities for federal agencies that, until now, did not have access to HPC. The Department of Housing and Urban Development could use HPC to develop insights that help the agency make better decisions about people’s housing needs. The National Institutes of Health could test different medicines’ effects on the human heart, building on the Cardioid application developed at Lawrence Livermore National Laboratory and now available as open-source software. U.S. Strategic Command could more accurately project the fallout of satellite collisions, akin to work being done at Lawrence Livermore.
“Many agencies are also using HPC and the cloud for cyber defense,” Lee said. “For example, the Department of Homeland Security is using the cloud to do vulnerability scanning across government agencies. Rather than having a bunch of dedicated resources that are only used once a week, they’re using the cloud to spin up resources, perform scans and shut those resources down,” he said. “This approach saves taxpayer dollars by only running the compute resources when they are using them, and not when they aren’t.”
Agencies no longer need to make capital outlays for HPC hardware that can depreciate quickly. Instead, they can take advantage of the greater flexibility provided by the hybrid cloud, including on-demand elasticity. Like DHS, they can spin up HPC-scale resources whenever the need arises, perform parallel experiments, spin them back down when finished, and avoid spending money whenever there’s a lull in research. This can help lower the cost of failure, which aligns well with the “fail fast to succeed sooner” credo of open source.
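A minimal sketch of that spin-up, run, tear-down pattern, assuming an AWS environment and the boto3 SDK, might look like the following. The AMI ID, instance type and node count are placeholders; the same pattern applies on other clouds and with other provisioning tools.

```python
# Minimal sketch of the spin-up / run / tear-down pattern using the AWS boto3
# SDK. The AMI ID, instance type and count are placeholders; the actual scan
# or simulation would be dispatched to the instances once they are running.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Spin up a short-lived fleet of compute nodes only when work arrives.
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder image with the workload baked in
    InstanceType="c5.4xlarge",        # placeholder instance type
    MinCount=8,
    MaxCount=8,
)
instance_ids = [i["InstanceId"] for i in resp["Instances"]]

# Wait until the nodes are running, then hand them the work
# (vulnerability scans, parallel experiments, and so on).
ec2.get_waiter("instance_running").wait(InstanceIds=instance_ids)
# ... dispatch jobs to the instances here ...

# Tear everything down as soon as the run finishes, so nothing bills while idle.
ec2.terminate_instances(InstanceIds=instance_ids)
```

Because the resources exist only for the duration of the run, the agency pays for compute while the work is happening and nothing in between.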
Using a hybrid cloud strategy, agencies can experiment quickly on a small scale with lower risk, and then bring those workloads on-premises to scale out the solution. Indeed, a hybrid cloud is appropriate for this type of work, as some workloads can perform better and more cost-effectively using on-premises hardware. The key is to build HPC solutions on an open substrate that enables these workloads to go from physical to virtual to cloud and back again as business needs and technologies change.
As Secretary Perry has written, the future is in supercomputers. But that future can and should be open to everyone. The combination of open source and the hybrid cloud has the potential to give all federal agencies access to HPC resources.