ESGF Conference Caps a Productive Year

Tuesday, February 12, 2019

Members of the Earth System Grid Federation (ESGF) gathered in Washington, DC, on December 3–7, 2018, for the 8th annual conference. The event packed 40 presentations, several plenary sessions, a poster session, guest speakers, an awards ceremony, and an executive committee meeting into the week. The Lawrence Livermore National Laboratory (LLNL) delegation comprised 19 staff from the Computation and Physical and Life Sciences (PLS) directorates.

Born at LLNL to address “big data” challenges in Earth system research, ESGF is an international collaboration of computer scientists, data scientists, and climate researchers. Principal funding comes from the Department of Energy’s (DOE’s) Office of Biological and Environmental Research (BER) with additional support from several foreign research centers.

The federation houses an enormous database of global observational and simulation data—more than 5 petabytes—and manages the high performance computing (HPC) hardware and software infrastructure necessary for scientific climate research. In the nearly two decades since its launch, ESGF has grown to serve 25,000 users on 6 continents. LLNL’s involvement includes contributing climate data, managing a server enclave, hosting the ESGF website, leading the executive committee, and organizing events like the conference.

In his welcome address at the conference, BER program manager Justin Hnilo noted that ESGF’s capabilities are recognized at the highest level within the DOE. “ESGF helps us gauge strategic investments for the DOE mission,” he stated, emphasizing the federation’s value to basic science research in the interest of energy security. For example, Hnilo said, “If we know trends in sea level change along the U.S. eastern seaboard, we can ensure appropriate resources are in place to prepare for and mitigate those effects.”

Figure 1. ESGF conference attendees traveled from multiple countries to celebrate 2018 accomplishments and plan 2019 development. (Photo by Angela Jefferson.)

Annual Ritual

ESGF working groups conduct daily operations remotely all over the world. Although most teams meet online regularly, face-to-face interaction is incredibly important for cementing relationships and evaluating progress. “It’s amazing what problems can be solved when you get everyone together,” said Sasha Ames, ESGF project lead for LLNL’s Analytics and Informatics Management Systems (AIMS). “The conference is our annual ritual, providing opportunities for focused discussions.”

The conference was “intense and valuable” for LLNL’s Ghaleb Abdulla, who began working with ESGF only a few weeks before the event. “I got to know the team and they got to know me. Attending the executive committee meeting gave me an idea of the challenges and the amount of work ahead of me,” he stated.

Michael Lautenschlager, director of data management at DKRZ (the German Climate Computing Center) and ESGF executive committee member, added, “The conference allows for much more intensive exchange of ideas than telephone or video meetings. It’s important that we structure the conference to maximize attendees’ time.”

In addition to an LLNL status update, attendees heard from research partners at NASA (U.S. National Aeronautics and Space Administration); IS-ENES (Infrastructure for the European Network for Earth System Modeling, consisting of 22 research centers); CRIM (Canada’s Centre de Recherche Informatique de Montréal); and NCI (Australia’s National Computational Infrastructure).

CMIP6 Milestones

ESGF’s 2017 conference focused mainly on preparation for the release of the Coupled Model Intercomparison Project Phase 6 (CMIP6) dataset. Hnilo stated, “CMIP6 serves a large international research community, including the Intergovernmental Panel on Climate Change. ESGF provides the primary mode for data dissemination of these model intercomparisons.”

CMIP6 presented a significant test of ESGF’s infrastructure for the expected 20 petabytes of model output. Multiple components across the service stack needed to function under this level of stress—services such as publishing, search, download, and replication (i.e., moving data from one ESGF center to another).

Incremental data challenges in 2018 verified the integrity and robustness of ESGF infrastructure prior to the midsummer CMIP6 launch. Development teams reported on these readiness activities during the 2018 conference. “Overall, we stuck to the appropriate priorities,” summarized Ames. “We accomplished the most important tasks.”

As part of CMIP6 readiness, the input4MIPs initiative provided a key improvement in “forcing” dataset consistency when comparing with previous CMIP phases. Helmed by LLNL scientists, input4MIPs collects, archives, and documents climate datasets to support the coordinated modeling activities. ESGF hosts input4MIPs data alongside CMIP datasets, enabling climate researchers to evaluate climate models with uniform standards under the same conditions. LLNL’s Paul Durack recently won the World Climate Research Programme Data Prize for his leadership of input4MIPs.

Similarly, the obs4MIPs initiative began planning for CMIP6 in 2016. Co-led by LLNL and NASA and hosted on ESGF servers, this project established a database used by the CMIP modeling community for comparing satellite observations with climate model predictions. In 2018, the obs4MIPs team implemented several enhancements in data indicators and integration along with a prototype of color-coded quality indicators.

Other Major Developments

One crucial achievement during 2018 was the beta version 3.0 of the ESGF software stack installer, released during the conference. The installation working group closed more than 200 issues for this version, and a conference poster detailed the team’s efforts to stand up a Jenkins automation server for validating changes to installer code. The Python-based beta addresses several long-standing problems such as a lack of error handling, lack of extensibility, and a complicated installation process.

Another progress report came from the identity and access working group, which demoed OAuth single sign-on integration to increase security and ensure proper user permissions. This work involved use cases for accessing data without authentication, using OAuth credentials with other platforms, handling different versions of OAuth, embedding the OAuth certificate in wget scripts, and confirming the OAuth access token workflow.

Most working groups made some headway in “containerizing” their areas of the ESGF infrastructure. For instance, a prototype of the search service was implemented with Docker and Kubernetes on SolrCloud. Similarly, the compute working team demoed a new container-based design for better scalability of server-side distributed computing.

Containerized architecture is compatible with Cloud deployment, which is another key effort under way among many ESGF teams. In addition to the search service prototype, ESGF’s research partners at NASA plan to move their data into the Amazon Cloud for easier model intercomparisons with a containerized stack. Other research centers are also considering Cloud computing to support their ESGF nodes—namely, leveraging on-demand capabilities with some simplification of maintenance tasks—although European partners face unique challenges without native Cloud providers on their continent.

Figure 2. The ESGF conference poster session generated a great deal of interest and discussion among attendees. (Photo by Holly Auten.)

Looking Ahead

In 2019, ESGF working teams plan to complete the rollout of the installer and OAuth access; evaluate a trial run of the new search service; develop additional containerized packages; upgrade the user interface with standalone search; and stabilize CMIP6 operations. Additionally, compute node challenges are slated for spring and summer to finish containerizing the web processing service API. These expanded capabilities will help ensure sustainability for eventual requirements from future CMIP datasets, user growth, exascale computing, machine learning technologies, and more.

Meanwhile, two topics require further deliberation: infrastructure maintenance and opportunities with new scientific domains. Maintaining the ESGF infrastructure is just as important as new development, though not as glamorous. Accordingly, Lautenschlager noted, “Funding to cover ESGF maintenance can be challenging to obtain.” He suggested building on the success of the CMIP6 data challenges by instituting data management challenges to close operational gaps, such as ensuring sustainability of Cloud-based services and improving user support workflows.

Abdulla joined ESGF to share principal investigator duties with Dean Williams, who has chaired the federation since its inception. Abdulla’s previous experience with LLNL’s Cancer Registry of Norway collaboration and HPC energy efficiency projects gives him the proverbial 10,000-foot view of ESGF’s capabilities. “ESGF’s principal goal is building an infrastructure for capturing, integrating, maintaining, and delivering data and knowledge to climate scientists. It’s similar to the goals of other projects in other scientific domains,” he explained.

This perspective is key to ESGF’s growth toward generalizing its software and data management tools. Abdulla continued, “Thanks to Dean and his team, ESGF is positioned to help solve data federation challenges for other science domains, such as biology or medicine.”

As 2019 begins, the executive committee plans to schedule additional face-to-face meetings and maximize ESGF’s presence at relevant international conferences. Reflecting on the federation’s leaps in progress over the years, Ames recalled, “When I started working with ESGF five years ago, we didn’t even have a compute API. Today we’re exploring emerging technologies across the globe.”
