Bringing the users to the data: a new climate and weather information system 

24 May 2023

What is new in Destination Earth compared to existing information systems? DestinE aims to make a breakthrough in many respects, starting with the hardware and software architecture and maximising the computing capacity and the performance of the models and simulations. Another key objective for DestinE is to improve data access and the information delivery process itself, exploring new forms of interactivity.

Given the unprecedented data volumes that the DestinE digital twins will generate, DestinE will ultimately move away from bringing the data to the users, as most current climate and weather information systems do, to bringing the users to the data. Instead of downloading large datasets and then extracting the relevant information, users will go directly to where the data is. They will access the data generated by the Digital Twins, which are created by the European Centre for Medium-Range Weather Forecasts (ECMWF) and other entities and managed in the data lake of the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT), through the Core Service Platform created by the European Space Agency (ESA). This shift in the information delivery process is not expected to take place during the first phases of the initiative, which will be fully focused on implementing the ambitious technological infrastructure and science behind DestinE.

Such a data access concept is already in place on platforms such as the Climate and Atmosphere Data Stores of the Copernicus Climate Change Service (C3S) and the Copernicus Atmosphere Monitoring Service (CAMS), operated by ECMWF. DestinE plans to do it at an entirely different scale in terms of data volumes and with a higher level of interactivity between the user and the way data is being produced.

How to handle 1 petabyte of data per day 

Destination Earth is expected to produce an exceptional 1 petabyte of data per day. Handling this will involve permanent hot data storage located near each of the pre-exascale EuroHPC installations, making up to 30 petabytes of digital twin (DT) data available for user and machine learning applications. By comparison, CAMS and C3S, which are already managing very large data volumes, deliver about 130 terabytes daily, roughly eight times less. To make this kind of data volume manageable, ECMWF will provide DestinE with the Digital Twin Engine (DTE), an efficient software infrastructure that makes it possible not only to run the Digital Twin simulations, but also to process, access and handle data, among other key features.
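To put the figures quoted above side by side, here is a small back-of-the-envelope calculation. It is only a sketch: it assumes decimal units (1 petabyte = 1,000 terabytes) and, purely for illustration, treats the 30 petabytes of hot storage as a single pool filled at the full daily output rate. The function names are made up for this example.

```python
# Back-of-the-envelope comparison of the data volumes quoted in the article.
# Assumption: decimal (SI) units, i.e. 1 PB = 1000 TB.
TB_PER_PB = 1000

destine_tb_per_day = 1 * TB_PER_PB      # "1 petabyte of data per day"
copernicus_tb_per_day = 130             # CAMS and C3S, "about 130 terabytes daily"
hot_storage_tb = 30 * TB_PER_PB         # "up to 30 petabytes" of hot storage


def volume_ratio(larger_tb: float, smaller_tb: float) -> float:
    """How many times larger the first daily volume is than the second."""
    return larger_tb / smaller_tb


def days_of_output_held(capacity_tb: float, rate_tb_per_day: float) -> float:
    """Days of output that fit in the store if nothing is ever discarded.

    Illustrative assumption only: one pool, filled at the full daily rate.
    """
    return capacity_tb / rate_tb_per_day


ratio = volume_ratio(destine_tb_per_day, copernicus_tb_per_day)
days = days_of_output_held(hot_storage_tb, destine_tb_per_day)

print(f"DestinE output is about {ratio:.1f} times the CAMS and C3S daily volume")
print(f"30 PB would hold about {days:.0f} days of output at 1 PB per day")
```

Under these assumptions the ratio comes out at a little under eight, and the hot storage would fill in about a month, which helps explain why the data cannot simply be kept in full.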

The DTE can also help to run applications for specific impact sectors while the Earth system simulations are still running, so that users extract only the pieces of information they need for their specific purposes (policymakers won’t need the same data as climate scientists or renewable energy plant operators).

Such large volumes of data are too big to be stored continually. Instead, users will be “listening” to the specific data streams they’re interested in and will capture and collect the slices of information they need. This approach will be used in particular in the Climate Change Adaptation Digital Twin.
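The “listening” idea can be pictured with a toy example. The sketch below is not the Digital Twin Engine interface, which the article does not describe; every name in it (Field, simulated_stream, listen, the variable and region labels) is hypothetical. It only shows the general pattern: the producer emits fields one at a time, and a listener keeps the slices it asked for and lets everything else pass by without storing it.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Field:
    """One hypothetical chunk of simulation output."""
    variable: str
    step: int
    values: dict[str, float]  # region code -> value (made-up numbers)


def simulated_stream(steps: int) -> Iterator[Field]:
    """Stand-in for a running simulation: yields fields and keeps nothing."""
    for step in range(steps):
        yield Field("temperature", step, {"NL": 10.0 + step, "ES": 20.0 + step})
        yield Field("sea_level", step, {"NL": 1.5 * step, "ES": 0.5 * step})


def listen(stream: Iterable[Field], variable: str, region: str) -> Iterator[tuple[int, float]]:
    """Capture only the requested variable and region; ignore the rest."""
    for field in stream:
        if field.variable == variable and region in field.values:
            yield field.step, field.values[region]


# A user interested only in sea level for one region captures a small slice.
captured = list(listen(simulated_stream(3), "sea_level", "NL"))
print(captured)
```

Because both the stream and the listener are generators, only one field is held in memory at a time, which mirrors the point that the full output is never stored.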

Unidirectional vs user-centric information systems 

Most current climate information systems work on a “one-way” information flow paradigm, with a well-established path of information that starts with the simulations generated by a highly specialised actor. These simulations generate data providing estimates of possible climate evolutions, which are digested and turned into relevant information for users in a top-down scheme, from the science level to the service user level, hence the unidirectional image. In this logic the user only comes in at the end. Depending on their expertise, users such as energy companies, policymakers, infrastructure organisations or insurance companies can adapt that information to their specific needs (see left-hand side of the figure).

From a unidirectional to a more interactive information system.

In order to create a user-centric information system, this unidirectional logic needs to be reshuffled, as shown in the right-hand side of the figure. The information should be generated in response to the user's request and needs. A policymaker might want to know, for example:

  • What is the risk of coastal flooding in the Netherlands in 2050?  
  • What is the most cost-effective adaptation scenario for those risks? 

With the current “one-way” information distribution structure, a user can only hope that the relevant answers have been delivered by the original datasets. They will also need to retrieve information from the flow of data generated over time, then interpret it for their purpose and extract the insights needed for their specific decision-making.

The user-centric information system that DestinE is aiming for would make available all of the information generated by the simulation, but with the added value of providing access to different abstraction levels along the data processing chain. This system requires a hierarchy of data and tool layers that can be accessed and configured by users. The Copernicus Services at ECMWF have already gone a long way in this direction, building a co-design approach between providers and users, but so far in a static way, with quite a long feedback loop from the collection of user requirements to service delivery. DestinE aims to achieve a more interactive process.

Under this paradigm, the policymaker asking the questions above would trigger a request that is addressed by cherry-picking from the data generated by the initial simulations, and would be able to interact with the data as it is being produced. The user would have visibility of the different abstraction and expertise levels (social scientists, Earth system scientists, data scientists, information providers) with a fully traceable model, data and information hierarchy across the board. The user could also play through different what-if scenarios at those levels by recreating some of the intermediate steps.

It should be noted that, beyond the general vision for user-centric data infrastructures, the “one-way” information systems will still be relevant where interactivity is not required (for example for supporting and enhancing the provision of operational services like those provided by ECMWF, or by national and European services like Copernicus).

The final shape of this information system will depend on many factors and will gradually evolve during the successive implementation phases (2024, 2027 and 2030). What matters is that this new system will be the product of an interactive co-design process, in which the tools and data management system emerge from a continuous dialogue between partners representing the expertise levels shown in the above figure as DestinE evolves.

Destination Earth is a European Union-funded initiative launched in 2022, with the aim of building a digital replica of the Earth system by 2030. The initiative is being jointly implemented by three entrusted entities: the European Centre for Medium-Range Weather Forecasts (ECMWF), responsible for the creation of the first two ‘digital twins’ and the ‘Digital Twin Engine’; the European Space Agency (ESA), responsible for building the ‘Core Service Platform’; and the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT), responsible for the creation of the ‘Data Lake’.

More information about Destination Earth is on the EU Commission website