Demo: run server-side data-near multimodel comparisons

last modified May 14, 2020 10:17 AM

A dedicated team is creating training material to help users of model data run their analyses on world-class supercomputers.

Multimodel comparison of CMIP6 SSP2-4.5 scenario

We show how to plot the global mean annual mean near-surface air temperature for some CMIP6 models as an example of a data-near server-side analysis.

We concatenate historical data (model results from 1850 to 2014 that use the best estimates of anthropogenic and natural forcing) with the model results for the scenario based on the Shared Socioeconomic Pathway SSP2-4.5 (the number refers to the radiative forcing reached by 2100, in this case 4.5 W/m², or ~650 ppm CO2 equivalent).

The code that generated the figure above is in a Jupyter notebook called "CMIP6_multimodel_example.ipynb". We have a repository of test cases where you can find and download this Jupyter notebook. In this demo, the Jupyter notebook runs on Mistral, one of the IS-ENES3 world-class supercomputers, hosted at the German Climate Computing Center (DKRZ). Mistral has direct access to more than 3.3 petabytes of CMIP6 model data results (more info on the data pool here).

Successful applicants to the Analysis Platforms service who chose DKRZ as host will get an account on Mistral (follow the steps here to request your account once we have let you know by email that your proposal has been accepted). After that, users need to join a specific group, because many users from different projects access the supercomputer and the resources must be allocated to our group. Our CPU hours are counted against this group, and storage space is allocated to it for saving the results. The group for IPCC-related data analysis activities in the IPCC DDC Virtual Workspace and for IS-ENES3-related data analysis activities, such as the Analysis Platforms, is bk1088 (as shown in the animation below).

Users can connect to the Mistral console via ssh (ssh <user-account>@mistral.dkrz.de, more login info here) and run the Jupyter notebook there, which gives you the freedom to create your own environments. In this example, however, we show how to run the Jupyter notebook within the DKRZ Jupyterhub, which already includes the common packages for climate multimodel comparisons.

Once the notebook "CMIP6_multimodel_example.ipynb" has been downloaded from the repository to a local folder on your computer, it must be copied to your home directory on Mistral. Because the file is on your computer and the target is on Mistral, the copy goes over the network, for example with scp run in a shell on your computer: "scp your_local_path/CMIP6_multimodel_example.ipynb <user-account>@mistral.dkrz.de:~/", where "your_local_path" stands for the folder on your computer where you saved the notebook. The notebook will then appear in the list of available folders and files when the Jupyterhub server is open.

Click on the figure below to see how to log in, choose a job profile and start the server (which takes a few seconds):


We use Python 3 with Pandas (the popular data analysis package focused on labelled tabular data) and Xarray (the Pandas generalization for n-dimensional arrays, particularly tailored to working with netCDF files) to process the data, together with the cdo (Climate Data Operators) package to concatenate the scenario results with their corresponding historical results. Click on the figure to see how to import the packages and find the data paths in the data pool:

We then identify the matching historical and scenario runs:

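The matching step can be sketched in plain Python. This is a minimal sketch and not the notebook's code: it assumes file names follow the CMIP6 naming convention (variable, table, model, experiment, member and grid label separated by underscores) and pairs each historical run with the ssp245 run of the same model and ensemble member. The file names, the "EXAMPLE-MODEL" entry and both function names are hypothetical and serve only as illustration.

```python
from pathlib import PurePosixPath

# Order of the underscore-separated fields in a CMIP6 file name.
FIELDS = ("variable_id", "table_id", "source_id",
          "experiment_id", "member_id", "grid_label")


def parse_cmip6_filename(path):
    """Return the metadata fields encoded in a CMIP6 file name."""
    parts = PurePosixPath(path).stem.split("_")
    # zip stops after six fields, so a trailing time range is ignored.
    return dict(zip(FIELDS, parts))


def match_historical_and_scenario(paths, scenario="ssp245"):
    """Map (model, member) to (historical files, scenario files).

    Only models that provide both experiments are kept.
    """
    by_key = {}
    for path in paths:
        meta = parse_cmip6_filename(path)
        key = (meta["source_id"], meta["member_id"])
        by_key.setdefault(key, {}).setdefault(meta["experiment_id"], []).append(path)
    return {
        key: (sorted(runs["historical"]), sorted(runs[scenario]))
        for key, runs in by_key.items()
        if "historical" in runs and scenario in runs
    }


# Hypothetical file names, for illustration only.
paths = [
    "tas_Amon_MPI-ESM1-2-LR_historical_r1i1p1f1_gn_185001-201412.nc",
    "tas_Amon_MPI-ESM1-2-LR_ssp245_r1i1p1f1_gn_201501-210012.nc",
    "tas_Amon_IPSL-CM6A-LR_historical_r1i1p1f1_gr_185001-201412.nc",
    "tas_Amon_IPSL-CM6A-LR_ssp245_r1i1p1f1_gr_201501-210012.nc",
    "tas_Amon_EXAMPLE-MODEL_historical_r1i1p1f1_gn_185001-201412.nc",
]

pairs = match_historical_and_scenario(paths)
for model, member in sorted(pairs):
    print(model, member)
```

Keying on both model and ensemble member avoids joining a historical run to a scenario run that was started from a different realization; a model without a scenario file, like the made-up "EXAMPLE-MODEL" here, simply drops out.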

And then we load the data directly from the data pool:

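Once loaded, each model's historical series is joined to its scenario series. The notebook uses cdo for this; the following is only a plain-Python sketch of the idea, with a check that the historical period ends exactly one year before the scenario starts (2014 and 2015 for CMIP6). The function name and the temperature values are made up for illustration.

```python
def concatenate_experiments(historical, scenario):
    """Join two {year: value} series into one sorted list of (year, value).

    Raises ValueError if the two periods overlap or leave a gap.
    """
    last_hist = max(historical)
    first_scen = min(scenario)
    if last_hist + 1 != first_scen:
        raise ValueError(
            f"historical ends in {last_hist}, scenario starts in {first_scen}"
        )
    merged = {**historical, **scenario}
    return sorted(merged.items())


# Made-up annual temperatures in kelvin, for illustration only.
historical = {2012: 287.1, 2013: 287.2, 2014: 287.2}
ssp245 = {2015: 287.3, 2016: 287.4}

series = concatenate_experiments(historical, ssp245)
print(series[0], series[-1])
```

Failing loudly on a gap or overlap is a deliberate choice in this sketch: a silently duplicated or missing year would distort the time axis of the final plot.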

Finally, we calculate the means and plot the results. We choose to highlight the MPI-ESM1-2-LR and IPSL-CM6A-LR results:

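The two averaging steps behind "global mean annual mean" can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the notebook's implementation: it assumes a regular latitude-longitude grid, weights each latitude row by the cosine of its latitude (grid cells shrink towards the poles), and averages months with equal weight regardless of their length. The function names and the toy field are hypothetical.

```python
import math
from collections import defaultdict


def global_mean(field, lats):
    """Area-weighted mean of field[lat][lon] on a regular lat-lon grid."""
    weights = [math.cos(math.radians(lat)) for lat in lats]
    row_means = [sum(row) / len(row) for row in field]
    weighted_sum = sum(w * m for w, m in zip(weights, row_means))
    return weighted_sum / sum(weights)


def annual_means(monthly):
    """Average (year, value) records into a {year: mean} dictionary."""
    by_year = defaultdict(list)
    for year, value in monthly:
        by_year[year].append(value)
    return {year: sum(vals) / len(vals) for year, vals in sorted(by_year.items())}


# Toy field in kelvin: three latitude rows, two longitudes each.
lats = [-60.0, 0.0, 60.0]
field = [
    [270.0, 270.0],
    [300.0, 300.0],
    [270.0, 270.0],
]

weighted = global_mean(field, lats)
unweighted = sum(sum(row) for row in field) / 6
print(round(weighted, 6), unweighted)
```

In the toy field the weighted mean is higher than the plain average because the warm equatorial row covers more area than the two rows at 60 degrees. On labelled arrays, Xarray offers `DataArray.weighted` and `groupby("time.year")` for the same two operations.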

The Jupyter notebook also saves the figure as a .pdf in a folder called "/plots/CMPI6_overview" in your Mistral home. You can download the plot to your local computer, for example with scp run in a shell on your computer: "scp <user-account>@mistral.dkrz.de:~/plots/CMPI6_overview/SSP2-4.5.pdf your_local_path".