
Analyze and visualize climatology

Overview

Teaching: 0 min
Exercises: 0 min
Questions
  • What is a climatology?

  • How to compare model outputs with a climatology?

Objectives
  • Learn about SPARC

  • Learn to compare CMIP6 model data with a climatology

What is a climatology?

A climatology is a long-term average of a climate variable, for example the mean temperature for each calendar month computed over many years. It serves as a reference against which observations or model output can be compared.

In this lesson, we will use climatological data issued from the Stratosphere-troposphere Processes And their Role in Climate project (SPARC) and in particular the Temperature and Zonal Wind Climatology.
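As a minimal illustration of the idea (using synthetic data in place of real observations), a monthly climatology can be computed by grouping a time series by calendar month and averaging:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily temperature series over several years (placeholder for real data)
time = pd.date_range('1992-01-01', '1997-12-31', freq='D')
temp = xr.DataArray(15 + 10 * np.sin(2 * np.pi * time.dayofyear / 365.25),
                    coords={'time': time}, dims='time', name='temperature')

# The monthly climatology: average all Januaries together, all Februaries, etc.
clim = temp.groupby('time.month').mean('time')
print(clim.month.values)  # the 12 calendar months
```

The result has a `month` dimension of length 12 instead of the original `time` dimension; this is exactly the kind of grouping we apply to CMIP6 data later in this lesson.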

SPARC Climatology

These data sets provide an updated climatology of zonal mean temperatures and winds covering altitudes 0-85 km. They are based on combining data from a variety of sources, and represent the time period 1992-1997 (SPARC Report No. 3, Randel et al 2002). https://www.sparc-climate.org/wp-content/uploads/sites/5/2017/12/SPARC_Report_No3_Dec2002_Climatologies.pdf

The zonal mean temperature climatology is derived using UK Met Office (METO) analyses over 1000-1.5 hPa, combined with Halogen Occultation Experiment (HALOE) temperature climatology over pressures 1.5-0.0046 hPa (~85 km).

The monthly zonal wind climatology is derived from the UARS Reference Atmosphere Project (URAP), combining results from METO analyses with winds from the UARS High Resolution Doppler Imager (HRDI). Details of the URAP winds are described in Swinbank and Ortland (2003).

NCAR’s Climate Data Guide (CDG) provides more information (search SPARC) including strengths and weaknesses of assorted data sets.

Plotting SPARC climatology

The SPARC climatology of temperature (T) and zonal wind (U) is stored in a file called sparc.nc, which can be found on the JupyterHub.

In the JupyterHub Terminal:

cd $HOME/shared-tacco-ns1004k/SPARC
ls
sparc.nc

Exercise

What are the horizontal and vertical resolutions of the SPARC data? Which variables are included in the SPARC dataset?

To answer this question:

  • You can use ncdump or CDO
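If you prefer to stay in Python, the grid spacing can also be read off the coordinate arrays with xarray. The sketch below uses a stand-in dataset with a 4°-spaced latitude grid like the one in sparc.nc; in practice you would open the real file with `xr.open_dataset('sparc.nc')`:

```python
import numpy as np
import xarray as xr

# Stand-in for the real file; in practice: ds = xr.open_dataset('sparc.nc')
ds = xr.Dataset(coords={'lat': np.arange(-80.0, 81.0, 4.0)})

# Horizontal resolution = spacing between latitude grid points
dlat = float(np.diff(ds['lat']).mean())
print(dlat)  # 4.0 degrees
```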

Plotting SPARC climatology using Python

Open SPARC climatology netCDF file

import xarray as xr
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import calendar

mpl.rcParams['figure.figsize'] = [8., 6.]

filename = 'shared-tacco-ns1004k/SPARC/sparc.nc'
ds = xr.open_dataset(filename)
ds
<xarray.Dataset>
Dimensions:   (lat: 41, lev_temp: 33, lev_wind: 46, month: 12)
Coordinates:
  * lev_wind  (lev_wind) float32 1000.0 681.29193 ... 3.1622778e-05
  * lat       (lat) float32 -80.0 -76.0 -72.0 -68.0 ... 68.0 72.0 76.0 80.0
  * month     (month) int32 1 2 3 4 5 6 7 8 9 10 11 12
  * lev_temp  (lev_temp) float32 1000.0 681.29193 ... 0.006812923 0.004641587
Data variables:
    WIND      (lev_wind, lat, month) float32 ...
    TEMP      (lev_temp, lat, month) float32 ...
Attributes:
    creation_date:  Tue Feb 21 09:23:03 CET 2017
    creator:        D. Shea, NCAR
    Conventions:    None
    referencese:    \nRandel, W.J. et al., (2004)                            ...
    WWW_data:       http://www.sparc-climate.org/data-center/data-access/refe...
    WWW:            http://www.sparc-climate.org/
    README:         ftp://sparc-ftp1.ceda.ac.uk/sparc/ref_clim/randel/temp_wi...
    title:          SPARC Intercomparison of Middle Atmosphere Climatologies

SPARC climatology: Plot Temperature

#import color maps
%run shared-tacco-ns1004k/scripts/load_cmap.ipynb

# plot for January (month=0)
ds.TEMP.isel(month=0).plot.contourf(cmap=load_cmap('batlow'),
                                    levels=20)

plt.title(calendar.month_name[1])
plt.yscale('log')
plt.ylim(bottom=1000., top=0.005)  # invert the y-axis: pressure decreases with altitude
plt.xlim(ds.TEMP.lat.min(), ds.TEMP.lat.max())

SPARC climatology: Plot zonal wind

ds.WIND.isel(month=0).plot.contourf(cmap=load_cmap('vik'),
                                    levels=15)

plt.title(calendar.month_name[1])
plt.yscale('log')
plt.ylim(bottom=1000., top=5.0e-5)  # invert the y-axis: pressure decreases with altitude
plt.xlim(ds.WIND.lat.min(), ds.WIND.lat.max())

Multiple plots

To generate 12 subplots (one per month) for the zonal wind and temperature climatology, we can use the Python range(start, stop[, step]) function (remember that the range of integers ends at stop - 1):

import xarray as xr
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import calendar

fig = plt.figure(figsize=[18, 16])
for month in range(1, 13):
    ax = fig.add_subplot(3, 4, month)  # specify number of rows and columns, and axis
    # month=1 corresponds to index 0 in the dataset
    cs = ds.WIND.isel(month=month - 1).plot.contourf(ax=ax, cmap=load_cmap('vik'),
                                                     levels=15, add_colorbar=False)
    ax.set_yscale('log')
    ax.set_ylim(1000., 5.0e-5)  # invert the y-axis: pressure decreases with altitude
    ax.set_title(label=calendar.month_name[month])

fig.suptitle(ds.WIND.attrs['long_name'], fontsize=22)

# Add shared colorbar:
fig.subplots_adjust(right=0.82, wspace=0.5, hspace=0.3) # adjust subplots so we keep a bit of space on the right for the colorbar
cbar_ax = fig.add_axes([0.85, 0.15, 0.01, 0.7]) # Specify where to place the colorbar
fig.colorbar(cs, cax=cbar_ax, label=ds.WIND.attrs['units']) # Add a colorbar to the figure

This allows you to plot, e.g., the zonal wind for each month of the year.

Make a multiple plot for the SPARC temperature

Make the same kind of multiple plot but for the temperature instead.

Home-assignment: Visualize and compare CMIP6 model data to the SPARC climatology

Now that we are familiar with the SPARC climatology, we are ready to analyze CMIP6 model data and compare it with the climatology. This is the goal of the first exercise you will have to complete.

To help you, we give some hints on a methodology you can follow (please note that there is no single correct way to analyze and plot the data, so feel free to deviate from the methodology given), and we will assist you during the class.

Methodology

Which data to analyze and why?

You can analyze many variables from the control run to check its validity, but you should analyze at least temperature and zonal wind, as these are the two variables contained in the SPARC climatology.

You have a lot of model data at hand. Discuss with your fellow students and tutors which CMIP6 data it makes sense to compare to the SPARC climatology.
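Since the SPARC data are zonal means, a natural first step is to average the model fields over longitude. The sketch below uses a small synthetic stand-in for a CMIP6 `ua` field (the real data are loaded in the next section):

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a CMIP6 'ua' field with dims (time, plev, lat, lon)
ds = xr.Dataset(
    {'ua': (('time', 'plev', 'lat', 'lon'),
            np.random.rand(12, 19, 46, 72).astype('float32'))},
    coords={'time': np.arange(12),
            'plev': np.linspace(1e5, 100., 19),
            'lat': np.linspace(-90., 90., 46),
            'lon': np.arange(0., 360., 5.)})

# Zonal mean: average over longitude, making the field comparable to SPARC
ua_zm = ds['ua'].mean('lon')
print(ua_zm.dims)  # ('time', 'plev', 'lat')
```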

Useful Python code

If the data you need are separated into different files, you can load several files at once and concatenate (or combine) them.

# load multiple specified files
path = 'shared-tacco-ns1004k-cmip-betzy/NCAR/CESM2/historical/r10i1p1f1/Amon/ua/gn/latest/'
files = [path+'ua_Amon_CESM2_historical_r10i1p1f1_gn_195001-199912.nc',
         path+'ua_Amon_CESM2_historical_r10i1p1f1_gn_200001-201412.nc']
files
['shared-tacco-ns1004k-cmip-betzy/NCAR/CESM2/historical/r10i1p1f1/Amon/ua/gn/latest/ua_Amon_CESM2_historical_r10i1p1f1_gn_195001-199912.nc',
 'shared-tacco-ns1004k-cmip-betzy/NCAR/CESM2/historical/r10i1p1f1/Amon/ua/gn/latest/ua_Amon_CESM2_historical_r10i1p1f1_gn_200001-201412.nc']

You can also use wildcards (i.e. “*”) for loading many files.

import glob
import os

# load loads of files
path = 'shared-tacco-ns1004k-cmip-betzy/NCAR/CESM2/historical/r10i1p1f1/Amon/ua/gn/latest/'
files =  glob.glob(os.path.join(path, 'ua_Amon_CESM2_historical_r10i1p1f1_gn_*'))
# sort files so they appear by year/month
files.sort()
files
['shared-tacco-ns1004k-cmip-betzy/NCAR/CESM2/historical/r10i1p1f1/Amon/ua/gn/latest/ua_Amon_CESM2_historical_r10i1p1f1_gn_185001-189912.nc',
 'shared-tacco-ns1004k-cmip-betzy/NCAR/CESM2/historical/r10i1p1f1/Amon/ua/gn/latest/ua_Amon_CESM2_historical_r10i1p1f1_gn_190001-194912.nc',
 'shared-tacco-ns1004k-cmip-betzy/NCAR/CESM2/historical/r10i1p1f1/Amon/ua/gn/latest/ua_Amon_CESM2_historical_r10i1p1f1_gn_195001-199912.nc',
 'shared-tacco-ns1004k-cmip-betzy/NCAR/CESM2/historical/r10i1p1f1/Amon/ua/gn/latest/ua_Amon_CESM2_historical_r10i1p1f1_gn_200001-201412.nc']

Now that we have a list of the file names, we can load all files at once into the same xarray dataset using xarray.open_mfdataset(), which automatically concatenates the files along the time dimension.

import xarray as xr

ds = xr.open_mfdataset(files, combine='by_coords')
ds
<xarray.Dataset>
Dimensions:    (lat: 192, lon: 288, nbnd: 2, plev: 19, time: 780)
Coordinates:
  * lon        (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
  * plev       (plev) float64 1e+05 9.25e+04 8.5e+04 7e+04 ... 1e+03 500.0 100.0
  * lat        (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
  * time       (time) object 1950-01-15 12:00:00 ... 2014-12-15 12:00:00
Dimensions without coordinates: nbnd
Data variables:
    ua         (time, plev, lat, lon) float32 dask.array<chunksize=(600, 19, 192, 288), meta=np.ndarray>
    time_bnds  (time, nbnd) object dask.array<chunksize=(600, 2), meta=np.ndarray>
    lat_bnds   (time, lat, nbnd) float64 dask.array<chunksize=(600, 192, 2), meta=np.ndarray>
    lon_bnds   (time, lon, nbnd) float64 dask.array<chunksize=(600, 288, 2), meta=np.ndarray>
Attributes:
    Conventions:            CF-1.7 CMIP-6.2
    activity_id:            CMIP
    branch_method:          standard
    branch_time_in_child:   674885.0
    branch_time_in_parent:  306600.0
    case_id:                24
    cesm_casename:          b.e21.BHIST.f09_g17.CMIP6-historical.010
    contact:                cesm_cmip6@ucar.edu
    creation_date:          2019-03-12T05:37:46Z
    data_specs_version:     01.00.29
    experiment:             Simulation of recent past (1850 to 2014). Impose ...
    experiment_id:          historical
    external_variables:     areacella
    forcing_index:          1
    frequency:              mon
    further_info_url:       https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2.h...
    grid:                   native 0.9x1.25 finite volume grid (192x288 latxlon)
    grid_label:             gn
    initialization_index:   1
    institution:            National Center for Atmospheric Research, Climate...
    institution_id:         NCAR
    license:                CMIP6 model data produced by <The National Center...
    mip_era:                CMIP6
    model_doi_url:          https://doi.org/10.5065/D67H1H0V
    nominal_resolution:     100 km
    parent_activity_id:     CMIP
    parent_experiment_id:   piControl
    parent_mip_era:         CMIP6
    parent_source_id:       CESM2
    parent_time_units:      days since 0001-01-01 00:00:00
    parent_variant_label:   r1i1p1f1
    physics_index:          1
    product:                model-output
    realization_index:      10
    realm:                  atmos
    source:                 CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite v...
    source_id:              CESM2
    source_type:            AOGCM BGC
    sub_experiment:         none
    sub_experiment_id:      none
    table_id:               Amon
    tracking_id:            hdl:21.14100/08811120-5c62-417b-97dd-a477ad5820ef
    variable_id:            ua
    variant_info:           CMIP6 20th century experiments (1850-2014) with C...
    variant_label:          r10i1p1f1

We can apply functions to the xarray.Dataset. For example, we can first group the data by month and then average over each of these months.

dy = ds.groupby('time.month').mean('time')
dy
<xarray.Dataset>
Dimensions:   (bnds: 2, lat: 192, lon: 384, month: 12, plev: 19)
Coordinates:
  * lat       (lat) float64 -89.28 -88.36 -87.42 -86.49 ... 87.42 88.36 89.28
  * plev      (plev) float64 1e+05 9.25e+04 8.5e+04 7e+04 ... 1e+03 500.0 100.0
  * lon       (lon) float64 0.0 0.9375 1.875 2.812 ... 356.2 357.2 358.1 359.1
  * month     (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
Dimensions without coordinates: bnds
Data variables:
    lat_bnds  (month, lat, bnds) float64 dask.array<chunksize=(1, 192, 2), meta=np.ndarray>
    lon_bnds  (month, lon, bnds) float64 dask.array<chunksize=(1, 384, 2), meta=np.ndarray>
    ua        (month, plev, lat, lon) float32 dask.array<chunksize=(1, 19, 192, 384), meta=np.ndarray>

Important note

As we are using xarray, computations are deferred, which means that we still haven’t loaded much in memory. Computations will be done when we need to access the data, for example by plotting it.
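As a small sketch of this behaviour (assuming dask is installed, which open_mfdataset relies on for chunked datasets), you can make the deferred computation explicit:

```python
import numpy as np
import xarray as xr

# A small chunked (dask-backed) DataArray; operations on it are lazy
da = xr.DataArray(np.arange(6.0), dims='time').chunk({'time': 3})

result = (da * 2).mean()         # builds a task graph; nothing is computed yet
value = float(result.compute())  # the computation actually happens here
print(value)  # 5.0
```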

If you want to save your results in a new netCDF file, you can do so with the xarray to_netcdf method:

dy[['ua']].to_netcdf("ua_Amon_MPI-ESM1-2-HR_historical_r1i1p1f1_gn_201001-201412_clim.nc")

You can then load your saved files back when you need them later, without having to repeat your data processing.

ds = xr.open_dataset("ua_Amon_MPI-ESM1-2-HR_historical_r1i1p1f1_gn_201001-201412_clim.nc")

You are now ready to visualize zonal wind and temperature from CMIP6 model data and start the home-assignment where you will compare this with the SPARC climatology!

Home-assignment

How well does your model represent the SPARC climatology?

To answer this question:

  • Write a Python 3 Jupyter notebook and name it exercise_sparc_vs_Your-MODEL-NAME_YOURNAME.ipynb where you need to replace Your-MODEL-NAME by the model you choose to analyze and YOURNAME by your name!
  • Follow the methodology given in this lesson and compare the results from your model data and the SPARC climatology.
  • Your answer should contain plots of the climatologies, an explanation of what you have plotted and why, and a brief discussion of how the climatologies compare.
  • Send it by email to the instructors/teachers. We will discuss your results in the next model practical.

Deadline

Complete the first exercise before the next practical on March 7, 2022!

Key Points

  • The SPARC climatology provides zonal mean temperatures and zonal winds covering altitudes 0-85 km.

  • A climatology is a long-term average that can serve as a reference for evaluating climate model output.