Skip to content

[Feature]: Consider adding vertical regridding feature to reconstruct pressure from hybrid #446

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
tomvothecoder opened this issue Apr 6, 2023 · 23 comments · May be fixed by #756
Open
Assignees
Labels
project: seats-fy25 A goal for SEATS FY25 type: enhancement New enhancement request

Comments

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Apr 6, 2023

Is your feature request related to a problem?

Related to #45 and #388.

Discussed in #440

Originally posted by tomvothecoder March 27, 2023
Hi @xCDAT/core-developers or anybody in the xCDAT community. Have any of you come across a package that implements an xarray-based function similar to cdutil.vertical.reconstructPressureFromHybrid? I am working on refactoring e3sm_diags and e3sm_to_cmip to use xarray/xCDAT and there are references to this function that I need to replace. Thanks!

Docstring:

def reconstructPressureFromHybrid(ps, A, B, Po):
    """
    Reconstruct the Pressure field on sigma levels, from the surface pressure
    :param Ps: Surface pressure
    :param A: Hybrid Conversion Coefficient, such as: p=B.ps+A.Po.
    :param B: Hybrid Conversion Coefficient, such as: p=B.ps+A.Po.
    :param Po: Hybrid Conversion Coefficient, such as: p=B.ps+A.Po
    :param Ps: surface pressure
    .. note::
        A and B are 1d sigma levels.
        Po and Ps must have same units.
    :returns: Pressure field, such as P=B*Ps+A*Po.
    :Example:
        .. doctest:: vertical_reconstructPressureFromHybrid
            >>> P=reconstructPressureFromHybrid(ps,A,B,Po)
    """
	...

Examples Usages:

Describe the solution you'd like

First, check of any other xarray-based package includes this feature.

Describe alternatives you've considered

No response

Additional context

No response

@tomvothecoder tomvothecoder added the type: enhancement New enhancement request label Apr 6, 2023
@tomvothecoder
Copy link
Collaborator Author

Hey @jasonb5, can you follow the "Describe the solution you'd like" and see what we should do?
If no API exists in another xarray-based package, I'm hoping to have this available by the next release around end of May.

e3sm_to_cmip and e3sm_diags are being refactored to use xarray/xcdat and both use cdutil.vertical.reconstructPressureFromHybrid() which blocks progress for these efforts.

@jasonb5
Copy link
Collaborator

jasonb5 commented Apr 7, 2023

@tomvothecoder I'll check this out and see if I can find anything existing otherwise I'll go off the original plan I had for this in #388.

@tomvothecoder
Copy link
Collaborator Author

Thanks @jasonb5!

@dcherian
Copy link

dcherian commented Apr 15, 2023

I would happily merge this in cf-xarray. We already support a few ocean parametric coordinates. The code would go here https://github.com/xarray-contrib/cf-xarray/blob/c36a3687b3820176c394580f86a67ca53b9cadbd/cf_xarray/accessor.py#L2507

@tomvothecoder
Copy link
Collaborator Author

tomvothecoder commented Apr 17, 2023

Hey @dcherian, yes @jasonb5 came across that feature in cf-xarray! He mentioned it might be a good opportunity to make a PR to cf-xarray to support decoding other parametric vertical coordinates.

Ideally, we'd like to have this feature out relatively soon so we can start refactoring some other packages with it.
If cf-xarray has a quick turnaround time for releases, we can definitely contribute with a PR. Alternatively, we can implement this feature in xCDAT in the interim and then port it over to cf-xarray at a later date.

@dcherian
Copy link

If cf-xarray has a quick turnaround time for releases

Yes, at the moment I just release when there's some nice new feature.

@pochedls
Copy link
Collaborator

pochedls commented Aug 3, 2023

In testing xcdat v0.6.0, I wanted to be able to create the pressure coordinate from hybrid (generically across a bunch of models). I ended up creating the following function, which takes the embedded formula (e.g., ds.lev.formula) and converts it into a statement that can be executed by xarray (e.g., interprete_formula(ds.lev.formula) yields p = ds['a']*ds['p0']+ds['b']*ds['ps'] where ds.lev.formula = p = a*p0 + b*ps). So you can then do:

f = ds.plev.formula
fstring = interprete_formula(f)
exec(f)
print(p)

<xarray.DataArray (lev: 91, time: 1980, lat: 128, lon: 256)>
...
Coordinates:

  • lev (lev) float64 0.9988 0.9959 0.992 ... 5.61e-05 2.951e-05 9.869e-06
  • lat (lat) float64 -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
  • lon (lon) float64 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
  • time (time) object 1850-01-16 12:00:00 ... 2014-12-16 12:00:00

This is a clunky function (though it so far appears to be working on CMIP data...). I am absolutely sure you developers can come up with a better method for interpreting the hybrid formula on the fly, but I thought I'd put it here in case it is useful. I also recognize using exec/eval is not a best practice...maybe this can be avoided somehow (I have ideas if anyone ends up working on this).

def interprete_formula(f):
    operator_chars = ['*', '(', ')', '+', '-', '/']
    # first find/remove all parenthesis that are specifying axes
    # p(n,k,j,i) = ap(k) + b(k)*ps(n,j,i) --> p = ap + b*ps
    # get formula indices corresponding to a left hand parenthesis (
    pinds = [i for i in range(len(f)) if f[i] == '(']
    # initialize list to replace sections of formula (e.g., '(i,j,k)' --> '')
    replace = []
    # loop over left hand parenthesis instances
    for p in pinds:
        pref = ''  # initialize parenthesis string
        stopsearch = False  # set stop flag to false
        # loop from left hand parentheis to end of formula
        for i in range(p, len(f)):
            # if stopsearch, continue
            if stopsearch:
                continue
            # get indexed character
            c = f[i]
            # check for close parenthesis )
            # if closed, stopsearch
            if c == ')':
                pref += c # add character to parenthesis string
                stopsearch = True
            else:
                pref += c # add character to parenthesis string
        # check for arithmetic operators inside parenthesis string
        ocs = [c for c in pref[1:-1] if c in operator_chars]
        # if no arithmetic operators, queue parenthesis string for removal
        if len(ocs) == 0:
            replace.append(pref)
    # remove all parenthesis strings with no arithmetic operators
    for r in replace:
        f = f.replace(r, '')

    # further process formula
    fout = ''  # initialize blank formula
    vid = ''  # initialize blank variable
    lhs = f.split('=')[0] # get lhs
    f = f.split('=')[1] # get rhs
    # loop over each char
    for s in f:
        # ignore blank spaces
        if s == ' ':
            continue
        # if arithmetic operator, need to add to formula
        if s in operator_chars:
            # if a variable is not empty, write it out first
            if len(vid) > 0:
                fout = fout + 'ds["' + vid + '"]'
                vid = ''  # reset variable accumulator
            # write out arithmetic operator
            fout = fout + s
        else:
            # char must be part of a variable - add to variable accumulator
            vid += s
    # finished looping over all characters
    # write out variable to formula
    fout = fout + 'ds["' + vid + '"]'
    # add lhs
    fout = lhs + ' = ' + fout
    # return formula
    return fout

@jasonb5
Copy link
Collaborator

jasonb5 commented Jul 18, 2024

@pochedls @tomvothecoder Created PR in the cf_xarray repo xarray-contrib/cf-xarray#528.

@pochedls
Copy link
Collaborator

I think I have this started correctly. @jasonb5 – would decode_vertical_coords have worked for atmosphere_hybrid_sigma_pressure_coordinate before this PR (it is working now, which I assume is because of your PR)?

I pulled your PR (in my normal xcdat environment) import cf-xarray and cf.__path__ shows that I'm pointing to your PR). I then import xcdat (which I believe retains the updated cf-xarray import...).

Syntax for testers (i.e., note to self):

fn = '/p/css03/esgf_publish/CMIP6/CMIP/NASA-GISS/GISS-E3-G/amip/r1i1p3f1/CFmon/ta/gr/v20220707/ta_CFmon_GISS-E3-G_amip_r1i1p3f1_gr_197801-201412.nc'
ds = xc.open_mfdataset(fn)
ds.cf.decode_vertical_coords(outnames={'lev': 'plev'})
ds.plev

@pochedls
Copy link
Collaborator

Summary

I made some progress on this. The main finding is that models with the formula p = ap + b*ps fail to decode (sometimes with an error and sometimes silently). I think this results because under CF conventions, vertical coordinates with the standard_name : atmosphere_hybrid_sigma_pressure_coordinate have two acceptable formulas (with different fields needed): p = ap + b*ps and p = a*p0 + b*ps.

I've broken the problem into two main chunks ("Decode Error" and "Silent Failures"):

Decode Error

The following models produce an error (Required terms p0 are absent in the dataset.): IPSL* / Can* / MPI* / EC-Earth3-AerChem. These models have standard_name : atmosphere_hybrid_sigma_pressure_coordinate with formula p = ap + b*ps.

UKESM* and HadGEM3-GC31-LL produce KeyError: 'standard_name' and have standard_name : atmosphere_hybrid_height_coordinate with formula z = a + b*orog.

Silent failures

A number of models silently fail (i.e., ds.cf.decode_vertical_coords(outnames={'lev': 'plev'}) executes, but does not produce a plev DataArray): CNRM-CM6-1, CNRM-ESM2-1, GFDL-CM4, IPSL-CM6A-MR1. These datasets have standard_name : atmosphere_hybrid_sigma_pressure_coordinate and formula p = ap + b*ps. They don't have p0 in the dataset, but for some reason don't produce the error as above.

CESM2-WACCM and CESM2 also silently fail with standard_name : alevel (and no listed formula).

IPSL-CM6A-LR also silently fails. It has vertical coordinate presnivs with standard name : 'Vertical levels' and no formula.

Dataset Listings

Datasets with no apparent problem

CNRM-CM5: /p/css03/cmip5_css01/data/cmip5/output1/CNRM-CERFACS/CNRM-CM5/amip/mon/atmos/cfMon/r1i1p1/v20111006/ta/
bcc-csm1-1-m: /p/css03/cmip5_css01/data/cmip5/output1/BCC/bcc-csm1-1-m/amip/mon/atmos/cfMon/r1i1p1/v20120910/ta/
GFDL-CM3: /p/css03/esgf_publish/cmip5/output1/NOAA-GFDL/GFDL-CM3/amip/mon/atmos/cfMon/r1i1p1/v20110601/ta/
CESM1-CAM5: /p/css03/cmip5_css01/data/cmip5/output1/NSF-DOE-NCAR/CESM1-CAM5/amip/mon/atmos/cfMon/r2i1p1/v20121129/ta/
HadGEM2-A: /p/css03/cmip5_css02/data/cmip5/output1/MOHC/HadGEM2-A/amip/mon/atmos/cfMon/r1i1p1/ta/1/
inmcm4: /p/css03/cmip5_css02/data/cmip5/output1/INM/inmcm4/amip/mon/atmos/cfMon/r1i1p1/ta/1/
MIROC5: /p/css03/cmip5_css02/data/cmip5/output1/MIROC/MIROC5/amip/mon/atmos/cfMon/r1i1p1/ta/1/
MRI-CGCM3: /p/css03/esgf_publish/cmip5/output1/MRI/MRI-CGCM3/amip/mon/atmos/cfMon/r1i1p1/v20131011/ta/
CCSM4: /p/css03/esgf_publish/cmip5/gdo2/cmip5/output1/NCAR/CCSM4/amip/mon/atmos/cfMon/r7i1p1/v20121031/ta/
INM-CM5-0: /p/css03/esgf_publish/CMIP6/CMIP/INM/INM-CM5-0/amip/r1i1p1f1/CFmon/ta/gr1/v20190610/
INM-CM4-8: /p/css03/esgf_publish/CMIP6/CMIP/INM/INM-CM4-8/amip/r1i1p1f1/CFmon/ta/gr1/v20190528/
TaiESM1: /p/css03/esgf_publish/CMIP6/CMIP/AS-RCEC/TaiESM1/amip/r1i1p1f1/CFmon/ta/gn/v20200817/
MRI-ESM2-0: /p/css03/esgf_publish/CMIP6/CMIP/MRI/MRI-ESM2-0/amip/r2i1p1f1/CFmon/ta/gn/v20190722/
MIROC-ES2L: /p/css03/esgf_publish/CMIP6/CMIP/MIROC/MIROC-ES2L/amip/r3i1p1f2/CFmon/ta/gn/v20200903/
MIROC6: /p/css03/esgf_publish/CMIP6/CMIP/MIROC/MIROC6/amip/r2i1p1f1/CFmon/ta/gn/v20190311/
NorESM2-LM: /p/css03/esgf_publish/CMIP6/CMIP/NCC/NorESM2-LM/amip/r1i1p2f1/CFmon/ta/gn/v20210319/
BCC-CSM2-MR: /p/css03/esgf_publish/CMIP6/CFMIP/BCC/BCC-CSM2-MR/amip/r1i1p1f1/CFmon/ta/gn/v20190801/

Dataset with an error on ds.cf.decode_vertical_coords(outnames={'lev': 'plev'})

IPSL-CM5B-LR: /p/css03/cmip5_css01/data/cmip5/output1/IPSL/IPSL-CM5B-LR/amip/mon/atmos/cfMon/r1i1p1/v20120526/ta/
IPSL-CM5A-MR: /p/css03/cmip5_css01/data/cmip5/output1/IPSL/IPSL-CM5A-MR/amip/mon/atmos/cfMon/r1i1p1/v20120804/ta/
IPSL-CM5A-LR: /p/css03/cmip5_css01/data/cmip5/output1/IPSL/IPSL-CM5A-LR/amip/mon/atmos/cfMon/r1i1p1/v20111119/ta/
MPI-ESM-LR: /p/css03/esgf_publish/cmip5/output1/MPI-M/MPI-ESM-LR/amip/mon/atmos/cfMon/r1i1p1/v20120215/ta/
CanAM4: /p/css03/cmip5_css02/data/cmip5/output1/CCCma/CanAM4/amip/mon/atmos/cfMon/r1i1p1/ta/1/
UKESM1-0-LL: /p/css03/esgf_publish/CMIP6/CMIP/MOHC/UKESM1-0-LL/amip/r1i1p1f4/CFmon/ta/gn/v20210817/
HadGEM3-GC31-LL: /p/css03/esgf_publish/CMIP6/CMIP/MOHC/HadGEM3-GC31-LL/amip/r5i1p1f3/CFmon/ta/gn/v20210427/
CanESM5: /p/css03/esgf_publish/CMIP6/CMIP/CCCma/CanESM5/amip/r2i1p1f1/CFmon/ta/gn/v20190429/
MPI-ESM-1-2-HAM: /p/css03/esgf_publish/CMIP6/CMIP/HAMMOZ-Consortium/MPI-ESM-1-2-HAM/amip/r2i1p1f1/CFmon/ta/gn/v20190628/
EC-Earth3-AerChem: /p/css03/esgf_publish/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3-AerChem/amip/r1i1p1f1/CFmon/ta/gr/v20200910/
MPI-ESM1-2-LR: /p/css03/esgf_publish/CMIP6/CMIP/MPI-M/MPI-ESM1-2-LR/amip/r2i1p1f1/CFmon/ta/gn/v20190815/
MPI-ESM1-2-HR: /p/css03/esgf_publish/CMIP6/CMIP/MPI-M/MPI-ESM1-2-HR/amip/r2i1p1f1/CFmon/ta/gn/v20190710/

Datasets with no error, but do not produce a plev field

CNRM-CM6-1: /p/css03/esgf_publish/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1/amip/r1i1p1f2/CFmon/ta/gr/v20181203/
CNRM-ESM2-1: /p/css03/esgf_publish/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/amip/r1i1p1f2/CFmon/ta/gr/v20181205/
CESM2-WACCM: /p/css03/esgf_publish/CMIP6/CMIP/NCAR/CESM2-WACCM/amip/r2i1p1f1/CFmon/ta/gn/v20190220/
CESM2: /p/css03/esgf_publish/CMIP6/CMIP/NCAR/CESM2/amip/r1i1p1f1/CFmon/ta/gn/v20190218/
GFDL-CM4: /p/css03/esgf_publish/CMIP6/CMIP/NOAA-GFDL/GFDL-CM4/amip/r1i1p1f1/CFmon/ta/gr1/v20180701/
IPSL-CM6A-MR1: /p/css03/esgf_publish/CMIP6/CMIP/IPSL/IPSL-CM6A-MR1/amip/r1i1p1f1/CFmon/ta/gr/v20230220/
IPSL-CM6A-LR: /p/css03/esgf_publish/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/amip/r2i1p1f1/CFmon/ta/gr/v20181109/

@dcherian
Copy link

Now that is amazing.

Given

The choice of whether a(k) or ap(k) is used depends on model formulation; the former is a dimensionless fraction, the latter a pressure value.

perhaps we can check units to decide which representation to use.

@pochedls
Copy link
Collaborator

FYI – I had focused on testing atmospheric vertical coordinates. Ideally we could get some help testing ocean coordinates (@durack1?); otherwise, this might take awhile to stress test. Although I think CMIP data I've used is mostly on depth coordinates: are there CMIP ocean variables that are typically on coordinate systems that would need to be "decoded" (ocean sigma, s-coordinate, etc.)?

@jasonb5
Copy link
Collaborator

jasonb5 commented Nov 20, 2024

@pochedls Can you check these files again, the current implementation will check that the dataset contains either one or the other form of atmosphere_hybrid_sigma_pressure_coordinate and raise an appropriate error.

As for xCDAT's support, currently target_data can be a str or xr.DataArray. This assumes the decoded DataArray is in the Dataset or has been decoded externally. As another option we could accept xr.Dataset and try to call ds.cf.decode_vertical_coords to populate target_data for the user.

@pochedls
Copy link
Collaborator

@jasonb5 - I am wondering if I am looking at the right thing. I grabbed the latest cf-xarray and nothing is working (with the same code). One example, from a model that worked before:

I'm working with cf-xarray main (15259).

# open file
fn = '/p/css03/cmip5_css01/data/cmip5/output1/CNRM-CERFACS/CNRM-CM5/amip/mon/atmos/cfMon/r1i1p1/v20111006/ta/ta_cfMon_CNRM-CM5_amip_r1i1p1_197901-200812.nc'
ds = xc.open_dataset(fn)
# only operate on a few time steps
ds = ds.isel(time=np.arange(10))
# show some metadata
print(ds.lev.standard_name)
print(ds.lev.formula)
# decode
ds.cf.decode_vertical_coords(outnames={'lev': 'plev'})

atmosphere_hybrid_sigma_pressure_coordinate
p = ap0 + bps

KeyError Traceback (most recent call last)
Cell In[16], line 1
----> 1 ds.cf.decode_vertical_coords(outnames={'lev': 'plev'})

File ~/bin/anaconda3/envs/xcdat/lib/python3.12/site-packages/cf_xarray/accessor.py:2793, in CFDatasetAccessor.decode_vertical_coords(self, outnames, prefix)
2787 raise KeyError(
2788 f"Variable {value!r} is required to decode coordinate for {dim!r}"
2789 " but it is absent in the Dataset."
2790 )
2791 terms[key] = ds[value]
-> 2793 absent_terms = requirements[stdname] - set(terms)
2794 if absent_terms:
2795 raise KeyError(f"Required terms {absent_terms} absent in dataset.")

KeyError: 'atmosphere_hybrid_sigma_pressure_coordinate'

@jasonb5
Copy link
Collaborator

jasonb5 commented Nov 21, 2024

@pochedls That error looks like it's from an old version of the code prior to my PR. That line is no longer there on main.

@pochedls
Copy link
Collaborator

@jasonb5 – sorry about that. I think I was working with a imported/cached cf-xarray import.

The code above is working, but ds.plev is strange:

<xarray.DataArray 'plev' ()> Size: 8B
array(AtmosphereHybridSigmaPressure(b=<xarray.DataArray 'b' (lev: 31)> Size: 248B
[31 values with dtype=float64]
Coordinates:

  • lev (lev) float64 248B 0.9961 0.9826 0.959 ... 0.04936 0.02962 0.009872
    Attributes:
    long_name: vertical coordinate formula term: b(k)
    original_units: 1
    history: 2011-05-17T15:25:36Z altered by CMOR: Converted units fr..., ps=<xarray.DataArray 'ps' (time: 10, lat: 128, lon: 256)> Size: 1MB
    [327680 values with dtype=float32]
    Coordinates:
  • lat (lat) float64 1kB -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
  • lon (lon) float64 2kB 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
  • time (time) object 80B 1979-01-16 12:00:00 ... 1979-10-16 12:00:00
    Attributes:
    standard_name: surface_air_pressure
    long_name: Surface Air Pressure
    comment: not, in general, the same as mean sea-level pressure
    units: Pa
    cell_methods: time: mean, p0=<xarray.DataArray 'p0' ()> Size: 4B
    [1 values with dtype=float32]
    Attributes:
    long_name: vertical coordinate formula term: reference pressure
    units: Pa, a=<xarray.DataArray 'a' (lev: 31)> Size: 248B
    [31 values with dtype=float64]
    Coordinates:
  • lev (lev) float64 248B 0.9961 0.9826 0.959 ... 0.04936 0.02962 0.009872
    Attributes:
    long_name: vertical coordinate formula term: a(k)
    original_units: 1
    history: 2011-05-17T15:25:36Z altered by CMOR: Converted units fr..., ap=None),
    dtype=object)
    Coordinates:
    plev object 8B AtmosphereHybridSigmaPressure(b=<xarray.DataArray 'b' ...

@pochedls
Copy link
Collaborator

@jasonb5 – regarding the resulting ds.plev object being weird – I think this probably has a different structure than it did before:

ds.plev.shape

()

ds.plev.to_numpy()

array(AtmosphereHybridSigmaPressure(b=<xarray.DataArray 'b' (lev: 31)> Size: 248B
[31 values with dtype=float64]
Coordinates:

  • lev (lev) float64 248B 0.9961 0.9826 0.959 ... 0.04936 0.02962 0.009872
    plev object 8B AtmosphereHybridSigmaPressure(b=<xarray.DataArray 'b' ...
    Attributes:
    long_name: vertical coordinate formula term: b(k)
    original_units: 1
    history: 2011-05-17T15:25:36Z altered by CMOR: Converted units fr..., ps=<xarray.DataArray 'ps' (time: 10, lat: 128, lon: 256)> Size: 1MB
    [327680 values with dtype=float32]
    Coordinates:
  • lat (lat) float64 1kB -88.93 -87.54 -86.14 -84.74 ... 86.14 87.54 88.93
  • lon (lon) float64 2kB 0.0 1.406 2.812 4.219 ... 354.4 355.8 357.2 358.6
  • time (time) object 80B 1979-01-16 12:00:00 ... 1979-10-16 12:00:00
    plev object 8B AtmosphereHybridSigmaPressure(b=<xarray.DataArray 'b' ...
    Attributes:
    standard_name: surface_air_pressure
    long_name: Surface Air Pressure
    comment: not, in general, the same as mean sea-level pressure
    units: Pa
    cell_methods: time: mean, p0=<xarray.DataArray 'p0' ()> Size: 4B
    [1 values with dtype=float32]
    Coordinates:
    plev object 8B AtmosphereHybridSigmaPressure(b=<xarray.DataArray 'b' ...
    Attributes:
    long_name: vertical coordinate formula term: reference pressure
    units: Pa, a=<xarray.DataArray 'a' (lev: 31)> Size: 248B
    [31 values with dtype=float64]
    Coordinates:
  • lev (lev) float64 248B 0.9961 0.9826 0.959 ... 0.04936 0.02962 0.009872
    plev object 8B AtmosphereHybridSigmaPressure(b=<xarray.DataArray 'b' ...
    Attributes:
    long_name: vertical coordinate formula term: a(k)
    original_units: 1
    history: 2011-05-17T15:25:36Z altered by CMOR: Converted units fr..., ap=None),
    dtype=object)

It's almost like plev is embedded in itself or something...

But I think we expect a dataarray object that has the form [time, lev, lat, lon] with the values corresponding to the reconstructed pressure (I can't figure out how to get the pressure values out). This creates problems when passing this dataarray to the vertical regridder via dsr = ds.regridder.vertical("ta", nplev, tool="xgcm", method="linear", target_data=ds.plev).

@jasonb5
Copy link
Collaborator

jasonb5 commented Mar 12, 2025

@pochedls That's a bug, I'll put a fix into cf_xarray for that.

@tomvothecoder @lee1043 @chengzhuzhang As for integrating this into xCDAT, currently it's on the user to decode the variable then pass it to the method. We could add an option for target_data="infer" or target_data="auto" where xCDAT will attempt to decode the parametric vertical coordinate and pass it to xgcm. We'd still default to None which is just a straight remapping between Z coordinates.

Does that seem useful or fit into any workflow. I'm open to any other ideas of bringing this into xCDAT.

@jasonb5
Copy link
Collaborator

jasonb5 commented Mar 15, 2025

@pochedls I've put in a fix for issue above (xarray-contrib/cf-xarray#561).

@dcherian
Copy link

Issued a release with that fix. Thanks @jasonb5

@pochedls
Copy link
Collaborator

Thanks for getting this upgraded @jasonb5 and @dcherian!

This version appears to be more robust with 10 datasets working that were not working before. decode_vertical_coords seems to work on 26/35 datasets now (i.e., the output dimensions match what I would expect – I didn't plot everything as an additional sanity check).

Residual issues:

  • The two models with a decode error have a rarely used vertical grid (uses altitude information), so it may not be critical that we get this to work, but the error I am getting is a key error – so this might be worth just checking to make sure that we should not be able to decode this dataset.
  • I don't know what is going on with the silent failure datasets. Checkout CNRM-CM6-1. It runs without producing an error/warning, but does not result in plev data.
  • The no_lev dataset is weird (presnivs as a vertical coordinate). If I change the decode call to ds.cf.decode_vertical_coords(outnames={'presnivs': 'plev'}), should this work? It doesn't, but maybe this doesn't conform to CF-conventions.

Datasets that work now, but previously had an error on ds.cf.decode_vertical_coords(outnames={'lev': 'plev'})
IPSL-CM5B-LR: /p/css03/cmip5_css01/data/cmip5/output1/IPSL/IPSL-CM5B-LR/amip/mon/atmos/cfMon/r1i1p1/v20120526/ta/
IPSL-CM5A-MR: /p/css03/cmip5_css01/data/cmip5/output1/IPSL/IPSL-CM5A-MR/amip/mon/atmos/cfMon/r1i1p1/v20120804/ta/
IPSL-CM5A-LR: /p/css03/cmip5_css01/data/cmip5/output1/IPSL/IPSL-CM5A-LR/amip/mon/atmos/cfMon/r1i1p1/v20111119/ta/
MPI-ESM-LR: /p/css03/cmip5_css01/data/cmip5/output1/MPI-M/MPI-ESM-LR/amip/mon/atmos/cfMon/r1i1p1/v20120215/ta/
CanAM4: /p/css03/cmip5_css02/data/cmip5/output1/CCCma/CanAM4/amip/mon/atmos/cfMon/r1i1p1/ta/1/
CanESM5: /p/css03/esgf_publish/CMIP6/CMIP/CCCma/CanESM5/amip/r2i1p1f1/CFmon/ta/gn/v20190429/
MPI-ESM-1-2-HAM: /p/css03/esgf_publish/CMIP6/CMIP/HAMMOZ-Consortium/MPI-ESM-1-2-HAM/amip/r2i1p1f1/CFmon/ta/gn/v20190628/
EC-Earth3-AerChem: /p/css03/esgf_publish/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3-AerChem/amip/r1i1p1f1/CFmon/ta/gr/v20200910/
MPI-ESM1-2-LR: /p/css03/esgf_publish/CMIP6/CMIP/MPI-M/MPI-ESM1-2-LR/amip/r2i1p1f1/CFmon/ta/gn/v20190815/
MPI-ESM1-2-HR: /p/css03/esgf_publish/CMIP6/CMIP/MPI-M/MPI-ESM1-2-HR/amip/r2i1p1f1/CFmon/ta/gn/v20190710/

no_problem
CNRM-CM5: /p/css03/cmip5_css01/data/cmip5/output1/CNRM-CERFACS/CNRM-CM5/amip/mon/atmos/cfMon/r1i1p1/v20111006/ta/
bcc-csm1-1-m: /p/css03/cmip5_css01/data/cmip5/output1/BCC/bcc-csm1-1-m/amip/mon/atmos/cfMon/r1i1p1/v20120910/ta/
GFDL-CM3: /p/css03/cmip5_css01/data/cmip5/output1/NOAA-GFDL/GFDL-CM3/amip/mon/atmos/cfMon/r1i1p1/v20110601/ta/
CESM1-CAM5: /p/css03/cmip5_css01/data/cmip5/output1/NSF-DOE-NCAR/CESM1-CAM5/amip/mon/atmos/cfMon/r2i1p1/v20121129/ta/
HadGEM2-A: /p/css03/cmip5_css02/data/cmip5/output1/MOHC/HadGEM2-A/amip/mon/atmos/cfMon/r1i1p1/ta/1/
inmcm4: /p/css03/cmip5_css02/data/cmip5/output1/INM/inmcm4/amip/mon/atmos/cfMon/r1i1p1/ta/1/
MIROC5: /p/css03/cmip5_css02/data/cmip5/output1/MIROC/MIROC5/amip/mon/atmos/cfMon/r1i1p1/ta/1/
CCSM4: /p/css03/esgf_publish/cmip5/gdo2/cmip5/output1/NCAR/CCSM4/amip/mon/atmos/cfMon/r7i1p1/v20121031/ta/
INM-CM5-0: /p/css03/esgf_publish/CMIP6/CMIP/INM/INM-CM5-0/amip/r1i1p1f1/CFmon/ta/gr1/v20190610/
INM-CM4-8: /p/css03/esgf_publish/CMIP6/CMIP/INM/INM-CM4-8/amip/r1i1p1f1/CFmon/ta/gr1/v20190528/
TaiESM1: /p/css03/esgf_publish/CMIP6/CMIP/AS-RCEC/TaiESM1/amip/r1i1p1f1/CFmon/ta/gn/v20200817/
MRI-ESM2-0: /p/css03/esgf_publish/CMIP6/CMIP/MRI/MRI-ESM2-0/amip/r1i1p1f1/CFmon/ta/gn/v20190625/
MIROC-ES2L: /p/css03/esgf_publish/CMIP6/CMIP/MIROC/MIROC-ES2L/amip/r1i1p1f2/CFmon/ta/gn/v20200903/
MIROC6: /p/css03/esgf_publish/CMIP6/CMIP/MIROC/MIROC6/amip/r1i1p1f1/CFmon/ta/gn/v20190311/
NorESM2-LM: /p/css03/esgf_publish/CMIP6/CMIP/NCC/NorESM2-LM/amip/r1i1p2f1/CFmon/ta/gn/v20210319/
BCC-CSM2-MR: /p/css03/esgf_publish/CMIP6/CFMIP/BCC/BCC-CSM2-MR/amip/r1i1p1f1/CFmon/ta/gn/v20190801/

decode_error
UKESM1-0-LL: /p/css03/esgf_publish/CMIP6/CMIP/MOHC/UKESM1-0-LL/amip/r1i1p1f4/CFmon/ta/gn/v20210817/
HadGEM3-GC31-LL: /p/css03/esgf_publish/CMIP6/CMIP/MOHC/HadGEM3-GC31-LL/amip/r5i1p1f3/CFmon/ta/gn/v20210427/

silent_fail
CNRM-CM6-1: /p/css03/esgf_publish/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1/amip/r1i1p1f2/CFmon/ta/gr/v20181203/
CNRM-ESM2-1: /p/css03/esgf_publish/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/amip/r1i1p1f2/CFmon/ta/gr/v20181205/
CESM2-WACCM: /p/css03/esgf_publish/CMIP6/CMIP/NCAR/CESM2-WACCM/amip/r2i1p1f1/CFmon/ta/gn/v20190220/
CESM2: /p/css03/esgf_publish/CMIP6/CMIP/NCAR/CESM2/amip/r1i1p1f1/CFmon/ta/gn/v20190218/
GFDL-CM4: /p/css03/esgf_publish/CMIP6/CMIP/NOAA-GFDL/GFDL-CM4/amip/r1i1p1f1/CFmon/ta/gr1/v20180701/
IPSL-CM6A-MR1: /p/css03/esgf_publish/CMIP6/CMIP/IPSL/IPSL-CM6A-MR1/amip/r1i1p1f1/CFmon/ta/gr/v20230220/

no_lev
IPSL-CM6A-LR: /p/css03/esgf_publish/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/amip/r2i1p1f1/CFmon/ta/gr/v20181109/

@jasonb5
Copy link
Collaborator

jasonb5 commented Mar 25, 2025

@pochedls Great to hear! I'll look at the remaining failures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
project: seats-fy25 A goal for SEATS FY25 type: enhancement New enhancement request
Projects
Status: In Progress
Development

Successfully merging a pull request may close this issue.

4 participants