Compare revisions
Showing with 3563 additions and 737 deletions
source diff could not be displayed: it is stored in LFS.
File added
@@ -5,27 +5,28 @@ dependencies:
- xarray
# ML
- tensorflow
- pytorch
# viz
- matplotlib-base
# - cartopy
# scoring
- xskillscore>=0.0.20 # includes sklearn
# data access
- intake
- fsspec
- zarr
- s3fs
- intake-xarray
- cfgrib
- eccodes
- nc-time-axis
- pydap
- h5netcdf
- netcdf4
- pip
- pip:
- climetlab >= 0.8.0
- climetlab_s2s_ai_challenge >= 0.7.1
- configargparse # for weatherbench
- git+https://github.com/phausamann/sklearn-xarray.git@develop
- netcdf4==1.5.4
prefix: "/opt/conda"
%% Cell type:markdown id: tags:
# Train ML model for predictions of week 3-4 & 5-6
This notebook creates a Machine Learning `ML_model` to predict weeks 3-4 & 5-6 based on `S2S` weeks 3-4 & 5-6 forecasts; the predictions are compared to `CPC` observations for the [`s2s-ai-challenge`](https://s2s-ai-challenge.github.io/).
%% Cell type:markdown id: tags:
# Synopsis
%% Cell type:markdown id: tags:
## Method: `name`
- description
- a few details
%% Cell type:markdown id: tags:
## Data used
Training-input for Machine Learning model:
- renku datasets, climetlab, IRIDL
Forecast-input for Machine Learning model:
- renku datasets, climetlab, IRIDL
Compare Machine Learning model forecast against ground truth:
- renku datasets, climetlab, IRIDL
%% Cell type:markdown id: tags:
## Resources used
for training, details in reproducibility
- platform: renku
- memory: 8 GB
- processors: 2 CPU
- storage required: 10 GB
%% Cell type:markdown id: tags:
## Safeguards
All points have to be [x] checked. If not, your submission is invalid.
Changes to the code after submissions are not possible, as the `commit` before the `tag` will be reviewed.
(Only in exceptional cases, and if prior effort toward reproducibility is evident, may improvements to readability and reproducibility be allowed after November 1st 2021.)
%% Cell type:markdown id: tags:
### Safeguards to prevent [overfitting](https://en.wikipedia.org/wiki/Overfitting?wprov=sfti1)
If the organizers suspect overfitting, your contribution can be disqualified.
- [ ] We did not use 2020 observations in training (explicit overfitting and cheating)
- [ ] We did not repeatedly verify our model on 2020 observations and incrementally improve our RPSS (implicit overfitting)
- [ ] We provide RPSS scores for the training period with script `skill_by_year`, see in section 6.3 `predict`.
- [ ] We tried our best to prevent [data leakage](https://en.wikipedia.org/wiki/Leakage_(machine_learning)?wprov=sfti1).
- [ ] We honor the `train-validate-test` [split principle](https://en.wikipedia.org/wiki/Training,_validation,_and_test_sets). This means that the hindcast data is split into `train` and `validate`, whereas `test` is withheld.
- [ ] We did not use `test` explicitly in training or implicitly in incrementally adjusting parameters.
- [ ] We considered [cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)).
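%% Cell type:markdown id: tags:
The `train-validate-test` split above can be illustrated as a partition of forecast reference years. A minimal sketch (the year boundaries follow this template's defaults; the helper name `split_years` is ours, not part of the challenge scripts):
%% Cell type:code id: tags:
``` python
# Hypothetical helper: partition years to honor the train-validate-test split
# (the 2020 test year stays untouched until the final scoring).
def split_years(years, valid_years=('2018', '2019'), test_years=('2020',)):
    train = [y for y in years if y not in valid_years and y not in test_years]
    valid = [y for y in years if y in valid_years]
    test = [y for y in years if y in test_years]
    # the three sets must be disjoint
    assert not set(train) & set(valid) and not set(train) & set(test)
    return train, valid, test

years = [str(y) for y in range(2000, 2021)]  # hindcast years plus the 2020 test year
train, valid, test = split_years(years)
```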
%% Cell type:markdown id: tags:
### Safeguards for Reproducibility
Notebook/code must be independently reproducible from scratch by the organizers (after the competition); if this is not possible, no prize will be awarded.
- [ ] All training data is publicly available (no pre-trained private neural networks, as they are not reproducible for us)
- [ ] Code is well documented, readable and reproducible.
- [ ] Code to reproduce training and predictions is preferred to run within a day on the described architecture. If the training takes longer than a day, please justify why this is needed. Please do not submit training pipelines which take weeks to train.
%% Cell type:markdown id: tags:
# Todos to improve template
This is just a demo.
- [ ] for both variables
- [ ] for both `lead_time`s
- [ ] ensure probabilistic prediction outcome with `category` dim
%% Cell type:markdown id: tags:
# Imports
%% Cell type:code id: tags:
``` python
import tensorflow.keras as keras
from tensorflow.keras.layers import Input, Dense, Flatten
from tensorflow.keras.models import Sequential
import matplotlib.pyplot as plt
import xarray as xr
xr.set_options(display_style='text')
from dask.utils import format_bytes
import xskillscore as xs
```
%% Cell type:markdown id: tags:
# Get training data
preprocessing of input data may be done in a separate notebook/script
%% Cell type:markdown id: tags:
## Hindcast
get weekly initialized hindcasts
%% Cell type:code id: tags:
``` python
# consider renku datasets
#! renku storage pull path
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
## Observations
corresponding to hindcasts
%% Cell type:code id: tags:
``` python
# consider renku datasets
#! renku storage pull path
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
# ML model
%% Cell type:code id: tags:
``` python
bs=32
import numpy as np
class DataGenerator(keras.utils.Sequence):
    def __init__(self, data=None, verif_data=None, batch_size=bs, shuffle=True, load=True):
        """
        Data generator
        Template from https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly
        Args:
            data: forecast input (xr.DataArray with a `time` dimension)
            verif_data: verifying observations with the same dimensionality
            batch_size: batch size
            shuffle: bool. If True, data is shuffled.
            load: bool. If True, data is loaded into RAM.
        """
        self.data = data
        self.verif_data = verif_data
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.n_samples = data.time.size if data is not None else 0
        self.on_epoch_end()
        # For some weird reason calling .load() earlier messes up the mean and std computations
        if load and data is not None:
            print('Loading data into RAM')
            self.data.load()

    def __len__(self):
        'Denotes the number of batches per epoch'
        return int(np.ceil(self.n_samples / self.batch_size))

    def __getitem__(self, i):
        'Generate one batch of data'
        idxs = self.idxs[i * self.batch_size:(i + 1) * self.batch_size]
        # got all nan if nans not masked
        X = self.data.isel(time=idxs).fillna(0.).values
        y = self.verif_data.isel(time=idxs).fillna(0.).values
        return X, y

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        self.idxs = np.arange(self.n_samples)
        if self.shuffle:
            np.random.shuffle(self.idxs)
```
%% Cell type:markdown id: tags:
## data prep: train, valid, test
%% Cell type:code id: tags:
``` python
# time is the forecast_reference_time
time_train_start,time_train_end='2000','2017'
time_valid_start,time_valid_end='2018','2019'
time_test = '2020'
```
%% Cell type:code id: tags:
``` python
dg_train = DataGenerator()
```
%% Cell type:code id: tags:
``` python
dg_valid = DataGenerator()
```
%% Cell type:code id: tags:
``` python
dg_test = DataGenerator()
```
%% Cell type:markdown id: tags:
## `fit`
%% Cell type:code id: tags:
``` python
cnn = keras.models.Sequential([])
```
%% Cell type:code id: tags:
``` python
cnn.summary()
```
%% Cell type:code id: tags:
``` python
cnn.compile(keras.optimizers.Adam(1e-4), 'mse')
```
%% Cell type:code id: tags:
``` python
import warnings
warnings.simplefilter("ignore")
```
%% Cell type:code id: tags:
``` python
cnn.fit(dg_train, epochs=1, validation_data=dg_valid)
```
%% Cell type:markdown id: tags:
## `predict`
Create predictions and print `mean(variable, lead_time, longitude, weighted latitude)` RPSS for all years as calculated by `skill_by_year`.
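%% Cell type:markdown id: tags:
For orientation, the ranked probability score underlying the RPSS can be computed by hand. A numpy sketch for the usual tercile setup, with climatology assigning 1/3 per category (the exact area weighting of `skill_by_year` lives in `scripts` and is not reproduced here):
%% Cell type:code id: tags:
``` python
import numpy as np

def rps(probs, obs_category):
    """Ranked probability score for tercile forecasts.
    probs: (n_samples, 3) forecast probabilities; obs_category: (n_samples,) in {0, 1, 2}."""
    cum_fc = np.cumsum(probs, axis=1)                      # cumulative forecast probabilities
    cum_obs = np.cumsum(np.eye(3)[obs_category], axis=1)   # cumulative one-hot observations
    return ((cum_fc - cum_obs) ** 2).sum(axis=1).mean()

probs = np.array([[0.5, 0.3, 0.2]])        # one forecast favoring the lower tercile
obs = np.array([0])                        # observed lower tercile
rps_fc = rps(probs, obs)
rps_clim = rps(np.full((1, 3), 1 / 3), obs)
rpss = 1 - rps_fc / rps_clim               # > 0 means better than climatology
```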
%% Cell type:code id: tags:
``` python
from scripts import skill_by_year
```
%% Cell type:code id: tags:
``` python
def create_predictions(model, dg):
    """Create non-iterative predictions"""
    preds = model.predict(dg).squeeze()
    # transform
    return preds
```
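%% Cell type:markdown id: tags:
The `# transform` step above is left open. One hedged possibility is to turn the deterministic output into tercile "probabilities" by thresholding against climatological tercile edges; the helper name and the edge values below are illustrative, not part of the template's API:
%% Cell type:code id: tags:
``` python
import numpy as np

def to_tercile_categories(pred, lower, upper):
    """Crude one-hot tercile assignment from a deterministic field.
    pred, lower, upper: broadcastable arrays; returns (3, ...) summing to 1 per point."""
    below = (pred < lower)
    above = (pred > upper)
    normal = ~(below | above)
    return np.stack([below, normal, above]).astype(float)

pred = np.array([-1.2, 0.1, 0.9])                       # toy standardized anomalies
probs = to_tercile_categories(pred, lower=-0.43, upper=0.43)
```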
%% Cell type:markdown id: tags:
### `predict` training period in-sample
%% Cell type:code id: tags:
``` python
preds_is = create_predictions(cnn, dg_train)
```
%% Cell type:code id: tags:
``` python
skill_by_year(preds_is)
```
%% Cell type:markdown id: tags:
### `predict` valid out-of-sample
%% Cell type:code id: tags:
``` python
preds_os = create_predictions(cnn, dg_valid)
```
%% Cell type:code id: tags:
``` python
skill_by_year(preds_os)
```
%% Cell type:markdown id: tags:
### `predict` test
%% Cell type:code id: tags:
``` python
preds_test = create_predictions(cnn, dg_test)
```
%% Cell type:code id: tags:
``` python
skill_by_year(preds_test)
```
%% Cell type:markdown id: tags:
# Submission
%% Cell type:code id: tags:
``` python
preds_test.sizes # expect: category(3), longitude, latitude, lead_time(2), forecast_time (53)
```
%% Cell type:code id: tags:
``` python
from scripts import assert_predictions_2020
assert_predictions_2020(preds_test)
```
%% Cell type:code id: tags:
``` python
preds_test.to_netcdf('../submissions/ML_prediction_2020.nc')
```
%% Cell type:code id: tags:
``` python
#!git add ../submissions/ML_prediction_2020.nc
#!git add ML_forecast_template.ipynb
```
%% Cell type:code id: tags:
``` python
#!git commit -m "commit submission for my_method_name" # whatever message you want
```
%% Cell type:code id: tags:
``` python
#!git tag "submission-my_method_name-0.0.1" # if this is to be checked by scorer, only the last submitted==tagged version will be considered
```
%% Cell type:code id: tags:
``` python
#!git push --tags
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
# Reproducibility
%% Cell type:markdown id: tags:
## memory
%% Cell type:code id: tags:
``` python
# https://phoenixnap.com/kb/linux-commands-check-memory-usage
!free -g
```
%% Cell type:markdown id: tags:
## CPU
%% Cell type:code id: tags:
``` python
!lscpu
```
%% Cell type:markdown id: tags:
## software
%% Cell type:code id: tags:
``` python
!conda list
```
%% Cell type:code id: tags:
``` python
```
......
%% Cell type:markdown id: tags:
# Train ML model to correct predictions of week 3-4 & 5-6
This notebook creates a Machine Learning `ML_model` to predict weeks 3-4 & 5-6 based on `S2S` weeks 3-4 & 5-6 forecasts; the predictions are compared to `CPC` observations for the [`s2s-ai-challenge`](https://s2s-ai-challenge.github.io/).
%% Cell type:markdown id: tags:
# Synopsis
%% Cell type:markdown id: tags:
## Method: `ML-based mean bias reduction`
- calculate the ML-based bias from the 2000-2019 deterministic ensemble-mean hindcasts
- remove that ML-based bias from the 2020 deterministic ensemble-mean forecast
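%% Cell type:markdown id: tags:
The two steps can be sketched with plain arrays (synthetic shapes and values; the real notebook works on `(forecast_time, latitude, longitude)` xarray objects, grouped by week of year):
%% Cell type:code id: tags:
``` python
import numpy as np

rng = np.random.default_rng(0)
obs_clim = rng.normal(size=(53, 4, 8))     # weekly climatology on a toy 4x8 grid
# 20 hindcast years with a constant +2.0 warm bias plus noise
hind = obs_clim[None] + 2.0 + 0.1 * rng.normal(size=(20, 53, 4, 8))
obs = obs_clim[None] + 0.1 * rng.normal(size=(20, 53, 4, 8))

bias = (hind - obs).mean(axis=0)           # step 1: mean bias per week and grid point
fct_2020 = obs_clim + 2.0                  # toy 2020 forecast carrying the same bias
corrected = fct_2020 - bias                # step 2: remove it from the 2020 forecast
```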
%% Cell type:markdown id: tags:
## Data used
type: renku datasets
Training-input for Machine Learning model:
- hindcasts of models:
- ECMWF: `ecmwf_hindcast-input_2000-2019_biweekly_deterministic.zarr`
Forecast-input for Machine Learning model:
- real-time 2020 forecasts of models:
- ECMWF: `ecmwf_forecast-input_2020_biweekly_deterministic.zarr`
Compare Machine Learning model forecast against ground truth:
- `CPC` observations:
- `hindcast-like-observations_biweekly_deterministic.zarr`
- `forecast-like-observations_2020_biweekly_deterministic.zarr`
%% Cell type:markdown id: tags:
## Resources used
for training, details in reproducibility
- platform: renku
- memory: 8 GB
- processors: 2 CPU
- storage required: 10 GB
%% Cell type:markdown id: tags:
## Safeguards
All points have to be [x] checked. If not, your submission is invalid.
Changes to the code after submissions are not possible, as the `commit` before the `tag` will be reviewed.
(Only in exceptional cases, and if prior effort toward reproducibility is evident, may improvements to readability and reproducibility be allowed after November 1st 2021.)
%% Cell type:markdown id: tags:
### Safeguards to prevent [overfitting](https://en.wikipedia.org/wiki/Overfitting?wprov=sfti1)
If the organizers suspect overfitting, your contribution can be disqualified.
- [x] We did not use 2020 observations in training (explicit overfitting and cheating)
- [x] We did not repeatedly verify our model on 2020 observations and incrementally improve our RPSS (implicit overfitting)
- [x] We provide RPSS scores for the training period with script `print_RPS_per_year`, see in section 6.3 `predict`.
- [x] We tried our best to prevent [data leakage](https://en.wikipedia.org/wiki/Leakage_(machine_learning)?wprov=sfti1).
- [x] We honor the `train-validate-test` [split principle](https://en.wikipedia.org/wiki/Training,_validation,_and_test_sets). This means that the hindcast data is split into `train` and `validate`, whereas `test` is withheld.
- [x] We did not use `test` explicitly in training or implicitly in incrementally adjusting parameters.
- [x] We considered [cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)).
%% Cell type:markdown id: tags:
### Safeguards for Reproducibility
Notebook/code must be independently reproducible from scratch by the organizers (after the competition); if this is not possible, no prize will be awarded.
- [x] All training data is publicly available (no pre-trained private neural networks, as they are not reproducible for us)
- [x] Code is well documented, readable and reproducible.
- [x] Code to reproduce training and predictions is preferred to run within a day on the described architecture. If the training takes longer than a day, please justify why this is needed. Please do not submit training pipelines which take weeks to train.
%% Cell type:markdown id: tags:
# Todos to improve template
This is just a demo.
- [ ] use multiple predictor variables and two predicted variables
- [ ] for both `lead_time`s in one go
- [ ] consider seasonality, for now all `forecast_time` months are mixed
- [ ] make probabilistic predictions with `category` dim, for now works deterministic
%% Cell type:markdown id: tags:
# Imports
%% Cell type:code id: tags:
``` python
from tensorflow.keras.layers import Input, Dense, Flatten
from tensorflow.keras.models import Sequential
import matplotlib.pyplot as plt
import xarray as xr
xr.set_options(display_style='text')
import numpy as np
from dask.utils import format_bytes
import xskillscore as xs
```
%% Output
/opt/conda/lib/python3.8/site-packages/xarray/backends/cfgrib_.py:27: UserWarning: Failed to load cfgrib - most likely there is a problem accessing the ecCodes library. Try `import cfgrib` to get the full error message
warnings.warn(
%% Cell type:markdown id: tags:
# Get training data
preprocessing of input data may be done in a separate notebook/script
%% Cell type:markdown id: tags:
## Hindcast
get weekly initialized hindcasts
%% Cell type:code id: tags:
``` python
v='t2m'
```
%% Cell type:code id: tags:
``` python
# preprocessed as renku dataset
!renku storage pull ../data/ecmwf_hindcast-input_2000-2019_biweekly_deterministic.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
hind_2000_2019 = xr.open_zarr("../data/ecmwf_hindcast-input_2000-2019_biweekly_deterministic.zarr", consolidated=True)
```
%% Output
/opt/conda/lib/python3.8/site-packages/xarray/backends/plugins.py:61: RuntimeWarning: Engine 'cfgrib' loading failed:
/opt/conda/lib/python3.8/site-packages/gribapi/_bindings.cpython-38-x86_64-linux-gnu.so: undefined symbol: codes_bufr_key_is_header
warnings.warn(f"Engine {name!r} loading failed:\n{ex}", RuntimeWarning)
%% Cell type:code id: tags:
``` python
# preprocessed as renku dataset
!renku storage pull ../data/ecmwf_forecast-input_2020_biweekly_deterministic.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
fct_2020 = xr.open_zarr("../data/ecmwf_forecast-input_2020_biweekly_deterministic.zarr", consolidated=True)
```
%% Cell type:markdown id: tags:
## Observations
corresponding to hindcasts
%% Cell type:code id: tags:
``` python
# preprocessed as renku dataset
!renku storage pull ../data/hindcast-like-observations_2000-2019_biweekly_deterministic.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
obs_2000_2019 = xr.open_zarr("../data/hindcast-like-observations_2000-2019_biweekly_deterministic.zarr", consolidated=True)#[v]
```
%% Cell type:code id: tags:
``` python
# preprocessed as renku dataset
!renku storage pull ../data/forecast-like-observations_2020_biweekly_deterministic.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
obs_2020 = xr.open_zarr("../data/forecast-like-observations_2020_biweekly_deterministic.zarr", consolidated=True)#[v]
```
%% Cell type:markdown id: tags:
# ML model
%% Cell type:markdown id: tags:
based on [Weatherbench](https://github.com/pangeo-data/WeatherBench/blob/master/quickstart.ipynb)
%% Cell type:code id: tags:
``` python
# run once only and don't commit
!git clone https://github.com/pangeo-data/WeatherBench/
```
%% Output
fatal: destination path 'WeatherBench' already exists and is not an empty directory.
%% Cell type:code id: tags:
``` python
import sys
sys.path.insert(1, 'WeatherBench')
from WeatherBench.src.train_nn import DataGenerator, PeriodicConv2D, create_predictions
import tensorflow.keras as keras
```
%% Cell type:code id: tags:
``` python
bs=32
import numpy as np
class DataGenerator(keras.utils.Sequence):
    def __init__(self, fct, verif, lead_time, batch_size=bs, shuffle=True, load=True,
                 mean=None, std=None):
        """
        Data generator for WeatherBench data.
        Template from https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly
        Args:
            fct: forecasts from S2S models: xr.DataArray (xr.Dataset doesn't work properly)
            verif: observations with same dimensionality (xr.Dataset doesn't work properly)
            lead_time: lead_time as in model
            batch_size: batch size
            shuffle: bool. If True, data is shuffled.
            load: bool. If True, dataset is loaded into RAM.
            mean: If None, compute mean from data.
            std: If None, compute standard deviation from data.
        Todo:
            - use `number` in a better way, for now only the ensemble mean forecast is used
            - don't use .sel(lead_time=lead_time), to train over all lead_time at once
            - be sensitive with forecast_time, pool a few around the weekofyear given
            - use more variables as predictors
            - predict more variables
        """
        if isinstance(fct, xr.Dataset):
            print('convert fct to array')
            fct = fct.to_array().transpose(..., 'variable')
            self.fct_dataset = True
        else:
            self.fct_dataset = False
        if isinstance(verif, xr.Dataset):
            print('convert verif to array')
            verif = verif.to_array().transpose(..., 'variable')
            self.verif_dataset = True
        else:
            self.verif_dataset = False

        self.batch_size = batch_size
        self.shuffle = shuffle
        self.lead_time = lead_time

        self.fct_data = fct.transpose('forecast_time', ...).sel(lead_time=lead_time)
        self.fct_mean = self.fct_data.mean('forecast_time').compute() if mean is None else mean
        self.fct_std = self.fct_data.std('forecast_time').compute() if std is None else std

        self.verif_data = verif.transpose('forecast_time', ...).sel(lead_time=lead_time)
        self.verif_mean = self.verif_data.mean('forecast_time').compute() if mean is None else mean
        self.verif_std = self.verif_data.std('forecast_time').compute() if std is None else std

        # Normalize
        self.fct_data = (self.fct_data - self.fct_mean) / self.fct_std
        self.verif_data = (self.verif_data - self.verif_mean) / self.verif_std

        self.n_samples = self.fct_data.forecast_time.size
        self.forecast_time = self.fct_data.forecast_time

        self.on_epoch_end()
        # For some weird reason calling .load() earlier messes up the mean and std computations
        if load:
            # print('Loading data into RAM')
            self.fct_data.load()

    def __len__(self):
        'Denotes the number of batches per epoch'
        return int(np.ceil(self.n_samples / self.batch_size))

    def __getitem__(self, i):
        'Generate one batch of data'
        idxs = self.idxs[i * self.batch_size:(i + 1) * self.batch_size]
        # got all nan if nans not masked
        X = self.fct_data.isel(forecast_time=idxs).fillna(0.).values
        y = self.verif_data.isel(forecast_time=idxs).fillna(0.).values
        return X, y

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        self.idxs = np.arange(self.n_samples)
        if self.shuffle:
            np.random.shuffle(self.idxs)
```
%% Cell type:code id: tags:
``` python
# first of the 2 bi-weekly `lead_time`s: week 3-4
lead = hind_2000_2019.isel(lead_time=0).lead_time
lead
```
%% Output
<xarray.DataArray 'lead_time' ()>
array(1209600000000000, dtype='timedelta64[ns]')
Coordinates:
lead_time timedelta64[ns] 14 days
Attributes:
comment: lead_time describes bi-weekly aggregates. The pd.Timedelta corr...
aggregate: The pd.Timedelta corresponds to the first day of a biweek...
description: Forecast period is the time interval between the forecast...
long_name: lead time
standard_name: forecast_period
week34_t2m: mean[14 days, 27 days]
week34_tp: 28 days minus 14 days
week56_t2m: mean[28 days, 41 days]
week56_tp: 42 days minus 28 days
%% Cell type:code id: tags:
``` python
# mask, needed?
hind_2000_2019 = hind_2000_2019.where(obs_2000_2019.isel(forecast_time=0, lead_time=0,drop=True).notnull())
```
%% Cell type:markdown id: tags:
## data prep: train, valid, test
[Use the hindcast period to split train and valid.](https://en.wikipedia.org/wiki/Training,_validation,_and_test_sets) Do not use the 2020 data for testing!
%% Cell type:code id: tags:
``` python
# time is the forecast_time
time_train_start,time_train_end='2000','2017' # train
time_valid_start,time_valid_end='2018','2019' # valid
time_test = '2020' # test
```
%% Cell type:code id: tags:
``` python
dg_train = DataGenerator(
    hind_2000_2019.mean('realization').sel(forecast_time=slice(time_train_start,time_train_end))[v],
    obs_2000_2019.sel(forecast_time=slice(time_train_start,time_train_end))[v],
    lead_time=lead, batch_size=bs, load=True)
```
%% Output
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
%% Cell type:code id: tags:
``` python
dg_valid = DataGenerator(
    hind_2000_2019.mean('realization').sel(forecast_time=slice(time_valid_start,time_valid_end))[v],
    obs_2000_2019.sel(forecast_time=slice(time_valid_start,time_valid_end))[v],
    lead_time=lead, batch_size=bs, shuffle=False, load=True)
```
%% Output
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
%% Cell type:code id: tags:
``` python
# do not use, delete?
dg_test = DataGenerator(
    fct_2020.mean('realization').sel(forecast_time=time_test)[v],
    obs_2020.sel(forecast_time=time_test)[v],
    lead_time=lead, batch_size=bs, load=True, mean=dg_train.fct_mean, std=dg_train.fct_std, shuffle=False)
```
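%% Cell type:markdown id: tags:
Note that `dg_test` reuses `dg_train.fct_mean` and `dg_train.fct_std`: test inputs must be normalized with statistics computed on the training period only, otherwise information leaks from the test period. A toy illustration with plain arrays:
%% Cell type:code id: tags:
``` python
import numpy as np

train = np.array([1.0, 2.0, 3.0, 4.0])
test = np.array([10.0, 11.0])

mu, sigma = train.mean(), train.std()   # statistics from training data only
test_norm = (test - mu) / sigma         # never recompute mean/std on the test period
```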
%% Cell type:code id: tags:
``` python
X, y = dg_valid[0]
X.shape, y.shape
```
%% Output
((32, 121, 240), (32, 121, 240))
%% Cell type:code id: tags:
``` python
# short look into training data: large biases
# any problem from normalizing?
i=4
xr.DataArray(np.vstack([X[i],y[i]])).plot(yincrease=False, robust=True)
```
%% Output
<matplotlib.collections.QuadMesh at 0x7f3a7e44b730>
%% Cell type:markdown id: tags:
## `fit`
%% Cell type:code id: tags:
``` python
cnn = keras.models.Sequential([
    PeriodicConv2D(filters=32, kernel_size=5, conv_kwargs={'activation':'relu'}, input_shape=(32, 64, 1)),
    PeriodicConv2D(filters=1, kernel_size=5)
])
```
%% Output
WARNING:tensorflow:AutoGraph could not transform <bound method PeriodicPadding2D.call of <WeatherBench.src.train_nn.PeriodicPadding2D object at 0x7f86042986a0>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module 'gast' has no attribute 'Index'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
%% Cell type:code id: tags:
``` python
cnn.summary()
```
%% Output
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
periodic_conv2d (PeriodicCon (None, 32, 64, 32) 832
_________________________________________________________________
periodic_conv2d_1 (PeriodicC (None, 32, 64, 1) 801
=================================================================
Total params: 1,633
Trainable params: 1,633
Non-trainable params: 0
_________________________________________________________________
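The parameter counts in the summary can be checked by hand: each Conv2D-style layer has `kernel_h * kernel_w * in_channels * filters` weights plus one bias per filter, and the periodic padding itself adds no parameters. A quick sanity check:

``` python
# Conv2D parameters = kernel_h * kernel_w * in_channels * filters + filters (biases)
conv1 = 5 * 5 * 1 * 32 + 32   # first PeriodicConv2D: 1 input channel, 32 filters
conv2 = 5 * 5 * 32 * 1 + 1    # second PeriodicConv2D: 32 input channels, 1 filter
print(conv1, conv2, conv1 + conv2)  # 832 801 1633, matching the summary
```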
%% Cell type:code id: tags:
``` python
cnn.compile(keras.optimizers.Adam(1e-4), 'mse')
```
%% Cell type:code id: tags:
``` python
import warnings
warnings.simplefilter("ignore")
```
%% Cell type:code id: tags:
``` python
cnn.fit(dg_train, epochs=2, validation_data=dg_valid)
```
%% Output
Epoch 1/2
30/30 [==============================] - 58s 2s/step - loss: 0.1472 - val_loss: 0.0742
Epoch 2/2
30/30 [==============================] - 45s 1s/step - loss: 0.0712 - val_loss: 0.0545
<tensorflow.python.keras.callbacks.History at 0x7f865c2103d0>
%% Cell type:markdown id: tags:
## `predict`
Create predictions and print `mean(variable, lead_time, longitude, weighted latitude)` RPSS for all years as calculated by `skill_by_year`.
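`skill_by_year` (imported from `scripts` below) reports RPSS, which compares the ranked probability score (RPS) of the forecast against climatology. As a reminder of what is being scored, here is a minimal numpy sketch of the categorical RPS for tercile forecasts; `rps`, `prob_fct` and `obs_onehot` are illustrative names, not the `scripts` API:

``` python
import numpy as np

def rps(prob_fct, obs_onehot):
    """Ranked probability score over the last (category) axis."""
    cdf_fct = np.cumsum(prob_fct, axis=-1)    # cumulative forecast probabilities
    cdf_obs = np.cumsum(obs_onehot, axis=-1)  # cumulative observed indicator
    return np.sum((cdf_fct - cdf_obs) ** 2, axis=-1)

perfect = rps(np.array([0., 1., 0.]), np.array([0., 1., 0.]))  # 0.0
clim = rps(np.array([1/3, 1/3, 1/3]), np.array([0., 1., 0.]))  # 2/9
# RPSS = 1 - RPS_forecast / RPS_climatology, so RPSS > 0 beats climatology
```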
%% Cell type:code id: tags:
``` python
from scripts import add_valid_time_from_forecast_reference_time_and_lead_time

def _create_predictions(model, dg, lead):
    """Create non-iterative predictions and undo the input normalization."""
    preds = model.predict(dg).squeeze()
    # un-normalize back to physical units
    preds = preds * dg.fct_std.values + dg.fct_mean.values
    if dg.verif_dataset:
        da = xr.DataArray(
            preds,
            dims=['forecast_time', 'latitude', 'longitude', 'variable'],
            coords={'forecast_time': dg.fct_data.forecast_time,
                    'latitude': dg.fct_data.latitude,
                    'longitude': dg.fct_data.longitude},
        ).to_dataset()  # doesn't work yet
    else:
        da = xr.DataArray(
            preds,
            dims=['forecast_time', 'latitude', 'longitude'],
            coords={'forecast_time': dg.fct_data.forecast_time,
                    'latitude': dg.fct_data.latitude,
                    'longitude': dg.fct_data.longitude},
        )
    da = da.assign_coords(lead_time=lead)
    # da = add_valid_time_from_forecast_reference_time_and_lead_time(da)
    return da
```
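`_create_predictions` undoes the normalization that the `DataGenerator` applied to its inputs. A minimal round-trip check of that logic, with made-up `mean`/`std` values standing in for `dg_train.fct_mean` / `dg_train.fct_std`:

``` python
import numpy as np

# hypothetical mean/std standing in for dg_train.fct_mean / dg_train.fct_std
mean, std = 280.0, 15.0
x = np.array([250.0, 280.0, 310.0])

x_norm = (x - mean) / std       # normalization applied by the DataGenerator
x_back = x_norm * std + mean    # inverse applied in _create_predictions
assert np.allclose(x_back, x)
```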
%% Cell type:code id: tags:
``` python
# optionally masking the ocean when making probabilistic
mask = obs_2020.std(['lead_time','forecast_time']).notnull()
```
%% Cell type:code id: tags:
``` python
from scripts import make_probabilistic
```
%% Cell type:code id: tags:
``` python
!renku storage pull ../data/hindcast-like-observations_2000-2019_biweekly_tercile-edges.nc
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
cache_path='../data'
tercile_file = f'{cache_path}/hindcast-like-observations_2000-2019_biweekly_tercile-edges.nc'
tercile_edges = xr.open_dataset(tercile_file)
```
%% Cell type:code id: tags:
``` python
# note: the CNN was trained on a single lead_time; reusing it for every lead
# is crude but yields outputs with the expected dimensions.
# ideally, train one model per lead_time.
def create_predictions(cnn, fct, obs, time):
    preds_test = []
    for lead in fct.lead_time:
        dg = DataGenerator(fct.mean('realization').sel(forecast_time=time)[v],
                           obs.sel(forecast_time=time)[v],
                           lead_time=lead, batch_size=bs,
                           mean=dg_train.fct_mean, std=dg_train.fct_std,
                           shuffle=False)
        preds_test.append(_create_predictions(cnn, dg, lead))
    preds_test = xr.concat(preds_test, 'lead_time')
    preds_test['lead_time'] = fct.lead_time
    # add valid_time coord
    preds_test = add_valid_time_from_forecast_reference_time_and_lead_time(preds_test)
    preds_test = preds_test.to_dataset(name=v)
    # placeholder: duplicate t2m as tp so the submission contains both required variables
    preds_test['tp'] = preds_test['t2m']
    # make probabilistic
    preds_test = make_probabilistic(preds_test.expand_dims('realization'), tercile_edges, mask=mask)
    return preds_test
```
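`make_probabilistic` (imported from `scripts` above) bins the deterministic field into tercile-category probabilities using the tercile edges. A rough standalone sketch of the idea; `tercile_probs` is a hypothetical helper, not the `scripts` implementation, which additionally handles ensembles, masking, etc.:

``` python
import numpy as np

def tercile_probs(value, edges):
    """One-hot tercile 'probabilities' for a deterministic forecast.

    value: forecast value(s); edges: (2,) lower/upper tercile edges.
    A single deterministic member yields probabilities of 0 or 1; with an
    ensemble you would average these one-hot vectors over members.
    """
    below = value <= edges[0]
    above = value > edges[1]
    normal = ~(below | above)
    return np.stack([below, normal, above], axis=-1).astype(float)

tercile_probs(0.2, np.array([0.33, 0.66]))  # falls in the lower tercile
```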
%% Cell type:markdown id: tags:
### `predict` training period in-sample
%% Cell type:code id: tags:
``` python
!renku storage pull ../data/forecast-like-observations_2020_biweekly_terciled.nc
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
!renku storage pull ../data/hindcast-like-observations_2000-2019_biweekly_terciled.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
from scripts import skill_by_year
```
%% Cell type:code id: tags:
``` python
import os
if os.environ['HOME'] == '/home/jovyan':
    # assume on renku with small memory: loop over year chunks
    import pandas as pd
    step = 2
    skill_list = []
    for year in np.arange(int(time_train_start), int(time_train_end) - 1, step):
        preds_is = create_predictions(cnn, hind_2000_2019, obs_2000_2019,
                                      time=slice(str(year), str(year + step - 1))).compute()
        skill_list.append(skill_by_year(preds_is))
    skill = pd.concat(skill_list)
else:  # with larger memory, simply do
    preds_is = create_predictions(cnn, hind_2000_2019, obs_2000_2019,
                                  time=slice(time_train_start, time_train_end))
    skill = skill_by_year(preds_is)
skill
```
%% Output
RPSS
year
2000 -0.862483
2001 -1.015485
2002 -1.101022
2003 -1.032647
2004 -1.056348
2005 -1.165675
2006 -1.057217
2007 -1.170849
2008 -1.049785
2009 -1.169108
2010 -1.130845
2011 -1.052670
2012 -1.126449
2013 -1.126930
2014 -1.095896
2015 -1.117486
%% Cell type:code id: tags:
``` python
# not on renkulab, simply do
# preds_is = create_predictions(cnn, hind_2000_2019, obs_2000_2019, time=slice(time_train_start, time_train_end))
# skill_by_year(preds_is)
```
%% Cell type:markdown id: tags:
### `predict` validation period out-of-sample
%% Cell type:code id: tags:
``` python
preds_os = create_predictions(cnn, hind_2000_2019, obs_2000_2019, time=slice(time_valid_start, time_valid_end))
skill_by_year(preds_os)
```
%% Output
RPSS
year
2018 -1.099744
2019 -1.172401
%% Cell type:markdown id: tags:
### `predict` test
%% Cell type:code id: tags:
``` python
preds_test = create_predictions(cnn, fct_2020, obs_2020, time=time_test)
skill_by_year(preds_test)
```
%% Output
RPSS
year
2020 -1.076834
%% Cell type:markdown id: tags:
# Submission
%% Cell type:code id: tags:
``` python
from scripts import assert_predictions_2020
assert_predictions_2020(preds_test)
```
%% Cell type:code id: tags:
``` python
preds_test.to_netcdf('../submissions/ML_prediction_2020.nc')
```
%% Cell type:code id: tags:
``` python
# !git add ../submissions/ML_prediction_2020.nc
# !git add ML_train_and_prediction.ipynb
```
%% Cell type:code id: tags:
``` python
# !git commit -m "template_test commit message" # whatever message you want
```
%% Cell type:code id: tags:
``` python
# !git tag "submission-template_test-0.0.1" # if this is to be checked by scorer, only the last submitted==tagged version will be considered
```
%% Cell type:code id: tags:
``` python
# !git push --tags
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
# Reproducibility
%% Cell type:markdown id: tags:
## memory
%% Cell type:code id: tags:
``` python
# https://phoenixnap.com/kb/linux-commands-check-memory-usage
!free -g
```
%% Output
total used free shared buff/cache available
Mem: 31 7 11 0 12 24
Swap: 0 0 0
%% Cell type:markdown id: tags:
## CPU
%% Cell type:code id: tags:
``` python
!lscpu
```
%% Output
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 40 bits physical, 48 bits virtual
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 8
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel Xeon Processor (Skylake, IBRS)
Stepping: 4
CPU MHz: 2095.078
BogoMIPS: 4190.15
Virtualization: VT-x
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 256 KiB
L1i cache: 256 KiB
L2 cache: 32 MiB
L3 cache: 128 MiB
NUMA node0 CPU(s): 0-7
Vulnerability Itlb multihit: KVM: Mitigation: Split huge pages
Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cach
e flushes, SMT disabled
Vulnerability Mds: Vulnerable: Clear CPU buffers attempted, no mic
rocode; SMT Host state unknown
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user
pointer sanitization
Vulnerability Spectre v2: Mitigation; Full generic retpoline, IBPB condit
ional, IBRS_FW, STIBP disabled, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtr
r pge mca cmov pat pse36 clflush mmx fxsr sse s
se2 syscall nx pdpe1gb rdtscp lm constant_tsc r
ep_good nopl xtopology cpuid tsc_known_freq pni
pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_
2 x2apic movbe popcnt tsc_deadline_timer aes xs
ave avx f16c rdrand hypervisor lahf_lm abm 3dno
wprefetch cpuid_fault invpcid_single pti ibrs i
bpb tpr_shadow vnmi flexpriority ept vpid ept_a
d fsgsbase bmi1 avx2 smep bmi2 erms invpcid avx
512f avx512dq rdseed adx smap clwb avx512cd avx
512bw avx512vl xsaveopt xsavec xgetbv1 arat pku
ospke
%% Cell type:markdown id: tags:
## software
%% Cell type:code id: tags:
``` python
!conda list
```
%% Output
# packages in environment at /opt/conda:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
_pytorch_select 0.1 cpu_0 defaults
_tflow_select 2.3.0 mkl defaults
absl-py 0.13.0 py38h06a4308_0 defaults
aiobotocore 1.4.1 pyhd3eb1b0_0 defaults
aiohttp 3.7.4.post0 py38h7f8727e_2 defaults
aioitertools 0.7.1 pyhd3eb1b0_0 defaults
alembic 1.4.3 pyh9f0ad1d_0 conda-forge
ansiwrap 0.8.4 pypi_0 pypi
appdirs 1.4.4 pypi_0 pypi
argcomplete 1.12.3 pypi_0 pypi
argon2-cffi 20.1.0 py38h497a2fe_2 conda-forge
argparse 1.4.0 pypi_0 pypi
asciitree 0.3.3 py_2 defaults
astor 0.8.1 py38h06a4308_0 defaults
astunparse 1.6.3 py_0 defaults
async-timeout 3.0.1 pypi_0 pypi
async_generator 1.10 py_0 conda-forge
attrs 21.2.0 pypi_0 pypi
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.1 py_0 conda-forge
bagit 1.8.1 pypi_0 pypi
beautifulsoup4 4.10.0 pyh06a4308_0 defaults
binutils_impl_linux-64 2.35.1 h193b22a_1 conda-forge
binutils_linux-64 2.35 h67ddf6f_30 conda-forge
black 20.8b1 pypi_0 pypi
blas 1.0 mkl defaults
bleach 3.2.1 pyh9f0ad1d_0 conda-forge
blinker 1.4 py_1 conda-forge
bokeh 2.3.3 py38h06a4308_0 defaults
botocore 1.20.106 pyhd3eb1b0_0 defaults
bottleneck 1.3.2 py38heb32a55_1 defaults
bracex 2.1.1 pypi_0 pypi
branca 0.3.1 pypi_0 pypi
brotli 1.0.9 he6710b0_2 defaults
brotlipy 0.7.0 py38h497a2fe_1001 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.17.1 h36c2ea0_0 conda-forge
ca-certificates 2021.7.5 h06a4308_1 defaults
cachecontrol 0.12.6 pypi_0 pypi
cachetools 4.2.4 pypi_0 pypi
calamus 0.3.12 pypi_0 pypi
cdsapi 0.5.1 pypi_0 pypi
certifi 2021.5.30 pypi_0 pypi
certipy 0.1.3 py_0 conda-forge
cffi 1.14.6 pypi_0 pypi
cfgrib 0.9.9.0 pyhd8ed1ab_1 conda-forge
cftime 1.5.0 py38h6323ea4_0 defaults
chardet 3.0.4 pypi_0 pypi
click 7.1.2 pypi_0 pypi
click-completion 0.5.2 pypi_0 pypi
click-option-group 0.5.3 pypi_0 pypi
click-plugins 1.1.1 pypi_0 pypi
climetlab 0.8.31 pypi_0 pypi
climetlab-s2s-ai-challenge 0.8.0 pypi_0 pypi
cloudpickle 2.0.0 pyhd3eb1b0_0 defaults
colorama 0.4.4 pypi_0 pypi
coloredlogs 15.0.1 pypi_0 pypi
commonmark 0.9.1 pypi_0 pypi
conda 4.9.2 py38h578d9bd_0 conda-forge
conda-package-handling 1.7.2 py38h8df0ef7_0 conda-forge
configargparse 1.5.2 pypi_0 pypi
configurable-http-proxy 1.3.0 0 conda-forge
coverage 5.5 py38h27cfd23_2 defaults
cryptography 3.4.8 pypi_0 pypi
curl 7.71.1 he644dc0_8 conda-forge
cwlgen 0.4.2 pypi_0 pypi
cwltool 3.1.20211004060744 pypi_0 pypi
cycler 0.10.0 py38_0 defaults
cython 0.29.24 py38h295c915_0 defaults
cytoolz 0.11.0 py38h7b6447c_0 defaults
dask 2021.8.1 pyhd3eb1b0_0 defaults
dask-core 2021.8.1 pyhd3eb1b0_0 defaults
dataclasses 0.8 pyh6d0b6a4_7 defaults
decorator 4.4.2 py_0 conda-forge
defusedxml 0.6.0 py_0 conda-forge
distributed 2021.8.1 py38h06a4308_0 defaults
distro 1.5.0 pypi_0 pypi
docopt 0.6.2 py38h06a4308_0 defaults
eccodes 2.21.0 ha0e6eb6_0 conda-forge
ecmwf-api-client 1.6.1 pypi_0 pypi
ecmwflibs 0.3.14 pypi_0 pypi
entrypoints 0.3 pyhd8ed1ab_1003 conda-forge
environ-config 21.2.0 pypi_0 pypi
fasteners 0.16.3 pyhd3eb1b0_0 defaults
filelock 3.0.12 pypi_0 pypi
findlibs 0.0.2 pypi_0 pypi
fonttools 4.25.0 pyhd3eb1b0_0 defaults
freetype 2.10.4 h5ab3b9f_0 defaults
frozendict 2.0.6 pypi_0 pypi
fsspec 2021.7.0 pyhd3eb1b0_0 defaults
gast 0.4.0 pyhd3eb1b0_0 defaults
gcc_impl_linux-64 9.3.0 h70c0ae5_18 conda-forge
gcc_linux-64 9.3.0 hf25ea35_30 conda-forge
gitdb 4.0.7 pypi_0 pypi
gitpython 3.1.14 pypi_0 pypi
google-auth 1.33.0 pyhd3eb1b0_0 defaults
google-auth-oauthlib 0.4.4 pyhd3eb1b0_0 defaults
google-pasta 0.2.0 pyhd3eb1b0_0 defaults
grpcio 1.36.1 py38h2157cd5_1 defaults
gxx_impl_linux-64 9.3.0 hd87eabc_18 conda-forge
gxx_linux-64 9.3.0 h3fbe746_30 conda-forge
h5netcdf 0.11.0 pyhd8ed1ab_0 conda-forge
h5py 2.10.0 py38hd6299e0_1 defaults
hdf4 4.2.13 h3ca952b_2 defaults
hdf5 1.10.6 nompi_h6a2412b_1114 conda-forge
heapdict 1.0.1 pyhd3eb1b0_0 defaults
humanfriendly 10.0 pypi_0 pypi
humanize 3.7.1 pypi_0 pypi
icu 68.1 h58526e2_0 conda-forge
idna 2.10 pyh9f0ad1d_0 conda-forge
importlib-metadata 3.4.0 py38h578d9bd_0 conda-forge
importlib_metadata 3.4.0 hd8ed1ab_0 conda-forge
intake 0.6.3 pyhd3eb1b0_0 defaults
intake-xarray 0.5.0 pyhd3eb1b0_0 defaults
intel-openmp 2019.4 243 defaults
ipykernel 5.4.2 py38h81c977d_0 conda-forge
ipython 7.19.0 py38h81c977d_2 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
isodate 0.6.0 pypi_0 pypi
jasper 1.900.1 hd497a04_4 defaults
jedi 0.17.2 py38h578d9bd_1 conda-forge
jellyfish 0.8.8 pypi_0 pypi
jinja2 3.0.1 pypi_0 pypi
jmespath 0.10.0 pyhd3eb1b0_0 defaults
joblib 1.0.1 pyhd3eb1b0_0 defaults
jpeg 9d h7f8727e_0 defaults
json5 0.9.5 pyh9f0ad1d_0 conda-forge
jsonschema 3.2.0 py_2 conda-forge
jupyter-server-proxy 1.6.0 pypi_0 pypi
jupyter_client 6.1.11 pyhd8ed1ab_1 conda-forge
jupyter_core 4.7.0 py38h578d9bd_0 conda-forge
jupyter_telemetry 0.1.0 pyhd8ed1ab_1 conda-forge
jupyterhub 1.2.2 pypi_0 pypi
jupyterlab 2.2.9 py_0 conda-forge
jupyterlab-git 0.23.3 pypi_0 pypi
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge
jupyterlab_server 1.2.0 py_0 conda-forge
keras-preprocessing 1.1.2 pyhd3eb1b0_0 defaults
kernel-headers_linux-64 2.6.32 h77966d4_13 conda-forge
kiwisolver 1.3.1 py38h2531618_0 defaults
krb5 1.17.2 h926e7f8_0 conda-forge
lazy-object-proxy 1.6.0 pypi_0 pypi
lcms2 2.12 h3be6417_0 defaults
ld_impl_linux-64 2.35.1 hea4e1c9_1 conda-forge
libaec 1.0.4 he6710b0_1 defaults
libblas 3.9.0 1_h86c2bf4_netlib conda-forge
libcblas 3.9.0 5_h92ddd45_netlib conda-forge
libcurl 7.71.1 hcdd3856_8 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.3 h58526e2_2 conda-forge
libgcc-devel_linux-64 9.3.0 h7864c58_18 conda-forge
libgcc-ng 9.3.0 h2828fa1_18 conda-forge
libgfortran-ng 9.3.0 ha5ec8a7_17 defaults
libgfortran5 9.3.0 ha5ec8a7_17 defaults
libgomp 9.3.0 h2828fa1_18 conda-forge
liblapack 3.9.0 5_h92ddd45_netlib conda-forge
libllvm10 10.0.1 hbcb73fb_5 defaults
libmklml 2019.0.5 0 defaults
libnetcdf 4.7.4 nompi_h56d31a8_107 conda-forge
libnghttp2 1.41.0 h8cfc5f6_2 conda-forge
libpng 1.6.37 hbc83047_0 defaults
libprotobuf 3.17.2 h4ff587b_1 defaults
libsodium 1.0.18 h36c2ea0_1 conda-forge
libssh2 1.9.0 hab1572f_5 conda-forge
libstdcxx-devel_linux-64 9.3.0 hb016644_18 conda-forge
libstdcxx-ng 9.3.0 h6de172a_18 conda-forge
libtiff 4.2.0 h85742a9_0 defaults
libuv 1.40.0 h7f98852_0 conda-forge
libwebp-base 1.2.0 h27cfd23_0 defaults
llvmlite 0.36.0 py38h612dafd_4 defaults
locket 0.2.1 py38h06a4308_1 defaults
lockfile 0.12.2 pypi_0 pypi
lxml 4.6.3 pypi_0 pypi
lz4-c 1.9.3 h295c915_1 defaults
magics 1.5.6 pypi_0 pypi
mako 1.1.4 pyh44b312d_0 conda-forge
markdown 3.3.4 py38h06a4308_0 defaults
markupsafe 2.0.1 pypi_0 pypi
marshmallow 3.13.0 pypi_0 pypi
matplotlib-base 3.4.2 py38hab158f2_0 defaults
mistune 0.8.4 py38h497a2fe_1003 conda-forge
mkl 2020.2 256 defaults
mkl-service 2.3.0 py38he904b0f_0 defaults
mkl_fft 1.3.0 py38h54f3939_0 defaults
mkl_random 1.1.1 py38h0573a6f_0 defaults
msgpack-python 1.0.2 py38hff7bd54_1 defaults
multidict 5.1.0 py38h27cfd23_2 defaults
munkres 1.1.4 py_0 defaults
mypy-extensions 0.4.3 pypi_0 pypi
nbclient 0.5.0 pypi_0 pypi
nbconvert 6.0.7 py38h578d9bd_3 conda-forge
nbdime 2.1.0 pypi_0 pypi
nbformat 5.1.2 pyhd8ed1ab_1 conda-forge
nbresuse 0.4.0 pypi_0 pypi
nc-time-axis 1.3.1 pyhd8ed1ab_2 conda-forge
ncurses 6.2 h58526e2_4 conda-forge
ndg-httpsclient 0.5.1 pypi_0 pypi
nest-asyncio 1.4.3 pyhd8ed1ab_0 conda-forge
netcdf4 1.5.4 pypi_0 pypi
networkx 2.6.3 pypi_0 pypi
ninja 1.10.2 hff7bd54_1 defaults
nodejs 15.3.0 h25f6087_0 conda-forge
notebook 6.2.0 py38h578d9bd_0 conda-forge
numba 0.53.1 py38ha9443f7_0 defaults
numcodecs 0.8.0 py38h2531618_0 defaults
numexpr 2.7.3 py38hb2eb853_0 defaults
numpy 1.19.2 py38h54aff64_0 defaults
numpy-base 1.19.2 py38hfa32c7d_0 defaults
oauthlib 3.0.1 py_0 conda-forge
olefile 0.46 pyhd3eb1b0_0 defaults
openjpeg 2.4.0 h3ad879b_0 defaults
openssl 1.1.1l h7f8727e_0 defaults
opt_einsum 3.3.0 pyhd3eb1b0_1 defaults
owlrl 5.2.3 pypi_0 pypi
packaging 20.8 pyhd3deb0d_0 conda-forge
pamela 1.0.0 py_0 conda-forge
pandas 1.3.2 py38h8c16a72_0 defaults
pandoc 2.11.3.2 h7f98852_0 conda-forge
pandocfilters 1.4.2 py_1 conda-forge
papermill 2.3.1 pypi_0 pypi
parso 0.7.1 pyh9f0ad1d_0 conda-forge
partd 1.2.0 pyhd3eb1b0_0 defaults
pathspec 0.9.0 pypi_0 pypi
patool 1.12 pypi_0 pypi
pdbufr 0.9.0 pypi_0 pypi
pexpect 4.8.0 pyh9f0ad1d_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 8.3.1 py38h2c7a002_0 defaults
pip 21.0.1 pypi_0 pypi
pipx 0.16.1.0 pypi_0 pypi
pluggy 0.13.1 pypi_0 pypi
portalocker 2.3.2 pypi_0 pypi
powerline-shell 0.7.0 pypi_0 pypi
prometheus_client 0.9.0 pyhd3deb0d_0 conda-forge
prompt-toolkit 3.0.10 pyha770c72_0 conda-forge
properscoring 0.1 py_0 conda-forge
protobuf 3.17.2 py38h295c915_0 defaults
prov 1.5.1 pypi_0 pypi
psutil 5.8.0 py38h27cfd23_1 defaults
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pyasn1 0.4.8 pyhd3eb1b0_0 defaults
pyasn1-modules 0.2.8 py_0 defaults
pycosat 0.6.3 py38h497a2fe_1006 conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
pycurl 7.43.0.6 py38h996a351_1 conda-forge
pydap 3.2.2 pyh9f0ad1d_1001 conda-forge
pydot 1.4.2 pypi_0 pypi
pygments 2.10.0 pypi_0 pypi
pyjwt 2.1.0 pypi_0 pypi
pyld 2.0.3 pypi_0 pypi
pyodc 1.1.1 pypi_0 pypi
pyopenssl 20.0.1 pyhd8ed1ab_0 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyrsistent 0.17.3 py38h497a2fe_2 conda-forge
pyshacl 0.17.0.post1 pypi_0 pypi
pysocks 1.7.1 py38h578d9bd_3 conda-forge
python 3.8.6 hffdb5ce_4_cpython conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python-eccodes 2021.03.0 py38hb5d20a5_1 conda-forge
python-editor 1.0.4 pypi_0 pypi
python-flatbuffers 1.12 pyhd3eb1b0_0 defaults
python-json-logger 2.0.1 pyh9f0ad1d_0 conda-forge
python-snappy 0.6.0 py38h2531618_3 defaults
python_abi 3.8 1_cp38 conda-forge
pytorch 1.8.1 cpu_py38h60491be_0 defaults
pytz 2021.1 pyhd3eb1b0_0 defaults
pyyaml 5.4.1 pypi_0 pypi
pyzmq 21.0.1 py38h3d7ac18_0 conda-forge
rdflib 6.0.1 pypi_0 pypi
rdflib-jsonld 0.5.0 pypi_0 pypi
readline 8.0 he28a2e2_2 conda-forge
regex 2021.4.4 pypi_0 pypi
renku 0.16.2 pypi_0 pypi
requests 2.24.0 pypi_0 pypi
requests-oauthlib 1.3.0 py_0 defaults
rich 10.3.0 pypi_0 pypi
rsa 4.7.2 pyhd3eb1b0_1 defaults
ruamel-yaml 0.16.5 pypi_0 pypi
ruamel.yaml.clib 0.2.2 py38h497a2fe_2 conda-forge
ruamel_yaml 0.15.80 py38h497a2fe_1003 conda-forge
s3fs 2021.7.0 pyhd3eb1b0_0 defaults
schema-salad 8.2.20210918131710 pypi_0 pypi
scikit-learn 0.24.2 py38ha9443f7_0 defaults
scipy 1.7.0 py38h7b17777_1 conda-forge
send2trash 1.5.0 py_0 conda-forge
setuptools 58.2.0 pypi_0 pypi
setuptools-scm 6.0.1 pypi_0 pypi
shellescape 3.8.1 pypi_0 pypi
shellingham 1.4.0 pypi_0 pypi
simpervisor 0.4 pypi_0 pypi
six 1.16.0 pypi_0 pypi
smmap 4.0.0 pypi_0 pypi
snappy 1.1.8 he6710b0_0 defaults
sortedcontainers 2.4.0 pyhd3eb1b0_0 defaults
soupsieve 2.2.1 pyhd3eb1b0_0 defaults
sqlalchemy 1.3.22 py38h497a2fe_1 conda-forge
sqlite 3.34.0 h74cdb3f_0 conda-forge
sysroot_linux-64 2.12 h77966d4_13 conda-forge
tabulate 0.8.9 pypi_0 pypi
tbb 2020.3 hfd86e86_0 defaults
tblib 1.7.0 pyhd3eb1b0_0 defaults
tenacity 7.0.0 pypi_0 pypi
tensorboard 2.4.0 pyhc547734_0 defaults
tensorboard-plugin-wit 1.6.0 py_0 defaults
tensorflow 2.4.1 mkl_py38hb2083e0_0 defaults
tensorflow-base 2.4.1 mkl_py38h43e0292_0 defaults
tensorflow-estimator 2.6.0 pyh7b7c402_0 defaults
termcolor 1.1.0 py38h06a4308_1 defaults
terminado 0.9.2 py38h578d9bd_0 conda-forge
testpath 0.4.4 py_0 conda-forge
textwrap3 0.9.2 pypi_0 pypi
threadpoolctl 2.2.0 pyh0d69192_0 defaults
tini 0.18.0 h14c3975_1001 conda-forge
tk 8.6.10 h21135ba_1 conda-forge
toml 0.10.2 pypi_0 pypi
toolz 0.11.1 pyhd3eb1b0_0 defaults
tornado 6.1 py38h497a2fe_1 conda-forge
tqdm 4.60.0 pypi_0 pypi
traitlets 5.0.5 py_0 conda-forge
typed-ast 1.4.2 pypi_0 pypi
typing-extensions 3.7.4.3 pypi_0 pypi
typing_extensions 3.10.0.2 pyh06a4308_0 defaults
urllib3 1.25.11 pypi_0 pypi
userpath 1.4.2 pypi_0 pypi
wcmatch 8.2 pypi_0 pypi
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webencodings 0.5.1 py_1 conda-forge
webob 1.8.7 pyhd3eb1b0_0 defaults
werkzeug 2.0.1 pyhd3eb1b0_0 defaults
wheel 0.36.2 pyhd3deb0d_0 conda-forge
wrapt 1.12.1 py38h7b6447c_1 defaults
xarray 0.19.0 pyhd3eb1b0_1 defaults
xhistogram 0.3.0 pyhd8ed1ab_0 conda-forge
xskillscore 0.0.23 pyhd8ed1ab_0 conda-forge
xz 5.2.5 h516909a_1 conda-forge
yagup 0.1.1 pypi_0 pypi
yaml 0.2.5 h516909a_0 conda-forge
yarl 1.6.3 py38h27cfd23_0 defaults
zarr 2.8.1 pyhd3eb1b0_0 defaults
zeromq 4.3.3 h58526e2_3 conda-forge
zict 2.0.0 pyhd3eb1b0_0 defaults
zipp 3.4.0 py_0 conda-forge
zlib 1.2.11 h516909a_1010 conda-forge
zstd 1.4.9 haebb681_0 defaults
%% Cell type:code id: tags:
``` python
```
plugins:
source:
- module: intake_xarray
sources:
training-input:
description: climetlab name in AI/ML community naming for hindcasts used as input to the ML model in the training period
driver: netcdf
parameters:
model:
description: name of the S2S model
type: str
default: ecmwf
allowed: [ecmwf, eccc, ncep]
param:
description: variable name
type: str
default: tp
allowed: [t2m, ci, gh, lsm, msl, q, rsn, sm100, sm20, sp, sst, st100, st20, t, tcc, tcw, ttr, tp, v, u]
date:
description: initialization dates, weekly on Thursdays
type: datetime
default: 2020.01.02
min: 2020.01.02
max: 2020.12.31
version:
description: versioning of the data
type: str
default: 0.3.0
format:
description: data type
type: str
default: netcdf
allowed: [netcdf, grib]
ending:
description: data format compatible with format; netcdf -> nc, grib -> grib
type: str
default: nc
allowed: [nc, grib]
xarray_kwargs:
engine: h5netcdf
args: # add simplecache:: for caching: https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally
urlpath: https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/training-input/{{version}}/{{format}}/{{model}}-hindcast-{{param}}-{{date.strftime("%Y%m%d")}}.{{ending}}
test-input:
description: climetlab name in AI/ML community naming for 2020 forecasts used as input to the ML model in the test period 2020
driver: netcdf
parameters:
model:
description: name of the S2S model
type: str
default: ecmwf
allowed: [ecmwf, eccc, ncep]
param:
description: variable name
type: str
default: tp
allowed: [t2m, ci, gh, lsm, msl, q, rsn, sm100, sm20, sp, sst, st100, st20, t, tcc, tcw, ttr, tp, v, u]
date:
description: initialization dates, weekly on Thursdays
type: datetime
default: 2020.01.02
min: 2020.01.02
max: 2020.12.31
version:
description: versioning of the data
type: str
default: 0.3.0
format:
description: data type
type: str
default: netcdf
allowed: [netcdf, grib]
ending:
description: data format compatible with format; netcdf -> nc, grib -> grib
type: str
default: nc
allowed: [nc, grib]
xarray_kwargs:
engine: h5netcdf
args: # add simplecache:: for caching: https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally
urlpath: https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/test-input/{{version}}/{{format}}/{{model}}-forecast-{{param}}-{{date.strftime("%Y%m%d")}}.{{ending}}
training-output-reference:
description: climetlab name in AI/ML community naming for 2020 forecasts as output reference to compare ML model output to in the training period
driver: netcdf
parameters:
param:
description: variable name
type: str
default: tp
allowed: [t2m, tp]
date:
description: initialization dates, weekly on Thursdays
type: datetime
default: 2020.01.02
min: 2020.01.02
max: 2020.12.31
xarray_kwargs:
engine: h5netcdf
args: # add simplecache:: for caching: https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally
urlpath: https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/test-output-reference/{{param}}-{{date.strftime("%Y%m%d")}}.nc
test-output-reference:
description: climetlab name in AI/ML community naming for 2020 forecasts as output reference to compare ML model output to in the test period 2020
driver: netcdf
parameters:
param:
description: variable name
type: str
default: tp
allowed: [t2m, tp]
date:
description: initialization dates, weekly on Thursdays
type: datetime
default: 2020.01.02
min: 2020.01.02
max: 2020.12.31
xarray_kwargs:
engine: h5netcdf
args: # add simplecache:: for caching: https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally
urlpath: https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/test-output-reference/{{param}}-{{date.strftime("%Y%m%d")}}.nc
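The `urlpath` entries above are Jinja-style templates that intake fills in from the listed parameters. A pure-python re-creation of the expansion for the `training-input` source, using `str.format` as a stand-in for intake's actual templating:

``` python
from datetime import datetime

# str.format stand-in for the {{...}} placeholders in the catalog's urlpath
template = ("https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/"
            "data/training-input/{version}/{format}/"
            "{model}-hindcast-{param}-{date:%Y%m%d}.{ending}")

url = template.format(version="0.3.0", format="netcdf", model="ecmwf",
                      param="t2m", date=datetime(2020, 1, 2), ending="nc")
print(url)  # ends with .../ecmwf-hindcast-t2m-20200102.nc
```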
# Data Access
- European Weather Cloud:
- [`climetlab-s2s-ai-challenge`](https://github.com/ecmwf-lab/climetlab-s2s-ai-challenge)
- `wget`: wget_curl.ipynb
- `curl`: wget_curl.ipynb
- `mouse`: wget_curl.ipynb
- `intake`: intake.ipynb
- [IRI Data Library](https://iridl.ldeo.columbia.edu/): IRIDL.ipynb
- S2S: http://iridl.ldeo.columbia.edu/SOURCES/.ECMWF/.S2S/ (restricted access explained in IRIDL.ipynb)
- SubX: http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/
- NMME: http://iridl.ldeo.columbia.edu/SOURCES/.Models/.NMME/
- s2sprediction.net
plugins:
source:
- module: intake_xarray
sources:
training-input:
description: S2S hindcasts from IRIDL regridded to 1.5 deg grid and aggregated by mean over lead, https://iridl.ldeo.columbia.edu/SOURCES/.ECMWF/.S2S/overview.html
driver: opendap
parameters:
center:
description: name of the center issuing the hindcast
type: str
default: ECMF
allowed: [BOM, CNRM, ECCC, ECMF, HMCR, ISAC, JMA, KMA, NCEP, UKMO]
grid:
description: regrid to this global resolution
type: float
default: 1.5
lead_name:
description: name of the lead_time dimension
type: str
default: LA
allowed: [LA, L]
lead_start:
description: aggregation start lead passed to RANGEEDGES
type: int
default: 14
lead_end:
description: aggregation end lead passed to RANGEEDGES
type: int
default: 27
experiment_type:
description: type of experiment
type: str
default: perturbed
allowed: [control, perturbed, RMMS]
group:
description: group of variables
type: str
default: 2m_above_ground
#allowed: [2m_above_ground, ...] see https://iridl.ldeo.columbia.edu/SOURCES/.ECMWF/.S2S/.ECMF/.reforecast/.perturbed/
param:
description: variable name
type: str
default: 2t
#allowed: [2t] see https://iridl.ldeo.columbia.edu/SOURCES/.ECMWF/.S2S/.ECMF/.reforecast/.perturbed/
xarray_kwargs:
engine: netcdf4
args:
urlpath: http://iridl.ldeo.columbia.edu/SOURCES/.ECMWF/.S2S/.{{center}}/.reforecast/.{{experiment_type}}/.{{group}}/{{param}}/{{lead_name}}/({{lead_start}})/({{lead_end}})/RANGEEDGES/[{{lead_name}}]average/X/0/{{grid}}/358.5/GRID/Y/90/{{grid}}/-90/GRID/dods
test-input:
description: S2S forecasts from IRIDL regridded to 1.5 deg grid and aggregated by mean over lead, https://iridl.ldeo.columbia.edu/SOURCES/.ECMWF/.S2S/overview.html
driver: opendap
parameters:
center:
description: name of the center issuing the hindcast
type: str
default: ECMF
allowed: [BOM, CNRM, ECCC, ECMF, HMCR, ISAC, JMA, KMA, NCEP, UKMO]
grid:
description: regrid to this global resolution
type: float
default: 1.5
lead_name:
description: name of the lead_time dimension
type: str
default: LA
allowed: [LA, L, L1]
lead_start:
description: aggregation start lead passed to RANGEEDGES
type: int
default: 14
lead_end:
description: aggregation end lead passed to RANGEEDGES
type: int
default: 27
experiment_type:
description: type of experiment
type: str
default: perturbed
allowed: [control, perturbed, RMMS]
group:
description: group of variables
type: str
default: 2m_above_ground
#allowed: see https://iridl.ldeo.columbia.edu/SOURCES/.ECMWF/.S2S/.ECMF/.reforecast/.perturbed/
param:
description: variable name
type: str
default: 2t
#allowed: [2t] see https://iridl.ldeo.columbia.edu/SOURCES/.ECMWF/.S2S/.ECMF/.reforecast/.perturbed/
xarray_kwargs:
engine: netcdf4
args:
urlpath: http://iridl.ldeo.columbia.edu/SOURCES/.ECMWF/.S2S/.{{center}}/.forecast/.{{experiment_type}}/.{{group}}/{{param}}/S/(0000%201%20Jan%202020)/(0000%2031%20Dec%202020)/RANGEEDGES/{{lead_name}}/({{lead_start}})/({{lead_end}})/RANGEEDGES/[{{lead_name}}]average/X/0/{{grid}}/358.5/GRID/Y/90/{{grid}}/-90/GRID/dods
plugins:
source:
- module: intake_xarray
sources:
training-input:
description: SubX hindcasts from IRIDL regridded to 1.5 deg grid and aggregated by mean over lead, http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/outline.html
driver: opendap
parameters:
center:
description: name of the center issuing the hindcast
type: str
default: EMC
allowed: [CESM, ECCC, EMC, ESRL, GMAO, NCEP, NRL, RSMAS]
model:
description: name of the model
type: str
default: GEFS
allowed: [30LCESM1, 46LCESM1, GEM, GEPS6, GEPS5, GEFS, GEFSv12, FIMr1p1, GEOS_V2p1, CFSv2, NESM, CCSM4]
grid:
description: regrid to this global resolution
type: float
default: 1.5
lead_start:
description: aggregation start lead passed to RANGEEDGES
type: int
default: 14
lead_end:
description: aggregation end lead passed to RANGEEDGES
type: int
default: 27
param:
description: variable name
type: str
default: pr
#allowed: [pr]
xarray_kwargs:
engine: netcdf4
args:
urlpath: http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/.{{center}}/.{{model}}/.hindcast/.{{param}}/L/({{lead_start}})/({{lead_end}})/RANGEEDGES/[L]average/X/0/{{grid}}/358.5/GRID/Y/90/{{grid}}/-90/GRID/dods
test-input:
description: SubX forecasts from IRIDL regridded to 1.5 deg grid and aggregated by mean over lead, http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/outline.html
driver: opendap
parameters:
center:
description: name of the center issuing the forecast
type: str
default: EMC
allowed: [CESM, ECCC, EMC, ESRL, GMAO, NCEP, NRL, RSMAS]
model:
description: name of the model
type: str
default: GEFS
allowed: [30LCESM1, 46LCESM1, GEM, GEPS6, GEPS5, GEFS, GEFSv12, FIMr1p1, GEOS_V2p1, CFSv2, NESM, CCSM4]
grid:
description: regrid to this global resolution
type: float
default: 1.5
lead_start:
description: aggregation start lead passed to RANGEEDGES
type: int
default: 14
lead_end:
description: aggregation end lead passed to RANGEEDGES
type: int
default: 27
param:
description: variable name, see http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/outline.html
type: str
default: pr
#allowed: [pr]
xarray_kwargs:
engine: netcdf4
args:
urlpath: http://iridl.ldeo.columbia.edu/SOURCES/.Models/.SubX/.{{center}}/.{{model}}/.forecast/.{{param}}/S/(0000%201%20Jan%202020)/(0000%2031%20Dec%202020)/RANGEEDGES/L/({{lead_start}})/({{lead_end}})/RANGEEDGES/[L]average/X/0/{{grid}}/358.5/GRID/Y/90/{{grid}}/-90/GRID/dods
%% Cell type:markdown id: tags:
# Data Access from EWC via `intake`
The data is easily available via `climetlab`: https://github.com/ecmwf-lab/climetlab-s2s-ai-challenge
The data holdings are listed here: https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/test-input/0.3.0/netcdf/index.html
Since the files live on S3, they are also accessible with `intake-xarray` and cachable with `fsspec`.
%% Cell type:code id: tags:
``` python
import intake
import fsspec
import xarray as xr
import os, glob
import pandas as pd
xr.set_options(display_style='text')
```
%% Output
/opt/conda/lib/python3.8/site-packages/xarray/backends/cfgrib_.py:27: UserWarning: Failed to load cfgrib - most likely there is a problem accessing the ecCodes library. Try `import cfgrib` to get the full error message
warnings.warn(
<xarray.core.options.set_options at 0x7fa0100dcdc0>
%% Cell type:code id: tags:
``` python
# prevent aiohttp timeout errors
from aiohttp import ClientSession, ClientTimeout
timeout = ClientTimeout(total=600)
fsspec.config.conf['https'] = dict(client_kwargs={'timeout': timeout})
```
%% Cell type:markdown id: tags:
# intake
https://github.com/intake/intake-xarray can read and cache `grib` and `netcdf` from catalogs.
Caching via `fsspec`: https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally
%% Cell type:code id: tags:
``` python
import intake_xarray
cache_path = '/work/s2s-ai-challenge-template/data/cache'
fsspec.config.conf['simplecache'] = {'cache_storage': cache_path, 'same_names':True}
```
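%% Cell type:markdown id: tags:
To see what `simplecache` does without touching the network, here is a minimal sketch that caches a local `file://` URL; the temporary paths below are made up for the example, and a remote `https://` URL behaves the same way once prefixed with `simplecache::`.
%% Cell type:code id: tags:
``` python
import os
import tempfile

import fsspec

# stand-in "remote" file, local so the sketch runs offline
src_dir = tempfile.mkdtemp()
demo_cache = tempfile.mkdtemp()
src = os.path.join(src_dir, 'demo.nc.txt')
with open(src, 'w') as f:
    f.write('hello s2s')

# simplecache:: copies the file on first access and reuses the local copy afterwards
with fsspec.open(f'simplecache::file://{src}', 'r',
                 simplecache={'cache_storage': demo_cache, 'same_names': True}) as f:
    content = f.read()

print(content)                 # the file content, read through the cache
print(os.listdir(demo_cache))  # the cached copy, kept under its original name
```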
%% Cell type:code id: tags:
``` python
%%writefile EWC_catalog.yml
plugins:
source:
- module: intake_xarray
sources:
training-input:
description: climetlab name in AI/ML community naming for hindcasts as input to the ML model in the training period
driver: netcdf
parameters:
model:
description: name of the S2S model
type: str
default: ecmwf
allowed: [ecmwf, eccc, ncep]
param:
description: variable name
type: str
default: tp
allowed: [t2m, ci, gh, lsm, msl, q, rsn, sm100, sm20, sp, sst, st100, st20, t, tcc, tcw, ttr, tp, v, u]
date:
description: weekly initialization dates (Thursdays)
type: datetime
default: 2020.01.02
min: 2020.01.02
max: 2020.12.31
version:
description: versioning of the data
type: str
default: 0.3.0
format:
description: file format
type: str
default: netcdf
allowed: [netcdf, grib]
ending:
description: file ending matching format; netcdf -> nc, grib -> grib
type: str
default: nc
allowed: [nc, grib]
xarray_kwargs:
engine: h5netcdf
args: # add simplecache:: for caching: https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally
urlpath: https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/training-input/{{version}}/{{format}}/{{model}}-hindcast-{{param}}-{{date.strftime("%Y%m%d")}}.{{ending}}
test-input:
description: climetlab name in AI/ML community naming for 2020 forecasts as input to the ML model in the test period 2020
driver: netcdf
parameters:
model:
description: name of the S2S model
type: str
default: ecmwf
allowed: [ecmwf, eccc, ncep]
param:
description: variable name
type: str
default: tp
allowed: [t2m, ci, gh, lsm, msl, q, rsn, sm100, sm20, sp, sst, st100, st20, t, tcc, tcw, ttr, tp, v, u]
date:
description: weekly initialization dates (Thursdays)
type: datetime
default: 2020.01.02
min: 2020.01.02
max: 2020.12.31
version:
description: versioning of the data
type: str
default: 0.3.0
format:
description: file format
type: str
default: netcdf
allowed: [netcdf, grib]
ending:
description: file ending matching format; netcdf -> nc, grib -> grib
type: str
default: nc
allowed: [nc, grib]
xarray_kwargs:
engine: h5netcdf
args: # add simplecache:: for caching: https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally
urlpath: https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/test-input/{{version}}/{{format}}/{{model}}-forecast-{{param}}-{{date.strftime("%Y%m%d")}}.{{ending}}
training-output-reference:
description: climetlab name in AI/ML community naming for observations as output reference to compare the ML model output to in the training period
driver: netcdf
parameters:
param:
description: variable name
type: str
default: tp
allowed: [t2m, ci, gh, lsm, msl, q, rsn, sm100, sm20, sp, sst, st100, st20, t, tcc, tcw, ttr, tp, v, u]
date:
description: weekly initialization dates (Thursdays)
type: datetime
default: 2020.01.02
min: 2020.01.02
max: 2020.12.31
xarray_kwargs:
engine: h5netcdf
args: # add simplecache:: for caching: https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally
urlpath: https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/training-output-reference/{{param}}-{{date.strftime("%Y%m%d")}}.nc
test-output-reference:
description: climetlab name in AI/ML community naming for observations as output reference to compare the ML model output to in the test period 2020
driver: netcdf
parameters:
param:
description: variable name
type: str
default: tp
allowed: [t2m, ci, gh, lsm, msl, q, rsn, sm100, sm20, sp, sst, st100, st20, t, tcc, tcw, ttr, tp, v, u]
date:
description: weekly initialization dates (Thursdays)
type: datetime
default: 2020.01.02
min: 2020.01.02
max: 2020.12.31
xarray_kwargs:
engine: h5netcdf
args: # add simplecache:: for caching: https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally
urlpath: https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/test-output-reference/{{param}}-{{date.strftime("%Y%m%d")}}.nc
```
%% Output
Writing EWC_catalog.yml
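%% Cell type:markdown id: tags:
The `urlpath` entries in the catalog are Jinja-style templates that `intake` fills with the parameter values (the defaults unless overridden). As a sketch, the `training-input` path expands like this plain-Python substitution; this mimics the resulting URL only, not intake's actual Jinja2 machinery:
%% Cell type:code id: tags:
``` python
from datetime import datetime

# sketch of the catalog's training-input urlpath template (values are the catalog defaults)
template = ('https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/'
            'training-input/{version}/{format}/{model}-hindcast-{param}-{date}.{ending}')

url = template.format(version='0.3.0', format='netcdf', model='ecmwf', param='tp',
                      date=datetime(2020, 1, 2).strftime('%Y%m%d'), ending='nc')
print(url)
```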
%% Cell type:code id: tags:
``` python
cat = intake.open_catalog('EWC_catalog.yml')
```
%% Cell type:code id: tags:
``` python
# dates for 2020 forecasts and their on-the-fly reforecasts
dates=pd.date_range(start='2020-01-02',freq='7D',end='2020-12-31')
dates
```
%% Output
DatetimeIndex(['2020-01-02', '2020-01-09', '2020-01-16', '2020-01-23',
'2020-01-30', '2020-02-06', '2020-02-13', '2020-02-20',
'2020-02-27', '2020-03-05', '2020-03-12', '2020-03-19',
'2020-03-26', '2020-04-02', '2020-04-09', '2020-04-16',
'2020-04-23', '2020-04-30', '2020-05-07', '2020-05-14',
'2020-05-21', '2020-05-28', '2020-06-04', '2020-06-11',
'2020-06-18', '2020-06-25', '2020-07-02', '2020-07-09',
'2020-07-16', '2020-07-23', '2020-07-30', '2020-08-06',
'2020-08-13', '2020-08-20', '2020-08-27', '2020-09-03',
'2020-09-10', '2020-09-17', '2020-09-24', '2020-10-01',
'2020-10-08', '2020-10-15', '2020-10-22', '2020-10-29',
'2020-11-05', '2020-11-12', '2020-11-19', '2020-11-26',
'2020-12-03', '2020-12-10', '2020-12-17', '2020-12-24',
'2020-12-31'],
dtype='datetime64[ns]', freq='7D')
%% Cell type:markdown id: tags:
# `hindcast-input`
on-the-fly hindcasts corresponding to the 2020 forecasts
%% Cell type:code id: tags:
``` python
cat['training-input'](date=dates[10], param='tp', model='eccc').to_dask()
```
%% Output
/opt/conda/lib/python3.8/site-packages/xarray/backends/plugins.py:61: RuntimeWarning: Engine 'cfgrib' loading failed:
/opt/conda/lib/python3.8/site-packages/gribapi/_bindings.cpython-38-x86_64-linux-gnu.so: undefined symbol: codes_bufr_key_is_header
warnings.warn(f"Engine {name!r} loading failed:\n{ex}", RuntimeWarning)
<xarray.Dataset>
Dimensions: (forecast_time: 20, latitude: 121, lead_time: 32, longitude: 240, realization: 4)
Coordinates:
* realization (realization) int64 0 1 2 3
* forecast_time (forecast_time) datetime64[ns] 1998-03-12 ... 2017-03-12
* lead_time (lead_time) timedelta64[ns] 1 days 2 days ... 31 days 32 days
* latitude (latitude) float64 90.0 88.5 87.0 85.5 ... -87.0 -88.5 -90.0
* longitude (longitude) float64 0.0 1.5 3.0 4.5 ... 355.5 357.0 358.5
valid_time (forecast_time, lead_time) datetime64[ns] ...
Data variables:
tp (realization, forecast_time, lead_time, latitude, longitude) float32 ...
Attributes:
GRIB_edition: [2]
GRIB_centre: cwao
GRIB_centreDescription: Canadian Meteorological Service - Montreal
GRIB_subCentre: [0]
Conventions: CF-1.7
institution: Canadian Meteorological Service - Montreal
history: 2021-05-11T10:03 GRIB to CDM+CF via cfgrib-0.9.9...
%% Cell type:markdown id: tags:
# `forecast-input`
real-time forecasts for 2020
%% Cell type:code id: tags:
``` python
cat['test-input'](date=dates[10], param='t2m', model='ecmwf').to_dask()
```
%% Output
<xarray.Dataset>
Dimensions: (forecast_time: 1, latitude: 121, lead_time: 46, longitude: 240, realization: 51)
Coordinates:
* realization (realization) int64 0 1 2 3 4 5 6 7 ... 44 45 46 47 48 49 50
* forecast_time (forecast_time) datetime64[ns] 2020-03-12
* lead_time (lead_time) timedelta64[ns] 1 days 2 days ... 45 days 46 days
* latitude (latitude) float64 90.0 88.5 87.0 85.5 ... -87.0 -88.5 -90.0
* longitude (longitude) float64 0.0 1.5 3.0 4.5 ... 355.5 357.0 358.5
valid_time (forecast_time, lead_time) datetime64[ns] ...
Data variables:
t2m (realization, forecast_time, lead_time, latitude, longitude) float32 ...
Attributes:
GRIB_edition: [2]
GRIB_centre: ecmf
GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts
GRIB_subCentre: [0]
Conventions: CF-1.7
institution: European Centre for Medium-Range Weather Forecasts
history: 2021-05-10T16:14:36 GRIB to CDM+CF via cfgrib-0....
%% Cell type:markdown id: tags:
# `hindcast-like-observations`
observations matching hindcasts
%% Cell type:code id: tags:
``` python
cat['training-output-reference'](date=dates[10], param='t2m').to_dask()
```
%% Output
<xarray.Dataset>
Dimensions: (forecast_time: 1, latitude: 121, lead_time: 47, longitude: 240)
Coordinates:
valid_time (lead_time, forecast_time) datetime64[ns] ...
* latitude (latitude) float64 90.0 88.5 87.0 85.5 ... -87.0 -88.5 -90.0
* longitude (longitude) float64 0.0 1.5 3.0 4.5 ... 355.5 357.0 358.5
* forecast_time (forecast_time) datetime64[ns] 2020-03-12
* lead_time (lead_time) timedelta64[ns] 0 days 1 days ... 45 days 46 days
Data variables:
t2m (lead_time, forecast_time, latitude, longitude) float32 ...
Attributes:
source_dataset_name: temperature daily from NOAA NCEP CPC: Climate Predi...
source_hosting: IRIDL
source_url: http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCEP/...
created_by_software: climetlab-s2s-ai-challenge
created_by_script: tools/observations/makefile
%% Cell type:markdown id: tags:
# `forecast-like-observations`
observations matching 2020 forecasts
%% Cell type:code id: tags:
``` python
cat['test-output-reference'](date=dates[10], param='t2m').to_dask()
```
%% Output
<xarray.Dataset>
Dimensions: (forecast_time: 1, latitude: 121, lead_time: 47, longitude: 240)
Coordinates:
valid_time (lead_time, forecast_time) datetime64[ns] ...
* latitude (latitude) float64 90.0 88.5 87.0 85.5 ... -87.0 -88.5 -90.0
* longitude (longitude) float64 0.0 1.5 3.0 4.5 ... 355.5 357.0 358.5
* forecast_time (forecast_time) datetime64[ns] 2020-03-12
* lead_time (lead_time) timedelta64[ns] 0 days 1 days ... 45 days 46 days
Data variables:
t2m (lead_time, forecast_time, latitude, longitude) float32 ...
Attributes:
source_dataset_name: temperature daily from NOAA NCEP CPC: Climate Predi...
source_hosting: IRIDL
source_url: http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCEP/...
created_by_software: climetlab-s2s-ai-challenge
created_by_script: tools/observations/makefile
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
# Data Access via `curl` or `wget`
The data is easily available via `climetlab`: https://github.com/ecmwf-lab/climetlab-s2s-ai-challenge
The data holdings are listed here:
- https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/test-input/0.3.0/netcdf/index.html
- https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/training-input/0.3.0/netcdf/index.html
- https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/test-output-reference/index.html
- https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/training-output-reference/index.html
Since the files live on S3, they are also accessible with `curl` or `wget`. Alternatively, open the HTML index pages above and download files by mouse click.
%% Cell type:code id: tags:
``` python
import xarray as xr
import os
from subprocess import call
xr.set_options(display_style='text')
```
%% Output
/opt/conda/lib/python3.8/site-packages/xarray/backends/cfgrib_.py:27: UserWarning: Failed to load cfgrib - most likely there is a problem accessing the ecCodes library. Try `import cfgrib` to get the full error message
warnings.warn(
<xarray.core.options.set_options at 0x7f5170570520>
%% Cell type:code id: tags:
``` python
# version of the EWC data
version = '0.3.0'
```
%% Cell type:markdown id: tags:
# `hindcast-input`
on-the-fly hindcasts corresponding to the 2020 forecasts
%% Cell type:code id: tags:
``` python
parameter = 't2m'
date = '20200102'
model = 'ecmwf'
```
%% Cell type:code id: tags:
``` python
url = f'https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/training-input/{version}/netcdf/{model}-hindcast-{parameter}-{date}.nc'
os.system(f'wget {url}')
assert os.path.exists(f'{model}-hindcast-{parameter}-{date}.nc')
```
%% Cell type:markdown id: tags:
# `forecast-input`
real-time forecasts for 2020
%% Cell type:code id: tags:
``` python
url = f'https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/test-input/{version}/netcdf/{model}-forecast-{parameter}-{date}.nc'
os.system(f'wget {url}')
assert os.path.exists(f'{model}-forecast-{parameter}-{date}.nc')
```
%% Cell type:markdown id: tags:
# `hindcast-like-observations`
CPC observations formatted like training period hindcasts
%% Cell type:code id: tags:
``` python
url = f'https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/training-output-reference/{parameter}-{date}.nc'
os.system(f'wget {url}')
assert os.path.exists(f'{parameter}-{date}.nc')
```
%% Cell type:markdown id: tags:
# `forecast-like-observations`
CPC observations formatted like test period 2020 forecasts
%% Cell type:code id: tags:
``` python
url = f'https://storage.ecmwf.europeanweather.cloud/s2s-ai-challenge/data/test-output-reference/{parameter}-{date}.nc'
os.system(f'wget {url}')
assert os.path.exists(f'{parameter}-{date}.nc')
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
# Train ML model to correct predictions of week 3-4 & 5-6
This notebook creates a Machine Learning model (`ML_model`) to predict weeks 3-4 & 5-6 based on `S2S` week 3-4 & 5-6 forecasts; the predictions are compared to `CPC` observations for the [`s2s-ai-challenge`](https://s2s-ai-challenge.github.io/).
%% Cell type:markdown id: tags:
# Synopsis
%% Cell type:markdown id: tags:
## Method: `mean bias reduction`
- calculate the mean bias over 2000-2019 from the deterministic ensemble-mean hindcast
- remove that mean bias from the 2020 deterministic ensemble-mean forecast
- no Machine Learning used here
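%% Cell type:markdown id: tags:
The arithmetic of the method with made-up numbers (in the notebook the bias is additionally computed per grid point and grouped by week of year):
%% Cell type:code id: tags:
``` python
import numpy as np

# toy hindcast: 3 ensemble members x 4 forecast dates; observations are a constant 10
hind = np.full((3, 4), 12.0)  # the toy model is consistently 2 units too warm
obs = np.full(4, 10.0)

# 1) mean bias of the deterministic ensemble-mean hindcast over the training period
bias = (hind.mean(axis=0) - obs).mean()

# 2) remove that bias from a (toy) ensemble-mean forecast
fct = np.array([11.0, 13.0])
debiased = fct - bias

print(bias)      # 2.0
print(debiased)  # [ 9. 11.]
```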
%% Cell type:markdown id: tags:
## Data used
type: renku datasets
Training-input for Machine Learning model:
- hindcasts of models:
- ECMWF: `ecmwf_hindcast-input_2000-2019_biweekly_deterministic.zarr`
Forecast-input for Machine Learning model:
- real-time 2020 forecasts of models:
- ECMWF: `ecmwf_forecast-input_2020_biweekly_deterministic.zarr`
Compare the Machine Learning model forecast against ground truth:
- `CPC` observations:
- `hindcast-like-observations_biweekly_deterministic.zarr`
- `forecast-like-observations_2020_biweekly_deterministic.zarr`
%% Cell type:markdown id: tags:
## Resources used
for training, details in reproducibility
- platform: MPI-M supercomputer, 1 node
- memory: 64 GB
- processors: 36 CPU
- storage required: 10 GB
%% Cell type:markdown id: tags:
## Safeguards
All points have to be [x] checked. If not, your submission is invalid.
Changes to the code after submissions are not possible, as the `commit` before the `tag` will be reviewed.
(Only in exceptional cases, and where previous effort towards reproducibility is evident, may improvements to readability and reproducibility be allowed after November 1st 2021.)
%% Cell type:markdown id: tags:
### Safeguards to prevent [overfitting](https://en.wikipedia.org/wiki/Overfitting?wprov=sfti1)
If the organizers suspect overfitting, your contribution can be disqualified.
- [x] We didn't use 2020 observations in training (explicit overfitting and cheating)
- [x] We didn't repeatedly verify our model on 2020 observations and incrementally improve our RPSS (implicit overfitting)
- [x] We provide RPSS scores for the training period with the script `skill_by_year`, see the `predict` section.
- [x] We tried our best to prevent [data leakage](https://en.wikipedia.org/wiki/Leakage_(machine_learning)?wprov=sfti1).
- [x] We honor the `train-validate-test` [split principle](https://en.wikipedia.org/wiki/Training,_validation,_and_test_sets). This means that the hindcast data is split into `train` and `validate`, whereas `test` is withheld.
- [x] We did not use `test` explicitly in training or implicitly in incrementally adjusting parameters.
- [x] We considered [cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)).
%% Cell type:markdown id: tags:
### Safeguards for Reproducibility
The notebook/code must be independently reproducible from scratch by the organizers (after the competition); if that is not possible, no prize will be awarded.
- [x] All training data is publicly available (no pre-trained private neural networks, as they are not reproducible for us)
- [x] Code is well documented, readable and reproducible.
- [x] Code to reproduce training and predictions is preferred to run within a day on the described architecture. If the training takes longer than a day, please justify why this is needed. Please do not submit training pipelines which take weeks to train.
%% Cell type:markdown id: tags:
# Imports
%% Cell type:code id: tags:
``` python
import xarray as xr
xr.set_options(display_style='text')
import numpy as np
from dask.utils import format_bytes
import xskillscore as xs
```
%% Output
<xarray.core.options.set_options at 0x7f05cc486340>
%% Cell type:markdown id: tags:
# Get training data
Preprocessing of the input data may be done in a separate notebook/script.
%% Cell type:markdown id: tags:
## Hindcast
get weekly initialized hindcasts
%% Cell type:code id: tags:
``` python
# preprocessed as renku dataset
!renku storage pull ../data/ecmwf_hindcast-input_2000-2019_biweekly_deterministic.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.
%% Cell type:code id: tags:
``` python
hind_2000_2019 = xr.open_zarr("../data/ecmwf_hindcast-input_2000-2019_biweekly_deterministic.zarr", consolidated=True)
```
%% Cell type:code id: tags:
``` python
# preprocessed as renku dataset
!renku storage pull ../data/ecmwf_forecast-input_2020_biweekly_deterministic.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.
%% Cell type:code id: tags:
``` python
fct_2020 = xr.open_zarr("../data/ecmwf_forecast-input_2020_biweekly_deterministic.zarr", consolidated=True)
```
%% Cell type:markdown id: tags:
## Observations
corresponding to hindcasts
%% Cell type:code id: tags:
``` python
# preprocessed as renku dataset
!renku storage pull ../data/hindcast-like-observations_2000-2019_biweekly_deterministic.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.
%% Cell type:code id: tags:
``` python
obs_2000_2019 = xr.open_zarr("../data/hindcast-like-observations_2000-2019_biweekly_deterministic.zarr", consolidated=True)
```
%% Cell type:code id: tags:
``` python
# preprocessed as renku dataset
!renku storage pull ../data/forecast-like-observations_2020_biweekly_deterministic.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.
%% Cell type:code id: tags:
``` python
obs_2020 = xr.open_zarr("../data/forecast-like-observations_2020_biweekly_deterministic.zarr", consolidated=True)
```
%% Cell type:markdown id: tags:
# no ML model
%% Cell type:markdown id: tags:
Here, we just remove the mean bias from the ensemble mean forecast.
%% Cell type:code id: tags:
``` python
from scripts import add_year_week_coords
obs_2000_2019 = add_year_week_coords(obs_2000_2019)
hind_2000_2019 = add_year_week_coords(hind_2000_2019)
```
%% Cell type:code id: tags:
``` python
bias_2000_2019 = (hind_2000_2019.mean('realization') - obs_2000_2019).groupby('week').mean().compute()
```
%% Output
/work/mh0727/m300524/conda-envs/s2s-ai/lib/python3.7/site-packages/xarray/core/accessor_dt.py:381: FutureWarning: dt.weekofyear and dt.week have been deprecated. Please use dt.isocalendar().week instead.
  FutureWarning,
/work/mh0727/m300524/conda-envs/s2s-ai/lib/python3.7/site-packages/dask/array/numpy_compat.py:40: RuntimeWarning: invalid value encountered in true_divide
  x = np.divide(x1, x2, out)
%% Cell type:markdown id: tags:
## `predict`
Create predictions and print `mean(variable, lead_time, longitude, weighted latitude)` RPS for all years as calculated by `skill_by_year` (for now RPS; todo: change to RPSS).
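%% Cell type:markdown id: tags:
For intuition, the ranked probability score of a single forecast is the sum over ordered categories of the squared differences between the cumulative forecast probabilities and the cumulative observed indicator. `skill_by_year` computes this over whole fields; the made-up scalar sketch below only illustrates the formula.
%% Cell type:code id: tags:
``` python
import numpy as np

def rps(prob_forecast, obs_category):
    """Ranked probability score of one forecast over ordered categories.

    prob_forecast: probability per category (sums to 1)
    obs_category: index of the category that verified
    """
    prob_forecast = np.asarray(prob_forecast, dtype=float)
    obs = np.zeros_like(prob_forecast)
    obs[obs_category] = 1.0
    return float(np.sum((np.cumsum(prob_forecast) - np.cumsum(obs)) ** 2))

# three tercile categories: below / near / above normal
print(rps([1 / 3, 1 / 3, 1 / 3], 0))  # climatological forecast when 'below' verifies
print(rps([1.0, 0.0, 0.0], 0))        # perfect forecast scores 0.0
```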
%% Cell type:code id: tags:
``` python
from scripts import make_probabilistic
```
%% Cell type:code id: tags:
``` python
!renku storage pull ../data/hindcast-like-observations_2000-2019_biweekly_tercile-edges.nc
```
%% Output
Warning: Run CLI commands only from project's root directory.
%% Cell type:code id: tags:
``` python
cache_path='../data'
tercile_file = f'{cache_path}/hindcast-like-observations_2000-2019_biweekly_tercile-edges.nc'
tercile_edges = xr.open_dataset(tercile_file)
```
%% Cell type:code id: tags:
``` python
# not a real ML model, but the results have the expected dimensions
# ideally, train for each lead_time separately
def create_predictions(fct, bias):
    if 'week' not in fct.coords:
        fct = add_year_week_coords(fct)
    # remove the weekly mean bias, then convert to tercile probabilities
    preds = fct - bias.sel(week=fct.week)
    preds = make_probabilistic(preds, tercile_edges)
    return preds.astype('float32')
```
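%% Cell type:markdown id: tags:
`make_probabilistic` (from `scripts`) turns the debiased ensemble into tercile-category probabilities using the tercile-edge file. Its internals are not shown here; conceptually it counts the fraction of ensemble members per category, as in this made-up sketch:
%% Cell type:code id: tags:
``` python
import numpy as np

def tercile_probs(members, edges):
    """Fraction of ensemble members below, between, and above two tercile edges."""
    members = np.asarray(members, dtype=float)
    below = np.mean(members < edges[0])
    above = np.mean(members > edges[1])
    return np.array([below, 1.0 - below - above, above])

# 4 made-up members with tercile edges at 0 and 1
print(tercile_probs([-0.5, 0.2, 0.4, 1.5], edges=(0.0, 1.0)))  # [0.25 0.5  0.25]
```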
%% Cell type:markdown id: tags:
### `predict` training period in-sample
%% Cell type:code id: tags:
``` python
!renku storage pull ../data/forecast-like-observations_2020_biweekly_terciled.nc
```
%% Output
Warning: Run CLI commands only from project's root directory.
%% Cell type:code id: tags:
``` python
!renku storage pull ../data/hindcast-like-observations_2000-2019_biweekly_terciled.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.
%% Cell type:code id: tags:
``` python
preds_is = create_predictions(hind_2000_2019, bias_2000_2019).compute()
```
%% Output
/work/mh0727/m300524/conda-envs/s2s-ai/lib/python3.7/site-packages/xarray/core/accessor_dt.py:381: FutureWarning: dt.weekofyear and dt.week have been deprecated. Please use dt.isocalendar().week instead.
  FutureWarning,
/work/mh0727/m300524/conda-envs/s2s-ai/lib/python3.7/site-packages/xarray/core/accessor_dt.py:381: FutureWarning: dt.weekofyear and dt.week have been deprecated. Please use dt.isocalendar().week instead.
  FutureWarning,
%% Cell type:code id: tags:
``` python
from scripts import skill_by_year
```
%% Cell type:code id: tags:
``` python
skill_by_year(preds_is)
```
%% Output
RPS
year
2000 0.463290
2001 0.501615
2002 0.498100
2003 0.499914
2004 0.533146
2005 0.486682
2006 0.492787
2007 0.555934
2008 0.507756
2009 0.515228
2010 0.498032
2011 0.548217
2012 0.556501
2013 0.519008
2014 0.521487
2015 0.507068
2016 0.520476
2017 0.590591
2018 0.604847
2019 0.546725
%% Cell type:markdown id: tags:
### `predict` test
%% Cell type:code id: tags:
``` python
preds_test = create_predictions(fct_2020, bias_2000_2019)
```
%% Output
/work/mh0727/m300524/conda-envs/s2s-ai/lib/python3.7/site-packages/xarray/core/accessor_dt.py:381: FutureWarning: dt.weekofyear and dt.week have been deprecated. Please use dt.isocalendar().week instead.
FutureWarning,
/work/mh0727/m300524/conda-envs/s2s-ai/lib/python3.7/site-packages/xarray/core/accessor_dt.py:381: FutureWarning: dt.weekofyear and dt.week have been deprecated. Please use dt.isocalendar().week instead.
FutureWarning,
%% Cell type:code id: tags:
``` python
skill_by_year(preds_test)
```
%% Output
RPS
year
2020 0.520714
%% Cell type:markdown id: tags:
# Submission
%% Cell type:code id: tags:
``` python
from scripts import assert_predictions_2020
assert_predictions_2020(preds_test)
```
%% Cell type:code id: tags:
``` python
del preds_test['weekofyear']
preds_test.attrs = {'author': 'Aaron Spring', 'author_email': 'aaron.spring@mpimet.mpg.de',
'comment': 'created for the s2s-ai-challenge as a template for the website',
'notebook': 'mean_bias_reduction.ipynb',
'website': 'https://s2s-ai-challenge.github.io/#evaluation'}
html_repr = xr.core.formatting_html.dataset_repr(preds_test)
with open('submission_template_repr.html', 'w') as myFile:
myFile.write(html_repr)
```
%% Cell type:code id: tags:
``` python
preds_test.to_netcdf('../submissions/ML_prediction_2020.nc')
```
%% Cell type:code id: tags:
``` python
# !git add ../submissions/ML_prediction_2020.nc
# !git add mean_bias_reduction.ipynb
```
%% Cell type:code id: tags:
``` python
#!git commit -m "template_test no ML mean bias reduction" # whatever message you want
```
%% Cell type:code id: tags:
``` python
#!git tag "submission-no_ML_mean_bias_reduction-0.0.2" # if this is to be checked by scorer, only the last submitted==tagged version will be considered
```
%% Cell type:code id: tags:
``` python
#!git push --tags
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
# Reproducibility
%% Cell type:markdown id: tags:
## memory
%% Cell type:code id: tags:
``` python
# https://phoenixnap.com/kb/linux-commands-check-memory-usage
!free -g
```
%% Output
total used free shared buffers cached
Mem: 62 21 41 0 0 5
-/+ buffers/cache: 15 47
Swap: 0 0 0
%% Cell type:markdown id: tags:
## CPU
%% Cell type:code id: tags:
``` python
!lscpu
```
%% Output
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 72
On-line CPU(s) list: 0-71
Thread(s) per core: 2
Core(s) per socket: 18
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2695 v4 @ 2.10GHz
Stepping: 1
CPU MHz: 1200.000
BogoMIPS: 4190.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 46080K
NUMA node0 CPU(s): 0-17,36-53
NUMA node1 CPU(s): 18-35,54-71
%% Cell type:markdown id: tags:
## software
%% Cell type:code id: tags:
``` python
!conda list
```
%% Output
# packages in environment at /opt/conda:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
_tflow_select 2.3.0 mkl defaults
absl-py 0.12.0 py38h06a4308_0 defaults
aiobotocore 1.2.2 pyhd3eb1b0_0 defaults
aiohttp 3.7.4.post0 pypi_0 pypi
aioitertools 0.7.1 pyhd3eb1b0_0 defaults
alembic 1.4.3 pyh9f0ad1d_0 conda-forge
ansiwrap 0.8.4 pypi_0 pypi
appdirs 1.4.4 pypi_0 pypi
argcomplete 1.12.2 pypi_0 pypi
argon2-cffi 20.1.0 py38h497a2fe_2 conda-forge
argparse 1.4.0 pypi_0 pypi
asciitree 0.3.3 py_2 defaults
astunparse 1.6.3 py_0 defaults
async-timeout 3.0.1 pypi_0 pypi
async_generator 1.10 py_0 conda-forge
attrs 20.3.0 pyhd3deb0d_0 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.1 py_0 conda-forge
binutils_impl_linux-64 2.35.1 h193b22a_1 conda-forge
binutils_linux-64 2.35 h67ddf6f_30 conda-forge
black 20.8b1 pypi_0 pypi
blas 1.0 mkl defaults
bleach 3.2.1 pyh9f0ad1d_0 conda-forge
blinker 1.4 py_1 conda-forge
bokeh 2.3.2 py38h06a4308_0 defaults
botocore 1.20.78 pyhd3eb1b0_1 defaults
bottleneck 1.3.2 py38heb32a55_1 defaults
branca 0.3.1 pypi_0 pypi
brotlipy 0.7.0 py38h497a2fe_1001 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.17.1 h36c2ea0_0 conda-forge
ca-certificates 2021.4.13 h06a4308_1 defaults
cachetools 4.2.2 pyhd3eb1b0_0 defaults
cdsapi 0.5.1 pypi_0 pypi
certifi 2020.12.5 py38h06a4308_0 defaults
certipy 0.1.3 py_0 conda-forge
cffi 1.14.4 py38ha65f79e_1 conda-forge
cfgrib 0.9.9.0 pyhd8ed1ab_1 conda-forge
cftime 1.5.0 py38h6323ea4_0 defaults
chardet 4.0.0 py38h578d9bd_1 conda-forge
click 7.1.2 pypi_0 pypi
climetlab 0.7.0 pypi_0 pypi
climetlab-s2s-ai-challenge 0.6.2 pypi_0 pypi
cloudpickle 1.6.0 py_0 defaults
colorama 0.4.4 pypi_0 pypi
conda 4.9.2 py38h578d9bd_0 conda-forge
conda-package-handling 1.7.2 py38h8df0ef7_0 conda-forge
configargparse 1.4.1 pypi_0 pypi
configurable-http-proxy 1.3.0 0 conda-forge
coverage 5.5 py38h27cfd23_2 defaults
cryptography 3.3.1 py38h2b97feb_1 conda-forge
curl 7.71.1 he644dc0_8 conda-forge
cycler 0.10.0 py38_0 defaults
cython 0.29.23 py38h2531618_0 defaults
cytoolz 0.11.0 py38h7b6447c_0 defaults
dask 2021.4.0 pyhd3eb1b0_0 defaults
dask-core 2021.4.0 pyhd3eb1b0_0 defaults
decorator 4.4.2 py_0 conda-forge
defusedxml 0.6.0 py_0 conda-forge
distributed 2021.5.0 py38h06a4308_0 defaults
distro 1.5.0 pypi_0 pypi
eccodes 2.18.0 hf05d9b7_0 conda-forge
ecmwf-api-client 1.6.1 pypi_0 pypi
ecmwflibs 0.3.7 pypi_0 pypi
entrypoints 0.3 pyhd8ed1ab_1003 conda-forge
fasteners 0.16 pyhd3eb1b0_0 defaults
findlibs 0.0.2 pypi_0 pypi
folium 0.12.1 pypi_0 pypi
freetype 2.10.4 h5ab3b9f_0 defaults
fsspec 0.9.0 pyhd3eb1b0_0 defaults
gast 0.4.0 py_0 defaults
gcc_impl_linux-64 9.3.0 h70c0ae5_18 conda-forge
gcc_linux-64 9.3.0 hf25ea35_30 conda-forge
gitdb 4.0.7 pypi_0 pypi
gitpython 3.1.14 pypi_0 pypi
google-auth 1.30.1 pyhd3eb1b0_0 defaults
google-auth-oauthlib 0.4.4 pyhd3eb1b0_0 defaults
google-pasta 0.2.0 py_0 defaults
grpcio 1.36.1 py38h2157cd5_1 defaults
gxx_impl_linux-64 9.3.0 hd87eabc_18 conda-forge
gxx_linux-64 9.3.0 h3fbe746_30 conda-forge
h5py 2.10.0 py38hd6299e0_1 defaults
hdf4 4.2.13 h3ca952b_2 defaults
hdf5 1.10.6 nompi_h3c11f04_101 conda-forge
heapdict 1.0.1 py_0 defaults
icu 68.1 h58526e2_0 conda-forge
idna 2.10 pyh9f0ad1d_0 conda-forge
importlib-metadata 3.4.0 py38h578d9bd_0 conda-forge
importlib_metadata 3.4.0 hd8ed1ab_0 conda-forge
intel-openmp 2021.2.0 h06a4308_610 defaults
ipykernel 5.4.2 py38h81c977d_0 conda-forge
ipython 7.19.0 py38h81c977d_2 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
jasper 1.900.1 hd497a04_4 defaults
jedi 0.17.2 py38h578d9bd_1 conda-forge
jinja2 2.11.2 pyh9f0ad1d_0 conda-forge
jmespath 0.10.0 py_0 defaults
joblib 1.0.1 pyhd3eb1b0_0 defaults
jpeg 9d h36c2ea0_0 conda-forge
json5 0.9.5 pyh9f0ad1d_0 conda-forge
jsonschema 3.2.0 py_2 conda-forge
jupyter-server-proxy 1.6.0 pypi_0 pypi
jupyter_client 6.1.11 pyhd8ed1ab_1 conda-forge
jupyter_core 4.7.0 py38h578d9bd_0 conda-forge
jupyter_telemetry 0.1.0 pyhd8ed1ab_1 conda-forge
jupyterhub 1.2.2 pypi_0 pypi
jupyterlab 2.2.9 py_0 conda-forge
jupyterlab-git 0.23.3 pypi_0 pypi
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge
jupyterlab_server 1.2.0 py_0 conda-forge
keras-preprocessing 1.1.2 pyhd3eb1b0_0 defaults
kernel-headers_linux-64 2.6.32 h77966d4_13 conda-forge
kiwisolver 1.3.1 py38h2531618_0 defaults
krb5 1.17.2 h926e7f8_0 conda-forge
lcms2 2.12 h3be6417_0 defaults
ld_impl_linux-64 2.35.1 hea4e1c9_1 conda-forge
libaec 1.0.4 he6710b0_1 defaults
libcurl 7.71.1 hcdd3856_8 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.3 h58526e2_2 conda-forge
libgcc-devel_linux-64 9.3.0 h7864c58_18 conda-forge
libgcc-ng 9.3.0 h2828fa1_18 conda-forge
libgfortran-ng 7.3.0 hdf63c60_0 defaults
libgomp 9.3.0 h2828fa1_18 conda-forge
libllvm10 10.0.1 hbcb73fb_5 defaults
libnetcdf 4.7.4 nompi_h56d31a8_107 conda-forge
libnghttp2 1.41.0 h8cfc5f6_2 conda-forge
libpng 1.6.37 hbc83047_0 defaults
libprotobuf 3.14.0 h8c45485_0 defaults
libsodium 1.0.18 h36c2ea0_1 conda-forge
libssh2 1.9.0 hab1572f_5 conda-forge
libstdcxx-devel_linux-64 9.3.0 hb016644_18 conda-forge
libstdcxx-ng 9.3.0 h6de172a_18 conda-forge
libtiff 4.1.0 h2733197_1 defaults
libuv 1.40.0 h7f98852_0 conda-forge
llvmlite 0.36.0 py38h612dafd_4 defaults
locket 0.2.1 py38h06a4308_1 defaults
lz4-c 1.9.3 h2531618_0 defaults
magics 1.5.6 pypi_0 pypi
mako 1.1.4 pyh44b312d_0 conda-forge
markdown 3.3.4 py38h06a4308_0 defaults
markupsafe 1.1.1 py38h497a2fe_3 conda-forge
matplotlib-base 3.3.4 py38h62a2d02_0 defaults
mistune 0.8.4 py38h497a2fe_1003 conda-forge
mkl 2021.2.0 h06a4308_296 defaults
mkl-service 2.3.0 py38h27cfd23_1 defaults
mkl_fft 1.3.0 py38h42c9631_2 defaults
mkl_random 1.2.1 py38ha9443f7_2 defaults
monotonic 1.5 py_0 defaults
msgpack-python 1.0.2 py38hff7bd54_1 defaults
multidict 5.1.0 py38h27cfd23_2 defaults
mypy-extensions 0.4.3 pypi_0 pypi
nbclient 0.5.0 pypi_0 pypi
nbconvert 6.0.7 py38h578d9bd_3 conda-forge
nbdime 2.1.0 pypi_0 pypi
nbformat 5.1.2 pyhd8ed1ab_1 conda-forge
nbresuse 0.4.0 pypi_0 pypi
ncurses 6.2 h58526e2_4 conda-forge
nest-asyncio 1.4.3 pyhd8ed1ab_0 conda-forge
netcdf4 1.5.6 pypi_0 pypi
nodejs 15.3.0 h25f6087_0 conda-forge
notebook 6.2.0 py38h578d9bd_0 conda-forge
numba 0.53.1 py38ha9443f7_0 defaults
numcodecs 0.7.3 py38h2531618_0 defaults
numpy 1.20.2 py38h2d18471_0 defaults
numpy-base 1.20.2 py38hfae3a4d_0 defaults
oauthlib 3.0.1 py_0 conda-forge
olefile 0.46 py_0 defaults
openssl 1.1.1k h27cfd23_0 defaults
opt_einsum 3.3.0 pyhd3eb1b0_1 defaults
packaging 20.8 pyhd3deb0d_0 conda-forge
pamela 1.0.0 py_0 conda-forge
pandas 1.2.4 py38h2531618_0 defaults
pandoc 2.11.3.2 h7f98852_0 conda-forge
pandocfilters 1.4.2 py_1 conda-forge
papermill 2.3.1 pypi_0 pypi
parso 0.7.1 pyh9f0ad1d_0 conda-forge
partd 1.2.0 pyhd3eb1b0_0 defaults
pathspec 0.8.1 pypi_0 pypi
pdbufr 0.8.2 pypi_0 pypi
pexpect 4.8.0 pyh9f0ad1d_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 8.2.0 py38he98fc37_0 defaults
pip 21.0.1 pypi_0 pypi
pipx 0.16.1.0 pypi_0 pypi
powerline-shell 0.7.0 pypi_0 pypi
prometheus_client 0.9.0 pyhd3deb0d_0 conda-forge
prompt-toolkit 3.0.10 pyha770c72_0 conda-forge
properscoring 0.1 py_0 conda-forge
protobuf 3.14.0 py38h2531618_1 defaults
psutil 5.8.0 py38h27cfd23_1 defaults
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pyasn1 0.4.8 py_0 defaults
pyasn1-modules 0.2.8 py_0 defaults
pycosat 0.6.3 py38h497a2fe_1006 conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
pycurl 7.43.0.6 py38h996a351_1 conda-forge
pygments 2.7.4 pyhd8ed1ab_0 conda-forge
pyjwt 2.0.1 pyhd8ed1ab_0 conda-forge
pyodc 1.0.3 pypi_0 pypi
pyopenssl 20.0.1 pyhd8ed1ab_0 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyrsistent 0.17.3 py38h497a2fe_2 conda-forge
pysocks 1.7.1 py38h578d9bd_3 conda-forge
python 3.8.6 hffdb5ce_4_cpython conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python-eccodes 2021.03.0 py38hb5d20a5_0 conda-forge
python-editor 1.0.4 py_0 conda-forge
python-flatbuffers 1.12 pyhd3eb1b0_0 defaults
python-json-logger 2.0.1 pyh9f0ad1d_0 conda-forge
python_abi 3.8 1_cp38 conda-forge
pytz 2021.1 pyhd3eb1b0_0 defaults
pyyaml 5.4.1 pypi_0 pypi
pyzmq 21.0.1 py38h3d7ac18_0 conda-forge
readline 8.0 he28a2e2_2 conda-forge
regex 2021.4.4 pypi_0 pypi
requests 2.25.1 pyhd3deb0d_0 conda-forge
requests-oauthlib 1.3.0 py_0 defaults
rsa 4.7.2 pyhd3eb1b0_1 defaults
ruamel.yaml 0.16.12 py38h497a2fe_2 conda-forge
ruamel.yaml.clib 0.2.2 py38h497a2fe_2 conda-forge
ruamel_yaml 0.15.80 py38h497a2fe_1003 conda-forge
s3fs 0.6.0 pyhd3eb1b0_0 defaults
scikit-learn 0.24.2 py38ha9443f7_0 defaults
scipy 1.6.2 py38had2a1c9_1 defaults
send2trash 1.5.0 py_0 conda-forge
setuptools 49.6.0 py38h578d9bd_3 conda-forge
simpervisor 0.4 pypi_0 pypi
six 1.15.0 pyh9f0ad1d_0 conda-forge
sklearn-xarray 0.4.0 pypi_0 pypi
smmap 4.0.0 pypi_0 pypi
sortedcontainers 2.3.0 pyhd3eb1b0_0 defaults
sqlalchemy 1.3.22 py38h497a2fe_1 conda-forge
sqlite 3.34.0 h74cdb3f_0 conda-forge
sysroot_linux-64 2.12 h77966d4_13 conda-forge
tbb 2020.3 hfd86e86_0 defaults
tblib 1.7.0 py_0 defaults
tenacity 7.0.0 pypi_0 pypi
tensorboard 2.4.0 pyhc547734_0 defaults
tensorboard-plugin-wit 1.6.0 py_0 defaults
tensorflow 2.4.1 mkl_py38hb2083e0_0 defaults
tensorflow-base 2.4.1 mkl_py38h43e0292_0 defaults
tensorflow-estimator 2.4.1 pyheb71bc4_0 defaults
termcolor 1.1.0 py38h06a4308_1 defaults
terminado 0.9.2 py38h578d9bd_0 conda-forge
testpath 0.4.4 py_0 conda-forge
textwrap3 0.9.2 pypi_0 pypi
threadpoolctl 2.1.0 pyh5ca1d4c_0 defaults
tini 0.18.0 h14c3975_1001 conda-forge
tk 8.6.10 h21135ba_1 conda-forge
toml 0.10.2 pypi_0 pypi
toolz 0.11.1 pyhd3eb1b0_0 defaults
tornado 6.1 py38h497a2fe_1 conda-forge
tqdm 4.56.0 pyhd8ed1ab_0 conda-forge
traitlets 5.0.5 py_0 conda-forge
typed-ast 1.4.2 pypi_0 pypi
typing-extensions 3.7.4.3 hd3eb1b0_0 defaults
typing_extensions 3.7.4.3 pyh06a4308_0 defaults
urllib3 1.26.2 pyhd8ed1ab_0 conda-forge
userpath 1.4.2 pypi_0 pypi
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webencodings 0.5.1 py_1 conda-forge
werkzeug 1.0.1 pyhd3eb1b0_0 defaults
wheel 0.36.2 pyhd3deb0d_0 conda-forge
wrapt 1.12.1 py38h7b6447c_1 defaults
xarray 0.18.0 pyhd3eb1b0_1 defaults
xhistogram 0.1.2 pyhd8ed1ab_0 conda-forge
xskillscore 0.0.20 pyhd8ed1ab_1 conda-forge
xz 5.2.5 h516909a_1 conda-forge
yaml 0.2.5 h516909a_0 conda-forge
yarl 1.6.3 py38h27cfd23_0 defaults
zarr 2.8.1 pyhd3eb1b0_0 defaults
zeromq 4.3.3 h58526e2_3 conda-forge
zict 2.0.0 pyhd3eb1b0_0 defaults
zipp 3.4.0 py_0 conda-forge
zlib 1.2.11 h516909a_1010 conda-forge
zstd 1.4.9 haebb681_0 defaults
%% Cell type:code id: tags:
``` python
```