
Compare revisions

Showing with 1674 additions and 14 deletions
Files added (stored in LFS, source diff not displayed):
  • docs/screenshots/fork_renku.png (81.3 KiB)
  • docs/screenshots/gitlab_add_variable.png (76.2 KiB)
  • docs/screenshots/gitlab_variables.png (157 KiB)
  • docs/screenshots/renku_start_env.png (76.6 KiB)
  • docs/screenshots/s2s-ai-challenge-tag.png (45 KiB)
@@ -3,16 +3,14 @@ channels:
- defaults
dependencies:
- xarray
- dask
# ML
- tensorflow
- pytorch
- sklearn
# viz
- matplotlib
- cartopy
- matplotlib-base
# - cartopy
# scoring
- xskillscore
- xskillscore>=0.0.20 # includes sklearn
# data access
- intake
- fsspec
@@ -20,18 +18,15 @@ dependencies:
- s3fs
- intake-xarray
- cfgrib
- eccodes
- nc-time-axis
- pydap
- cftime
- h5netcdf
- netcdf4==1.5.1 # see https://github.com/pydata/xarray/issues/4925
- netcdf4
- pip
- pip:
- climetlab
- climetlab_s2s_ai_challenge
- climetlab >= 0.8.0
- climetlab_s2s_ai_challenge >= 0.7.1
- configargparse # for weatherbench
- rechunker
- git+https://github.com/xarray-contrib/xskillscore.git
- git+https://github.com/phausamann/sklearn-xarray.git@develop
#- dask-labextension
#- nb_black
- netcdf4==1.5.4
prefix: "/opt/conda"
%% Cell type:markdown id: tags:
# Train ML model for predictions of week 3-4 & 5-6
This notebook creates a Machine Learning model (`ML_model`) to predict weeks 3-4 & 5-6 based on `S2S` weeks 3-4 & 5-6 forecasts; its predictions are compared to `CPC` observations for the [`s2s-ai-challenge`](https://s2s-ai-challenge.github.io/).
%% Cell type:markdown id: tags:
# Synopsis
%% Cell type:markdown id: tags:
## Method: `name`
- description
- a few details
%% Cell type:markdown id: tags:
## Data used
Training-input for Machine Learning model:
- renku datasets, climetlab, IRIDL
Forecast-input for Machine Learning model:
- renku datasets, climetlab, IRIDL
Compare Machine Learning model forecast against ground truth:
- renku datasets, climetlab, IRIDL
%% Cell type:markdown id: tags:
## Resources used
for training, details in reproducibility
- platform: renku
- memory: 8 GB
- processors: 2 CPU
- storage required: 10 GB
%% Cell type:markdown id: tags:
## Safeguards
All points have to be [x] checked. If not, your submission is invalid.
Changes to the code after submission are not possible, as the `commit` preceding the `tag` will be reviewed.
(Only in exceptional cases, and where prior effort toward reproducibility is evident, improvements to readability and reproducibility may be allowed after November 1st 2021.)
%% Cell type:markdown id: tags:
### Safeguards to prevent [overfitting](https://en.wikipedia.org/wiki/Overfitting?wprov=sfti1)
If the organizers suspect overfitting, your contribution can be disqualified.
- [ ] We did not use 2020 observations in training (explicit overfitting and cheating)
- [ ] We did not repeatedly verify our model on 2020 observations to incrementally improve our RPSS (implicit overfitting)
- [ ] We provide RPSS scores for the training period with script `skill_by_year`, see in section 6.3 `predict`.
- [ ] We tried our best to prevent [data leakage](https://en.wikipedia.org/wiki/Leakage_(machine_learning)?wprov=sfti1).
- [ ] We honor the `train-validate-test` [split principle](https://en.wikipedia.org/wiki/Training,_validation,_and_test_sets). This means that the hindcast data is split into `train` and `validate`, whereas `test` is withheld.
- [ ] We did not use `test` explicitly in training or implicitly in incrementally adjusting parameters.
- [ ] We considered [cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)).
%% Cell type:markdown id: tags:
### Safeguards for Reproducibility
Notebook/code must be independently reproducible from scratch by the organizers (after the competition); if this is not possible, no prize can be awarded.
- [ ] All training data is publicly available (no pre-trained private neural networks, as they are not reproducible for us)
- [ ] Code is well documented, readable and reproducible.
- [ ] Code to reproduce training and predictions is preferred to run within a day on the described architecture. If the training takes longer than a day, please justify why this is needed. Please do not submit training pipelines that take weeks to run.
%% Cell type:markdown id: tags:
# Todos to improve template
This is just a demo.
- [ ] for both variables
- [ ] for both `lead_time`s
- [ ] ensure probabilistic prediction outcome with `category` dim
%% Cell type:markdown id: tags:
# Imports
%% Cell type:code id: tags:
``` python
from tensorflow.keras.layers import Input, Dense, Flatten
from tensorflow.keras.models import Sequential
import tensorflow.keras as keras  # used by DataGenerator below
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr
xr.set_options(display_style='text')
from dask.utils import format_bytes
import xskillscore as xs
```
%% Cell type:markdown id: tags:
# Get training data
preprocessing of the input data may be done in a separate notebook/script
%% Cell type:markdown id: tags:
## Hindcast
get weekly initialized hindcasts
%% Cell type:code id: tags:
``` python
# consider renku datasets
#! renku storage pull path
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
## Observations
corresponding to hindcasts
%% Cell type:code id: tags:
``` python
# consider renku datasets
#! renku storage pull path
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
# ML model
%% Cell type:code id: tags:
``` python
bs = 32
import numpy as np
import tensorflow.keras as keras  # in case the imports cell above was not run

class DataGenerator(keras.utils.Sequence):
    def __init__(self, data, verif_data, batch_size=bs, shuffle=True, load=False):
        """
        Data generator
        Template from https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly

        Args:
            data: forecast input, indexed by `time`
            verif_data: verifying observations, indexed by the same `time`
            batch_size: batch size
            shuffle: bool. If True, sample order is shuffled after each epoch.
            load: bool. If True, data is loaded into RAM.
        """
        self.data = data
        self.verif_data = verif_data
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.n_samples = data.time.size
        self.on_epoch_end()
        # For some weird reason calling .load() earlier messes up the mean and std computations
        if load:
            print('Loading data into RAM')
            self.data.load()

    def __len__(self):
        'Denotes the number of batches per epoch'
        return int(np.ceil(self.n_samples / self.batch_size))

    def __getitem__(self, i):
        'Generate one batch of data'
        idxs = self.idxs[i * self.batch_size:(i + 1) * self.batch_size]
        # got all NaN if NaNs not masked
        X = self.data.isel(time=idxs).fillna(0.).values
        y = self.verif_data.isel(time=idxs).fillna(0.).values
        return X, y

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        self.idxs = np.arange(self.n_samples)
        if self.shuffle:
            np.random.shuffle(self.idxs)
```
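The batching contract above (`__len__` gives the number of batches per epoch, `__getitem__` returns one `(X, y)` batch, `on_epoch_end` reshuffles) can be sketched without keras or xarray. `ToyGenerator` and its random arrays below are illustrative stand-ins, not challenge data:

``` python
import numpy as np

# toy stand-ins for forecast/observation arrays: 10 samples of a 4-gridpoint field
rng = np.random.default_rng(0)
data = rng.normal(size=(10, 4))
verif = rng.normal(size=(10, 4))

class ToyGenerator:
    """Mimics the keras.utils.Sequence contract used by DataGenerator."""
    def __init__(self, data, verif, batch_size=3, shuffle=False):
        self.data, self.verif = data, verif
        self.batch_size, self.shuffle = batch_size, shuffle
        self.n_samples = data.shape[0]
        self.on_epoch_end()

    def __len__(self):
        # number of batches per epoch; the last batch may be smaller
        return int(np.ceil(self.n_samples / self.batch_size))

    def __getitem__(self, i):
        idxs = self.idxs[i * self.batch_size:(i + 1) * self.batch_size]
        return self.data[idxs], self.verif[idxs]

    def on_epoch_end(self):
        self.idxs = np.arange(self.n_samples)
        if self.shuffle:
            np.random.shuffle(self.idxs)

dg = ToyGenerator(data, verif)
print(len(dg))         # 4 batches: ceil(10 / 3)
print(dg[3][0].shape)  # last batch holds the remaining sample: (1, 4)
```

Keras iterates exactly this interface during `fit`, which is why the real generator can hand out normalized xarray slices lazily.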
%% Cell type:markdown id: tags:
## data prep: train, valid, test
%% Cell type:code id: tags:
``` python
# time is the forecast_reference_time
time_train_start,time_train_end='2000','2017'
time_valid_start,time_valid_end='2018','2019'
time_test = '2020'
```
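The year strings above are later used as label-based slices; the intent is a disjoint partition of the hindcast years, with 2020 withheld entirely. A plain-Python sketch of that split logic (names here are illustrative, not from the `scripts` module):

``` python
years = [str(y) for y in range(2000, 2020)]  # hindcast forecast_reference_time years

time_train_start, time_train_end = '2000', '2017'
time_valid_start, time_valid_end = '2018', '2019'
time_test = '2020'

# 4-digit year strings compare correctly as strings
train = [y for y in years if time_train_start <= y <= time_train_end]
valid = [y for y in years if time_valid_start <= y <= time_valid_end]

assert not set(train) & set(valid)            # disjoint
assert set(train) | set(valid) == set(years)  # covers the hindcast
assert time_test not in years                 # test year withheld
print(len(train), len(valid))  # 18 2
```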
%% Cell type:code id: tags:
``` python
dg_train = DataGenerator()
```
%% Cell type:code id: tags:
``` python
dg_valid = DataGenerator()
```
%% Cell type:code id: tags:
``` python
dg_test = DataGenerator()
```
%% Cell type:markdown id: tags:
## `fit`
%% Cell type:code id: tags:
``` python
cnn = keras.models.Sequential([])
```
%% Cell type:code id: tags:
``` python
cnn.summary()
```
%% Cell type:code id: tags:
``` python
cnn.compile(keras.optimizers.Adam(1e-4), 'mse')
```
%% Cell type:code id: tags:
``` python
import warnings
warnings.simplefilter("ignore")
```
%% Cell type:code id: tags:
``` python
cnn.fit(dg_train, epochs=1, validation_data=dg_valid)
```
%% Cell type:markdown id: tags:
## `predict`
Create predictions and print `mean(variable, lead_time, longitude, weighted latitude)` RPSS for all years as calculated by `skill_by_year`.
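RPSS compares the ranked probability score (RPS) of the forecast with that of climatology, which assigns 1/3 to each tercile category. A minimal numpy sketch of the scoring idea — not the organizers' `skill_by_year`, which additionally averages over variables, lead times, and latitude-weighted grid points:

``` python
import numpy as np

def rps(probs, obs_category):
    """Ranked probability score for a single probabilistic forecast.

    probs: forecast probabilities over ordered categories (sums to 1).
    obs_category: index of the observed category.
    """
    obs = np.zeros_like(probs)
    obs[obs_category] = 1.0
    # squared distance between cumulative forecast and cumulative observation
    return np.sum((np.cumsum(probs) - np.cumsum(obs)) ** 2)

# a perfect tercile forecast scores 0
assert rps(np.array([1., 0., 0.]), 0) == 0.0

clim = np.array([1/3, 1/3, 1/3])   # climatological tercile probabilities
fcst = np.array([0.6, 0.3, 0.1])   # sharper forecast
obs_cat = 0                        # below-normal category observed

rpss = 1 - rps(fcst, obs_cat) / rps(clim, obs_cat)
print(rpss)  # positive: forecast beats climatology
```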
%% Cell type:code id: tags:
``` python
from scripts import skill_by_year
```
%% Cell type:code id: tags:
``` python
def create_predictions(model, dg):
    """Create non-iterative predictions"""
    preds = model.predict(dg).squeeze()
    # transform
    return preds
```
%% Cell type:markdown id: tags:
### `predict` training period in-sample
%% Cell type:code id: tags:
``` python
preds_is = create_predictions(cnn, dg_train)
```
%% Cell type:code id: tags:
``` python
skill_by_year(preds_is)
```
%% Cell type:markdown id: tags:
### `predict` valid out-of-sample
%% Cell type:code id: tags:
``` python
preds_os = create_predictions(cnn, dg_valid)
```
%% Cell type:code id: tags:
``` python
skill_by_year(preds_os)
```
%% Cell type:markdown id: tags:
### `predict` test
%% Cell type:code id: tags:
``` python
preds_test = create_predictions(cnn, dg_test)
```
%% Cell type:code id: tags:
``` python
skill_by_year(preds_test)
```
%% Cell type:markdown id: tags:
# Submission
%% Cell type:code id: tags:
``` python
preds_test.sizes # expect: category(3), longitude, latitude, lead_time(2), forecast_time (53)
```
%% Cell type:code id: tags:
``` python
from scripts import assert_predictions_2020
assert_predictions_2020(preds_test)
```
%% Cell type:code id: tags:
``` python
preds_test.to_netcdf('../submissions/ML_prediction_2020.nc')
```
%% Cell type:code id: tags:
``` python
#!git add ../submissions/ML_prediction_2020.nc
#!git add ML_forecast_template.ipynb
```
%% Cell type:code id: tags:
``` python
#!git commit -m "commit submission for my_method_name" # whatever message you want
```
%% Cell type:code id: tags:
``` python
#!git tag "submission-my_method_name-0.0.1" # if this is to be checked by scorer, only the last submitted==tagged version will be considered
```
%% Cell type:code id: tags:
``` python
#!git push --tags
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
# Reproducibility
%% Cell type:markdown id: tags:
## memory
%% Cell type:code id: tags:
``` python
# https://phoenixnap.com/kb/linux-commands-check-memory-usage
!free -g
```
%% Cell type:markdown id: tags:
## CPU
%% Cell type:code id: tags:
``` python
!lscpu
```
%% Cell type:markdown id: tags:
## software
%% Cell type:code id: tags:
``` python
!conda list
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
# Train ML model to correct predictions of week 3-4 & 5-6
This notebook creates a Machine Learning model (`ML_model`) to predict weeks 3-4 & 5-6 based on `S2S` weeks 3-4 & 5-6 forecasts; its predictions are compared to `CPC` observations for the [`s2s-ai-challenge`](https://s2s-ai-challenge.github.io/).
%% Cell type:markdown id: tags:
# Synopsis
%% Cell type:markdown id: tags:
## Method: `ML-based mean bias reduction`
- calculate the ML-based bias from the 2000-2019 deterministic ensemble-mean hindcasts
- remove that ML-based bias from the 2020 deterministic ensemble-mean forecasts
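Before involving a CNN, the two steps can be written down with plain numpy; here the "ML" is replaced by a simple climatological mean bias over the hindcast years, and the toy arrays are stand-ins for the zarr data:

``` python
import numpy as np

rng = np.random.default_rng(1)
truth = rng.normal(size=(20, 4))          # 20 hindcast years, 4 grid points
bias = np.array([1.0, -0.5, 0.25, 2.0])   # constant model bias per grid point
hind = truth + bias                       # biased ensemble-mean hindcast

# step 1: estimate the bias from 2000-2019 hindcasts vs observations
est_bias = (hind - truth).mean(axis=0)

# step 2: remove it from the 2020 forecast
fct_2020 = rng.normal(size=(4,)) + bias
corrected = fct_2020 - est_bias

print(np.allclose(est_bias, bias))  # True: a constant bias is recovered exactly
```

The notebook's CNN plays the role of `est_bias` here: it learns a (state-dependent) correction from hindcasts and observations, then applies it to the 2020 forecasts.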
%% Cell type:markdown id: tags:
## Data used
type: renku datasets
Training-input for Machine Learning model:
- hindcasts of models:
- ECMWF: `ecmwf_hindcast-input_2000-2019_biweekly_deterministic.zarr`
Forecast-input for Machine Learning model:
- real-time 2020 forecasts of models:
- ECMWF: `ecmwf_forecast-input_2020_biweekly_deterministic.zarr`
Compare Machine Learning model forecast against ground truth:
- `CPC` observations:
- `hindcast-like-observations_biweekly_deterministic.zarr`
- `forecast-like-observations_2020_biweekly_deterministic.zarr`
%% Cell type:markdown id: tags:
## Resources used
for training, details in reproducibility
- platform: renku
- memory: 8 GB
- processors: 2 CPU
- storage required: 10 GB
%% Cell type:markdown id: tags:
## Safeguards
All points have to be [x] checked. If not, your submission is invalid.
Changes to the code after submission are not possible, as the `commit` preceding the `tag` will be reviewed.
(Only in exceptional cases, and where prior effort toward reproducibility is evident, improvements to readability and reproducibility may be allowed after November 1st 2021.)
%% Cell type:markdown id: tags:
### Safeguards to prevent [overfitting](https://en.wikipedia.org/wiki/Overfitting?wprov=sfti1)
If the organizers suspect overfitting, your contribution can be disqualified.
- [x] We did not use 2020 observations in training (explicit overfitting and cheating)
- [x] We did not repeatedly verify our model on 2020 observations to incrementally improve our RPSS (implicit overfitting)
- [x] We provide RPSS scores for the training period with script `print_RPS_per_year`, see in section 6.3 `predict`.
- [x] We tried our best to prevent [data leakage](https://en.wikipedia.org/wiki/Leakage_(machine_learning)?wprov=sfti1).
- [x] We honor the `train-validate-test` [split principle](https://en.wikipedia.org/wiki/Training,_validation,_and_test_sets). This means that the hindcast data is split into `train` and `validate`, whereas `test` is withheld.
- [x] We did not use `test` explicitly in training or implicitly in incrementally adjusting parameters.
- [x] We considered [cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics)).
%% Cell type:markdown id: tags:
### Safeguards for Reproducibility
Notebook/code must be independently reproducible from scratch by the organizers (after the competition); if this is not possible, no prize can be awarded.
- [x] All training data is publicly available (no pre-trained private neural networks, as they are not reproducible for us)
- [x] Code is well documented, readable and reproducible.
- [x] Code to reproduce training and predictions is preferred to run within a day on the described architecture. If the training takes longer than a day, please justify why this is needed. Please do not submit training pipelines that take weeks to run.
%% Cell type:markdown id: tags:
# Todos to improve template
This is just a demo.
- [ ] use multiple predictor variables and two predicted variables
- [ ] for both `lead_time`s in one go
- [ ] consider seasonality, for now all `forecast_time` months are mixed
- [ ] make probabilistic predictions with a `category` dim; for now predictions are deterministic
%% Cell type:markdown id: tags:
# Imports
%% Cell type:code id: tags:
``` python
from tensorflow.keras.layers import Input, Dense, Flatten
from tensorflow.keras.models import Sequential
import matplotlib.pyplot as plt
import xarray as xr
xr.set_options(display_style='text')
import numpy as np
from dask.utils import format_bytes
import xskillscore as xs
```
%% Cell type:markdown id: tags:
# Get training data
preprocessing of the input data may be done in a separate notebook/script
%% Cell type:markdown id: tags:
## Hindcast
get weekly initialized hindcasts
%% Cell type:code id: tags:
``` python
v='t2m'
```
%% Cell type:code id: tags:
``` python
# preprocessed as renku dataset
!renku storage pull ../data/ecmwf_hindcast-input_2000-2019_biweekly_deterministic.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
hind_2000_2019 = xr.open_zarr("../data/ecmwf_hindcast-input_2000-2019_biweekly_deterministic.zarr", consolidated=True)
```
%% Cell type:code id: tags:
``` python
# preprocessed as renku dataset
!renku storage pull ../data/ecmwf_forecast-input_2020_biweekly_deterministic.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
fct_2020 = xr.open_zarr("../data/ecmwf_forecast-input_2020_biweekly_deterministic.zarr", consolidated=True)
```
%% Cell type:markdown id: tags:
## Observations
corresponding to hindcasts
%% Cell type:code id: tags:
``` python
# preprocessed as renku dataset
!renku storage pull ../data/hindcast-like-observations_2000-2019_biweekly_deterministic.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
obs_2000_2019 = xr.open_zarr("../data/hindcast-like-observations_2000-2019_biweekly_deterministic.zarr", consolidated=True)#[v]
```
%% Cell type:code id: tags:
``` python
# preprocessed as renku dataset
!renku storage pull ../data/forecast-like-observations_2020_biweekly_deterministic.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
obs_2020 = xr.open_zarr("../data/forecast-like-observations_2020_biweekly_deterministic.zarr", consolidated=True)#[v]
```
%% Cell type:markdown id: tags:
# ML model
%% Cell type:markdown id: tags:
based on [WeatherBench](https://github.com/pangeo-data/WeatherBench/blob/master/quickstart.ipynb)
%% Cell type:code id: tags:
``` python
# run once only and don't commit
!git clone https://github.com/pangeo-data/WeatherBench/
```
%% Output
fatal: destination path 'WeatherBench' already exists and is not an empty directory.
%% Cell type:code id: tags:
``` python
import sys
sys.path.insert(1, 'WeatherBench')
from WeatherBench.src.train_nn import DataGenerator, PeriodicConv2D, create_predictions
import tensorflow.keras as keras
```
%% Cell type:code id: tags:
``` python
bs = 32
import numpy as np

class DataGenerator(keras.utils.Sequence):
    def __init__(self, fct, verif, lead_time, batch_size=bs, shuffle=True, load=True,
                 mean=None, std=None):
        """
        Data generator for WeatherBench data.
        Template from https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly

        Args:
            fct: forecasts from S2S models: xr.DataArray (xr.Dataset doesn't work properly)
            verif: observations with same dimensionality (xr.Dataset doesn't work properly)
            lead_time: lead_time as in the model data
            batch_size: batch size
            shuffle: bool. If True, data is shuffled.
            load: bool. If True, dataset is loaded into RAM.
            mean: if None, compute mean from data.
            std: if None, compute standard deviation from data.

        Todo:
            - use `number` in a better way; for now only the ensemble-mean forecast is used
            - don't use .sel(lead_time=lead_time), to train over all lead_time at once
            - be sensitive with forecast_time, pool a few around the given weekofyear
            - use more variables as predictors
            - predict more variables
        """
        if isinstance(fct, xr.Dataset):
            print('convert fct to array')
            fct = fct.to_array().transpose(..., 'variable')
            self.fct_dataset = True
        else:
            self.fct_dataset = False

        if isinstance(verif, xr.Dataset):
            print('convert verif to array')
            verif = verif.to_array().transpose(..., 'variable')
            self.verif_dataset = True
        else:
            self.verif_dataset = False

        # self.fct = fct
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.lead_time = lead_time

        self.fct_data = fct.transpose('forecast_time', ...).sel(lead_time=lead_time)
        self.fct_mean = self.fct_data.mean('forecast_time').compute() if mean is None else mean
        self.fct_std = self.fct_data.std('forecast_time').compute() if std is None else std

        self.verif_data = verif.transpose('forecast_time', ...).sel(lead_time=lead_time)
        self.verif_mean = self.verif_data.mean('forecast_time').compute() if mean is None else mean
        self.verif_std = self.verif_data.std('forecast_time').compute() if std is None else std

        # Normalize
        self.fct_data = (self.fct_data - self.fct_mean) / self.fct_std
        self.verif_data = (self.verif_data - self.verif_mean) / self.verif_std

        self.n_samples = self.fct_data.forecast_time.size
        self.forecast_time = self.fct_data.forecast_time

        self.on_epoch_end()
        # For some weird reason calling .load() earlier messes up the mean and std computations
        if load:
            # print('Loading data into RAM')
            self.fct_data.load()

    def __len__(self):
        'Denotes the number of batches per epoch'
        return int(np.ceil(self.n_samples / self.batch_size))

    def __getitem__(self, i):
        'Generate one batch of data'
        idxs = self.idxs[i * self.batch_size:(i + 1) * self.batch_size]
        # got all NaN if NaNs not masked
        X = self.fct_data.isel(forecast_time=idxs).fillna(0.).values
        y = self.verif_data.isel(forecast_time=idxs).fillna(0.).values
        return X, y

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        self.idxs = np.arange(self.n_samples)
        if self.shuffle:
            np.random.shuffle(self.idxs)
```
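The generator standardizes with the training-period mean and std, and `_create_predictions` further below has to undo exactly that transform. The round trip only works when the same statistics are reused, which is why `dg_test` is built with `mean=dg_train.fct_mean, std=dg_train.fct_std`. A numpy sketch:

``` python
import numpy as np

rng = np.random.default_rng(2)
train = rng.normal(loc=5.0, scale=2.0, size=(100,))

mean, std = train.mean(), train.std()

normed = (train - mean) / std    # what DataGenerator.__init__ does
restored = normed * std + mean   # what _create_predictions undoes

assert np.allclose(restored, train)

# test data must reuse the *training* statistics, not recompute its own,
# otherwise predictions are un-normalized with the wrong offset and scale
test = rng.normal(loc=5.0, scale=2.0, size=(10,))
normed_test = (test - mean) / std
```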
%% Cell type:code id: tags:
``` python
# 2 bi-weekly `lead_time`: week 3-4
lead = hind_2000_2019.isel(lead_time=0).lead_time
lead
```
%% Output
<xarray.DataArray 'lead_time' ()>
array(1209600000000000, dtype='timedelta64[ns]')
Coordinates:
lead_time timedelta64[ns] 14 days
Attributes:
aggregate: The pd.Timedelta corresponds to the first day of a biweek...
description: Forecast period is the time interval between the forecast...
long_name: lead time
standard_name: forecast_period
week34_t2m: mean[14 days, 27 days]
week34_tp: 28 days minus 14 days
week56_t2m: mean[28 days, 41 days]
week56_tp: 42 days minus 28 days
%% Cell type:code id: tags:
``` python
# mask, needed?
hind_2000_2019 = hind_2000_2019.where(obs_2000_2019.isel(forecast_time=0, lead_time=0,drop=True).notnull())
```
%% Cell type:markdown id: tags:
## data prep: train, valid, test
[Use the hindcast period to split train and valid.](https://en.wikipedia.org/wiki/Training,_validation,_and_test_sets) Do not use the 2020 data for testing!
%% Cell type:code id: tags:
``` python
# time is the forecast_time
time_train_start,time_train_end='2000','2017' # train
time_valid_start,time_valid_end='2018','2019' # valid
time_test = '2020' # test
```
%% Cell type:code id: tags:
``` python
dg_train = DataGenerator(
hind_2000_2019.mean('realization').sel(forecast_time=slice(time_train_start,time_train_end))[v],
obs_2000_2019.sel(forecast_time=slice(time_train_start,time_train_end))[v],
lead_time=lead, batch_size=bs, load=True)
```
%% Output
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
%% Cell type:code id: tags:
``` python
dg_valid = DataGenerator(
hind_2000_2019.mean('realization').sel(forecast_time=slice(time_valid_start,time_valid_end))[v],
obs_2000_2019.sel(forecast_time=slice(time_valid_start,time_valid_end))[v],
lead_time=lead, batch_size=bs, shuffle=False, load=True)
```
%% Output
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
/opt/conda/lib/python3.8/site-packages/dask/array/numpy_compat.py:39: RuntimeWarning: invalid value encountered in true_divide
x = np.divide(x1, x2, out)
%% Cell type:code id: tags:
``` python
# do not use, delete?
dg_test = DataGenerator(
fct_2020.mean('realization').sel(forecast_time=time_test)[v],
obs_2020.sel(forecast_time=time_test)[v],
lead_time=lead, batch_size=bs, load=True, mean=dg_train.fct_mean, std=dg_train.fct_std, shuffle=False)
```
%% Cell type:code id: tags:
``` python
X, y = dg_valid[0]
X.shape, y.shape
```
%% Output
((32, 121, 240), (32, 121, 240))
%% Cell type:code id: tags:
``` python
# short look into training data: large biases
# any problem from normalizing?
# i=4
# xr.DataArray(np.vstack([X[i],y[i]])).plot(yincrease=False, robust=True)
```
%% Cell type:markdown id: tags:
## `fit`
%% Cell type:code id: tags:
``` python
cnn = keras.models.Sequential([
PeriodicConv2D(filters=32, kernel_size=5, conv_kwargs={'activation':'relu'}, input_shape=(32, 64, 1)),
PeriodicConv2D(filters=1, kernel_size=5)
])
```
%% Output
WARNING:tensorflow:AutoGraph could not transform <bound method PeriodicPadding2D.call of <WeatherBench.src.train_nn.PeriodicPadding2D object at 0x7f86042986a0>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module 'gast' has no attribute 'Index'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING: AutoGraph could not transform <bound method PeriodicPadding2D.call of <WeatherBench.src.train_nn.PeriodicPadding2D object at 0x7f86042986a0>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module 'gast' has no attribute 'Index'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
%% Cell type:code id: tags:
``` python
cnn.summary()
```
%% Output
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
periodic_conv2d (PeriodicCon (None, 32, 64, 32) 832
_________________________________________________________________
periodic_conv2d_1 (PeriodicC (None, 32, 64, 1) 801
=================================================================
Total params: 1,633
Trainable params: 1,633
Non-trainable params: 0
_________________________________________________________________
%% Cell type:code id: tags:
``` python
cnn.compile(keras.optimizers.Adam(1e-4), 'mse')
```
%% Cell type:code id: tags:
``` python
import warnings
warnings.simplefilter("ignore")
```
%% Cell type:code id: tags:
``` python
cnn.fit(dg_train, epochs=2, validation_data=dg_valid)
```
%% Output
Epoch 1/2
30/30 [==============================] - 58s 2s/step - loss: 0.1472 - val_loss: 0.0742
Epoch 2/2
30/30 [==============================] - 45s 1s/step - loss: 0.0712 - val_loss: 0.0545
<tensorflow.python.keras.callbacks.History at 0x7f865c2103d0>
%% Cell type:markdown id: tags:
## `predict`
Create predictions and print the RPSS — averaged over `variable`, `lead_time`, `longitude`, and latitude-weighted — for all years, as computed by `skill_by_year`.
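%% Cell type:markdown id: tags:
`skill_by_year` scores the tercile forecasts with the ranked probability skill score (RPSS) against a climatological reference that assigns 1/3 probability to each tercile. A minimal numpy sketch of the underlying score (the function and variable names here are illustrative, not from `scripts`):

``` python
import numpy as np

def rps(prob_fct, obs_category, n_cat=3):
    """Ranked probability score: squared distance of cumulative probabilities."""
    obs_onehot = np.eye(n_cat)[obs_category]   # one-hot observed category
    cum_fct = np.cumsum(prob_fct, axis=-1)     # cumulative forecast probability
    cum_obs = np.cumsum(obs_onehot, axis=-1)   # cumulative observed indicator
    return ((cum_fct - cum_obs) ** 2).sum(axis=-1)

obs = 1                            # observed "near normal" tercile (made up)
clim = np.full(3, 1 / 3)           # climatological reference: 1/3 per tercile
sharp = np.array([0.0, 1.0, 0.0])  # a perfect categorical forecast

# RPSS = 1 - RPS_forecast / RPS_climatology; a perfect forecast gives RPSS = 1
rpss = 1 - rps(sharp, obs) / rps(clim, obs)
```

A negative RPSS, as in the results below, means the forecast scores worse than climatology.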
%% Cell type:code id: tags:
``` python
from scripts import add_valid_time_from_forecast_reference_time_and_lead_time

def _create_predictions(model, dg, lead):
    """Create non-iterative predictions for a single lead time."""
    preds = model.predict(dg).squeeze()
    # un-normalize back to physical units
    preds = preds * dg.fct_std.values + dg.fct_mean.values
    if dg.verif_dataset:
        da = xr.DataArray(
            preds,
            dims=['forecast_time', 'latitude', 'longitude', 'variable'],
            coords={'forecast_time': dg.fct_data.forecast_time,
                    'latitude': dg.fct_data.latitude,
                    'longitude': dg.fct_data.longitude},
        ).to_dataset()  # does not work yet
    else:
        da = xr.DataArray(
            preds,
            dims=['forecast_time', 'latitude', 'longitude'],
            coords={'forecast_time': dg.fct_data.forecast_time,
                    'latitude': dg.fct_data.latitude,
                    'longitude': dg.fct_data.longitude},
        )
    da = da.assign_coords(lead_time=lead)
    # da = add_valid_time_from_forecast_reference_time_and_lead_time(da)
    return da
```
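%% Cell type:markdown id: tags:
`_create_predictions` maps the network output back to physical units by inverting the normalization applied by the `DataGenerator`. The round trip can be sketched as follows (the mean/std handling mirrors `dg.fct_mean`/`dg.fct_std`, but the arrays here are made up):

``` python
import numpy as np

rng = np.random.default_rng(0)
raw = rng.normal(loc=5.0, scale=2.0, size=(4, 8))  # hypothetical t2m field

# normalize, as the DataGenerator is assumed to do
mean, std = raw.mean(), raw.std()
normed = (raw - mean) / std

# ... the network predicts in normalized space ...

# un-normalize, mirroring `preds * dg.fct_std + dg.fct_mean`
restored = normed * std + mean
```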
%% Cell type:code id: tags:
``` python
# optional mask (ocean removed) to apply when making the forecasts probabilistic
mask = obs_2020.std(['lead_time','forecast_time']).notnull()
```
%% Cell type:code id: tags:
``` python
from scripts import make_probabilistic
```
%% Cell type:code id: tags:
``` python
!renku storage pull ../data/hindcast-like-observations_2000-2019_biweekly_tercile-edges.nc
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
cache_path = '../data'
tercile_file = f'{cache_path}/hindcast-like-observations_2000-2019_biweekly_tercile-edges.nc'
tercile_edges = xr.open_dataset(tercile_file)
```
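%% Cell type:markdown id: tags:
The tercile-edges file holds the 1/3 and 2/3 quantiles of the 2000-2019 observations per week and grid point. Conceptually, the edges can be sketched with `np.quantile` (using a made-up 1-D climatological sample in place of the real dataset):

``` python
import numpy as np

rng = np.random.default_rng(42)
clim = rng.normal(size=20 * 53)  # hypothetical climatological sample

# tercile edges are the 1/3 and 2/3 quantiles of the climatology
edges = np.quantile(clim, [1 / 3, 2 / 3])

# by construction, roughly a third of the sample falls below the lower edge
frac_below = (clim < edges[0]).mean()
```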
%% Cell type:code id: tags:
``` python
# Note: reusing one model for all lead times is not skilful, but the results
# have the expected dimensions; ideally, train a separate model per lead_time.
def create_predictions(cnn, fct, obs, time):
    preds_test = []
    for lead in fct.lead_time:
        dg = DataGenerator(fct.mean('realization').sel(forecast_time=time)[v],
                           obs.sel(forecast_time=time)[v],
                           lead_time=lead, batch_size=bs,
                           mean=dg_train.fct_mean, std=dg_train.fct_std,
                           shuffle=False)
        preds_test.append(_create_predictions(cnn, dg, lead))
    preds_test = xr.concat(preds_test, 'lead_time')
    preds_test['lead_time'] = fct.lead_time
    # add valid_time coord
    preds_test = add_valid_time_from_forecast_reference_time_and_lead_time(preds_test)
    preds_test = preds_test.to_dataset(name=v)
    # add fake 'tp' variable so the submission contains both variables
    preds_test['tp'] = preds_test['t2m']
    # make probabilistic
    preds_test = make_probabilistic(preds_test.expand_dims('realization'), tercile_edges, mask=mask)
    return preds_test
```
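%% Cell type:markdown id: tags:
`make_probabilistic` (from `scripts`) turns the forecasts into tercile-category probabilities using the tercile edges; with a single pseudo-`realization`, as above, the probabilities are effectively one-hot. The idea, sketched with numpy on made-up numbers (function name and edge convention are illustrative):

``` python
import numpy as np

def tercile_probs(ens, lower, upper):
    """Fraction of ensemble members below, between, and above the tercile edges."""
    below = (ens < lower).mean(axis=0)   # P(below normal)
    above = (ens >= upper).mean(axis=0)  # P(above normal)
    return np.stack([below, 1.0 - below - above, above])

ens = np.array([0.1, 0.2, 0.6, 0.9])  # 4 hypothetical ensemble members
p = tercile_probs(ens, lower=0.3, upper=0.7)
```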
%% Cell type:markdown id: tags:
### `predict` training period in-sample
%% Cell type:code id: tags:
``` python
!renku storage pull ../data/forecast-like-observations_2020_biweekly_terciled.nc
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
!renku storage pull ../data/hindcast-like-observations_2000-2019_biweekly_terciled.zarr
```
%% Output
Warning: Run CLI commands only from project's root directory.

%% Cell type:code id: tags:
``` python
from scripts import skill_by_year
import os

if os.environ.get('HOME') == '/home/jovyan':
    # assume we are on renku with limited memory
    import pandas as pd
    step = 2
    skill_list = []
    # loop over blocks of years to consume less memory on renku
    for year in np.arange(int(time_train_start), int(time_train_end) - 1, step):
        preds_is = create_predictions(cnn, hind_2000_2019, obs_2000_2019,
                                      time=slice(str(year), str(year + step - 1))).compute()
        skill_list.append(skill_by_year(preds_is))
    skill = pd.concat(skill_list)
else:  # with more memory, simply do
    preds_is = create_predictions(cnn, hind_2000_2019, obs_2000_2019,
                                  time=slice(time_train_start, time_train_end))
    skill = skill_by_year(preds_is)
skill
```
%% Output
RPSS
year
2000 -0.862483
2001 -1.015485
2002 -1.101022
2003 -1.032647
2004 -1.056348
2005 -1.165675
2006 -1.057217
2007 -1.170849
2008 -1.049785
2009 -1.169108
2010 -1.130845
2011 -1.052670
2012 -1.126449
2013 -1.126930
2014 -1.095896
2015 -1.117486
%% Cell type:markdown id: tags:
### `predict` validation period out-of-sample
%% Cell type:code id: tags:
``` python
preds_os = create_predictions(cnn, hind_2000_2019, obs_2000_2019, time=slice(time_valid_start, time_valid_end))
skill_by_year(preds_os)
```
%% Output
RPSS
year
2018 -1.099744
2019 -1.172401
%% Cell type:markdown id: tags:
### `predict` test
%% Cell type:code id: tags:
``` python
preds_test = create_predictions(cnn, fct_2020, obs_2020, time=time_test)
skill_by_year(preds_test)
```
%% Output
RPSS
year
2020 -1.076834
%% Cell type:markdown id: tags:
# Submission
%% Cell type:code id: tags:
``` python
from scripts import assert_predictions_2020
assert_predictions_2020(preds_test)
```
%% Cell type:code id: tags:
``` python
preds_test.to_netcdf('../submissions/ML_prediction_2020.nc')
```
%% Cell type:code id: tags:
``` python
# !git add ../submissions/ML_prediction_2020.nc
# !git add ML_train_and_prediction.ipynb
```
%% Cell type:code id: tags:
``` python
# !git commit -m "template_test commit message" # whatever message you want
```
%% Cell type:code id: tags:
``` python
# !git tag "submission-template_test-0.0.1" # the scorer only considers the last submitted (tagged) version
```
%% Cell type:code id: tags:
``` python
# !git push --tags
```
%% Cell type:markdown id: tags:
# Reproducibility
%% Cell type:markdown id: tags:
## memory
%% Cell type:code id: tags:
``` python
# https://phoenixnap.com/kb/linux-commands-check-memory-usage
!free -g
```
%% Output
total used free shared buff/cache available
Mem: 31 7 11 0 12 24
Swap: 0 0 0
%% Cell type:markdown id: tags:
## CPU
%% Cell type:code id: tags:
``` python
!lscpu
```
%% Output
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 40 bits physical, 48 bits virtual
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 8
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel Xeon Processor (Skylake, IBRS)
Stepping: 4
CPU MHz: 2095.078
BogoMIPS: 4190.15
Virtualization: VT-x
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 256 KiB
L1i cache: 256 KiB
L2 cache: 32 MiB
L3 cache: 128 MiB
NUMA node0 CPU(s): 0-7
Vulnerability Itlb multihit: KVM: Mitigation: Split huge pages
Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT disabled
Vulnerability Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full generic retpoline, IBPB conditional, IBRS_FW, STIBP disabled, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ibrs ibpb tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat pku ospke
%% Cell type:markdown id: tags:
## software
%% Cell type:code id: tags:
``` python
!conda list
```
%% Output
# packages in environment at /opt/conda:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
_pytorch_select 0.1 cpu_0 defaults
_tflow_select 2.3.0 mkl defaults
absl-py 0.13.0 py38h06a4308_0 defaults
aiobotocore 1.4.1 pyhd3eb1b0_0 defaults
aiohttp 3.7.4.post0 py38h7f8727e_2 defaults
aioitertools 0.7.1 pyhd3eb1b0_0 defaults
alembic 1.4.3 pyh9f0ad1d_0 conda-forge
ansiwrap 0.8.4 pypi_0 pypi
appdirs 1.4.4 pypi_0 pypi
argcomplete 1.12.3 pypi_0 pypi
argon2-cffi 20.1.0 py38h497a2fe_2 conda-forge
argparse 1.4.0 pypi_0 pypi
asciitree 0.3.3 py_2 defaults
astor 0.8.1 py38h06a4308_0 defaults
astunparse 1.6.3 py_0 defaults
async-timeout 3.0.1 pypi_0 pypi
async_generator 1.10 py_0 conda-forge
attrs 21.2.0 pypi_0 pypi
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.1 py_0 conda-forge
bagit 1.8.1 pypi_0 pypi
beautifulsoup4 4.10.0 pyh06a4308_0 defaults
binutils_impl_linux-64 2.35.1 h193b22a_1 conda-forge
binutils_linux-64 2.35 h67ddf6f_30 conda-forge
black 20.8b1 pypi_0 pypi
blas 1.0 mkl defaults
bleach 3.2.1 pyh9f0ad1d_0 conda-forge
blinker 1.4 py_1 conda-forge
bokeh 2.3.3 py38h06a4308_0 defaults
botocore 1.20.106 pyhd3eb1b0_0 defaults
bottleneck 1.3.2 py38heb32a55_1 defaults
bracex 2.1.1 pypi_0 pypi
branca 0.3.1 pypi_0 pypi
brotli 1.0.9 he6710b0_2 defaults
brotlipy 0.7.0 py38h497a2fe_1001 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.17.1 h36c2ea0_0 conda-forge
ca-certificates 2021.7.5 h06a4308_1 defaults
cachecontrol 0.12.6 pypi_0 pypi
cachetools 4.2.4 pypi_0 pypi
calamus 0.3.12 pypi_0 pypi
cdsapi 0.5.1 pypi_0 pypi
certifi 2021.5.30 pypi_0 pypi
certipy 0.1.3 py_0 conda-forge
cffi 1.14.6 pypi_0 pypi
cfgrib 0.9.9.0 pyhd8ed1ab_1 conda-forge
cftime 1.5.0 py38h6323ea4_0 defaults
chardet 3.0.4 pypi_0 pypi
click 7.1.2 pypi_0 pypi
click-completion 0.5.2 pypi_0 pypi
click-option-group 0.5.3 pypi_0 pypi
click-plugins 1.1.1 pypi_0 pypi
climetlab 0.8.31 pypi_0 pypi
climetlab-s2s-ai-challenge 0.8.0 pypi_0 pypi
cloudpickle 2.0.0 pyhd3eb1b0_0 defaults
colorama 0.4.4 pypi_0 pypi
coloredlogs 15.0.1 pypi_0 pypi
commonmark 0.9.1 pypi_0 pypi
conda 4.9.2 py38h578d9bd_0 conda-forge
conda-package-handling 1.7.2 py38h8df0ef7_0 conda-forge
configargparse 1.5.2 pypi_0 pypi
configurable-http-proxy 1.3.0 0 conda-forge
coverage 5.5 py38h27cfd23_2 defaults
cryptography 3.4.8 pypi_0 pypi
curl 7.71.1 he644dc0_8 conda-forge
cwlgen 0.4.2 pypi_0 pypi
cwltool 3.1.20211004060744 pypi_0 pypi
cycler 0.10.0 py38_0 defaults
cython 0.29.24 py38h295c915_0 defaults
cytoolz 0.11.0 py38h7b6447c_0 defaults
dask 2021.8.1 pyhd3eb1b0_0 defaults
dask-core 2021.8.1 pyhd3eb1b0_0 defaults
dataclasses 0.8 pyh6d0b6a4_7 defaults
decorator 4.4.2 py_0 conda-forge
defusedxml 0.6.0 py_0 conda-forge
distributed 2021.8.1 py38h06a4308_0 defaults
distro 1.5.0 pypi_0 pypi
docopt 0.6.2 py38h06a4308_0 defaults
eccodes 2.21.0 ha0e6eb6_0 conda-forge
ecmwf-api-client 1.6.1 pypi_0 pypi
ecmwflibs 0.3.14 pypi_0 pypi
entrypoints 0.3 pyhd8ed1ab_1003 conda-forge
environ-config 21.2.0 pypi_0 pypi
fasteners 0.16.3 pyhd3eb1b0_0 defaults
filelock 3.0.12 pypi_0 pypi
findlibs 0.0.2 pypi_0 pypi
fonttools 4.25.0 pyhd3eb1b0_0 defaults
freetype 2.10.4 h5ab3b9f_0 defaults
frozendict 2.0.6 pypi_0 pypi
fsspec 2021.7.0 pyhd3eb1b0_0 defaults
gast 0.4.0 pyhd3eb1b0_0 defaults
gcc_impl_linux-64 9.3.0 h70c0ae5_18 conda-forge
gcc_linux-64 9.3.0 hf25ea35_30 conda-forge
gitdb 4.0.7 pypi_0 pypi
gitpython 3.1.14 pypi_0 pypi
google-auth 1.33.0 pyhd3eb1b0_0 defaults
google-auth-oauthlib 0.4.4 pyhd3eb1b0_0 defaults
google-pasta 0.2.0 pyhd3eb1b0_0 defaults
grpcio 1.36.1 py38h2157cd5_1 defaults
gxx_impl_linux-64 9.3.0 hd87eabc_18 conda-forge
gxx_linux-64 9.3.0 h3fbe746_30 conda-forge
h5netcdf 0.11.0 pyhd8ed1ab_0 conda-forge
h5py 2.10.0 py38hd6299e0_1 defaults
hdf4 4.2.13 h3ca952b_2 defaults
hdf5 1.10.6 nompi_h6a2412b_1114 conda-forge
heapdict 1.0.1 pyhd3eb1b0_0 defaults
humanfriendly 10.0 pypi_0 pypi
humanize 3.7.1 pypi_0 pypi
icu 68.1 h58526e2_0 conda-forge
idna 2.10 pyh9f0ad1d_0 conda-forge
importlib-metadata 3.4.0 py38h578d9bd_0 conda-forge
importlib_metadata 3.4.0 hd8ed1ab_0 conda-forge
intake 0.6.3 pyhd3eb1b0_0 defaults
intake-xarray 0.5.0 pyhd3eb1b0_0 defaults
intel-openmp 2019.4 243 defaults
ipykernel 5.4.2 py38h81c977d_0 conda-forge
ipython 7.19.0 py38h81c977d_2 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
isodate 0.6.0 pypi_0 pypi
jasper 1.900.1 hd497a04_4 defaults
jedi 0.17.2 py38h578d9bd_1 conda-forge
jellyfish 0.8.8 pypi_0 pypi
jinja2 3.0.1 pypi_0 pypi
jmespath 0.10.0 pyhd3eb1b0_0 defaults
joblib 1.0.1 pyhd3eb1b0_0 defaults
jpeg 9d h7f8727e_0 defaults
json5 0.9.5 pyh9f0ad1d_0 conda-forge
jsonschema 3.2.0 py_2 conda-forge
jupyter-server-proxy 1.6.0 pypi_0 pypi
jupyter_client 6.1.11 pyhd8ed1ab_1 conda-forge
jupyter_core 4.7.0 py38h578d9bd_0 conda-forge
jupyter_telemetry 0.1.0 pyhd8ed1ab_1 conda-forge
jupyterhub 1.2.2 pypi_0 pypi
jupyterlab 2.2.9 py_0 conda-forge
jupyterlab-git 0.23.3 pypi_0 pypi
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge
jupyterlab_server 1.2.0 py_0 conda-forge
keras-preprocessing 1.1.2 pyhd3eb1b0_0 defaults
kernel-headers_linux-64 2.6.32 h77966d4_13 conda-forge
kiwisolver 1.3.1 py38h2531618_0 defaults
krb5 1.17.2 h926e7f8_0 conda-forge
lazy-object-proxy 1.6.0 pypi_0 pypi
lcms2 2.12 h3be6417_0 defaults
ld_impl_linux-64 2.35.1 hea4e1c9_1 conda-forge
libaec 1.0.4 he6710b0_1 defaults
libblas 3.9.0 1_h86c2bf4_netlib conda-forge
libcblas 3.9.0 5_h92ddd45_netlib conda-forge
libcurl 7.71.1 hcdd3856_8 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libffi 3.3 h58526e2_2 conda-forge
libgcc-devel_linux-64 9.3.0 h7864c58_18 conda-forge
libgcc-ng 9.3.0 h2828fa1_18 conda-forge
libgfortran-ng 9.3.0 ha5ec8a7_17 defaults
libgfortran5 9.3.0 ha5ec8a7_17 defaults
libgomp 9.3.0 h2828fa1_18 conda-forge
liblapack 3.9.0 5_h92ddd45_netlib conda-forge
libllvm10 10.0.1 hbcb73fb_5 defaults
libmklml 2019.0.5 0 defaults
libnetcdf 4.7.4 nompi_h56d31a8_107 conda-forge
libnghttp2 1.41.0 h8cfc5f6_2 conda-forge
libpng 1.6.37 hbc83047_0 defaults
libprotobuf 3.17.2 h4ff587b_1 defaults
libsodium 1.0.18 h36c2ea0_1 conda-forge
libssh2 1.9.0 hab1572f_5 conda-forge
libstdcxx-devel_linux-64 9.3.0 hb016644_18 conda-forge
libstdcxx-ng 9.3.0 h6de172a_18 conda-forge
libtiff 4.2.0 h85742a9_0 defaults
libuv 1.40.0 h7f98852_0 conda-forge
libwebp-base 1.2.0 h27cfd23_0 defaults
llvmlite 0.36.0 py38h612dafd_4 defaults
locket 0.2.1 py38h06a4308_1 defaults
lockfile 0.12.2 pypi_0 pypi
lxml 4.6.3 pypi_0 pypi
lz4-c 1.9.3 h295c915_1 defaults
magics 1.5.6 pypi_0 pypi
mako 1.1.4 pyh44b312d_0 conda-forge
markdown 3.3.4 py38h06a4308_0 defaults
markupsafe 2.0.1 pypi_0 pypi
marshmallow 3.13.0 pypi_0 pypi
matplotlib-base 3.4.2 py38hab158f2_0 defaults
mistune 0.8.4 py38h497a2fe_1003 conda-forge
mkl 2020.2 256 defaults
mkl-service 2.3.0 py38he904b0f_0 defaults
mkl_fft 1.3.0 py38h54f3939_0 defaults
mkl_random 1.1.1 py38h0573a6f_0 defaults
msgpack-python 1.0.2 py38hff7bd54_1 defaults
multidict 5.1.0 py38h27cfd23_2 defaults
munkres 1.1.4 py_0 defaults
mypy-extensions 0.4.3 pypi_0 pypi
nbclient 0.5.0 pypi_0 pypi
nbconvert 6.0.7 py38h578d9bd_3 conda-forge
nbdime 2.1.0 pypi_0 pypi
nbformat 5.1.2 pyhd8ed1ab_1 conda-forge
nbresuse 0.4.0 pypi_0 pypi
nc-time-axis 1.3.1 pyhd8ed1ab_2 conda-forge
ncurses 6.2 h58526e2_4 conda-forge
ndg-httpsclient 0.5.1 pypi_0 pypi
nest-asyncio 1.4.3 pyhd8ed1ab_0 conda-forge
netcdf4 1.5.4 pypi_0 pypi
networkx 2.6.3 pypi_0 pypi
ninja 1.10.2 hff7bd54_1 defaults
nodejs 15.3.0 h25f6087_0 conda-forge
notebook 6.2.0 py38h578d9bd_0 conda-forge
numba 0.53.1 py38ha9443f7_0 defaults
numcodecs 0.8.0 py38h2531618_0 defaults
numexpr 2.7.3 py38hb2eb853_0 defaults
numpy 1.19.2 py38h54aff64_0 defaults
numpy-base 1.19.2 py38hfa32c7d_0 defaults
oauthlib 3.0.1 py_0 conda-forge
olefile 0.46 pyhd3eb1b0_0 defaults
openjpeg 2.4.0 h3ad879b_0 defaults
openssl 1.1.1l h7f8727e_0 defaults
opt_einsum 3.3.0 pyhd3eb1b0_1 defaults
owlrl 5.2.3 pypi_0 pypi
packaging 20.8 pyhd3deb0d_0 conda-forge
pamela 1.0.0 py_0 conda-forge
pandas 1.3.2 py38h8c16a72_0 defaults
pandoc 2.11.3.2 h7f98852_0 conda-forge
pandocfilters 1.4.2 py_1 conda-forge
papermill 2.3.1 pypi_0 pypi
parso 0.7.1 pyh9f0ad1d_0 conda-forge
partd 1.2.0 pyhd3eb1b0_0 defaults
pathspec 0.9.0 pypi_0 pypi
patool 1.12 pypi_0 pypi
pdbufr 0.9.0 pypi_0 pypi
pexpect 4.8.0 pyh9f0ad1d_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 8.3.1 py38h2c7a002_0 defaults
pip 21.0.1 pypi_0 pypi
pipx 0.16.1.0 pypi_0 pypi
pluggy 0.13.1 pypi_0 pypi
portalocker 2.3.2 pypi_0 pypi
powerline-shell 0.7.0 pypi_0 pypi
prometheus_client 0.9.0 pyhd3deb0d_0 conda-forge
prompt-toolkit 3.0.10 pyha770c72_0 conda-forge
properscoring 0.1 py_0 conda-forge
protobuf 3.17.2 py38h295c915_0 defaults
prov 1.5.1 pypi_0 pypi
psutil 5.8.0 py38h27cfd23_1 defaults
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pyasn1 0.4.8 pyhd3eb1b0_0 defaults
pyasn1-modules 0.2.8 py_0 defaults
pycosat 0.6.3 py38h497a2fe_1006 conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
pycurl 7.43.0.6 py38h996a351_1 conda-forge
pydap 3.2.2 pyh9f0ad1d_1001 conda-forge
pydot 1.4.2 pypi_0 pypi
pygments 2.10.0 pypi_0 pypi
pyjwt 2.1.0 pypi_0 pypi
pyld 2.0.3 pypi_0 pypi
pyodc 1.1.1 pypi_0 pypi
pyopenssl 20.0.1 pyhd8ed1ab_0 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyrsistent 0.17.3 py38h497a2fe_2 conda-forge
pyshacl 0.17.0.post1 pypi_0 pypi
pysocks 1.7.1 py38h578d9bd_3 conda-forge
python 3.8.6 hffdb5ce_4_cpython conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python-eccodes 2021.03.0 py38hb5d20a5_1 conda-forge
python-editor 1.0.4 pypi_0 pypi
python-flatbuffers 1.12 pyhd3eb1b0_0 defaults
python-json-logger 2.0.1 pyh9f0ad1d_0 conda-forge
python-snappy 0.6.0 py38h2531618_3 defaults
python_abi 3.8 1_cp38 conda-forge
pytorch 1.8.1 cpu_py38h60491be_0 defaults
pytz 2021.1 pyhd3eb1b0_0 defaults
pyyaml 5.4.1 pypi_0 pypi
pyzmq 21.0.1 py38h3d7ac18_0 conda-forge
rdflib 6.0.1 pypi_0 pypi
rdflib-jsonld 0.5.0 pypi_0 pypi
readline 8.0 he28a2e2_2 conda-forge
regex 2021.4.4 pypi_0 pypi
renku 0.16.2 pypi_0 pypi
requests 2.24.0 pypi_0 pypi
requests-oauthlib 1.3.0 py_0 defaults
rich 10.3.0 pypi_0 pypi
rsa 4.7.2 pyhd3eb1b0_1 defaults
ruamel-yaml 0.16.5 pypi_0 pypi
ruamel.yaml.clib 0.2.2 py38h497a2fe_2 conda-forge
ruamel_yaml 0.15.80 py38h497a2fe_1003 conda-forge
s3fs 2021.7.0 pyhd3eb1b0_0 defaults
schema-salad 8.2.20210918131710 pypi_0 pypi
scikit-learn 0.24.2 py38ha9443f7_0 defaults
scipy 1.7.0 py38h7b17777_1 conda-forge
send2trash 1.5.0 py_0 conda-forge
setuptools 58.2.0 pypi_0 pypi
setuptools-scm 6.0.1 pypi_0 pypi
shellescape 3.8.1 pypi_0 pypi
shellingham 1.4.0 pypi_0 pypi
simpervisor 0.4 pypi_0 pypi
six 1.16.0 pypi_0 pypi
smmap 4.0.0 pypi_0 pypi
snappy 1.1.8 he6710b0_0 defaults
sortedcontainers 2.4.0 pyhd3eb1b0_0 defaults
soupsieve 2.2.1 pyhd3eb1b0_0 defaults
sqlalchemy 1.3.22 py38h497a2fe_1 conda-forge
sqlite 3.34.0 h74cdb3f_0 conda-forge
sysroot_linux-64 2.12 h77966d4_13 conda-forge
tabulate 0.8.9 pypi_0 pypi
tbb 2020.3 hfd86e86_0 defaults
tblib 1.7.0 pyhd3eb1b0_0 defaults
tenacity 7.0.0 pypi_0 pypi
tensorboard 2.4.0 pyhc547734_0 defaults
tensorboard-plugin-wit 1.6.0 py_0 defaults
tensorflow 2.4.1 mkl_py38hb2083e0_0 defaults
tensorflow-base 2.4.1 mkl_py38h43e0292_0 defaults
tensorflow-estimator 2.6.0 pyh7b7c402_0 defaults
termcolor 1.1.0 py38h06a4308_1 defaults
terminado 0.9.2 py38h578d9bd_0 conda-forge
testpath 0.4.4 py_0 conda-forge
textwrap3 0.9.2 pypi_0 pypi
threadpoolctl 2.2.0 pyh0d69192_0 defaults
tini 0.18.0 h14c3975_1001 conda-forge
tk 8.6.10 h21135ba_1 conda-forge
toml 0.10.2 pypi_0 pypi
toolz 0.11.1 pyhd3eb1b0_0 defaults
tornado 6.1 py38h497a2fe_1 conda-forge
tqdm 4.60.0 pypi_0 pypi
traitlets 5.0.5 py_0 conda-forge
typed-ast 1.4.2 pypi_0 pypi
typing-extensions 3.7.4.3 pypi_0 pypi
typing_extensions 3.10.0.2 pyh06a4308_0 defaults
urllib3 1.25.11 pypi_0 pypi
userpath 1.4.2 pypi_0 pypi
wcmatch 8.2 pypi_0 pypi
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webencodings 0.5.1 py_1 conda-forge
webob 1.8.7 pyhd3eb1b0_0 defaults
werkzeug 2.0.1 pyhd3eb1b0_0 defaults
wheel 0.36.2 pyhd3deb0d_0 conda-forge
wrapt 1.12.1 py38h7b6447c_1 defaults
xarray 0.19.0 pyhd3eb1b0_1 defaults
xhistogram 0.3.0 pyhd8ed1ab_0 conda-forge
xskillscore 0.0.23 pyhd8ed1ab_0 conda-forge
xz 5.2.5 h516909a_1 conda-forge
yagup 0.1.1 pypi_0 pypi
yaml 0.2.5 h516909a_0 conda-forge
yarl 1.6.3 py38h27cfd23_0 defaults
zarr 2.8.1 pyhd3eb1b0_0 defaults
zeromq 4.3.3 h58526e2_3 conda-forge
zict 2.0.0 pyhd3eb1b0_0 defaults
zipp 3.4.0 py_0 conda-forge
zlib 1.2.11 h516909a_1010 conda-forge
zstd 1.4.9 haebb681_0 defaults