Forked from COVID-19 / covid-19-public-data
4090 commits behind the upstream repository.

Covid-19 Public Data Collaboration Project

This project aggregates data from various public data sources to better understand the spread and effect of covid-19. The goal is to provide a central place where a global community struggling to make sense of the current public health emergency can share data and analysis and discuss the results.

See the dashboard for a summary of the global data.

Getting started

The simplest way to start is to create an account or log in, and then fork the project. From your fork, start an interactive environment and use the hosted JupyterLab or RStudio to explore the data. A summary of the data is given below. Please consider contributing cool results from your fork back to the project! If you don't know how, or just need help with some of the git-heavy aspects, drop us a line on Discourse or open an issue and someone will be able to help out.

The environment image allows you to work in Python or R in JupyterLab or RStudio/Shiny.

Dataset Summary

Source                                Dataset             Location                 Example
Covid-19 Data Repository at JHU CSSE  covid-19_jhu-csse   data/covid-19_jhu-csse   dashboard
covidtracking.com                     covidtracking       data/covidtracking       dashboard
OpenData Zuerich                      openzh-covid-19     data/openzh-covid-19     dashboard
Covid-19 data for Italy               covid-19-italy      data/covid-19-italy      notebook, dashboard
Covid-19 tweet IDs                    covid-19-tweet-ids  data/covid-19-tweet-ids  N/A

Covid-19 Data Repository JHU CSSE

This is a global Covid-19 dataset, updated regularly by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). The dashboard summarizes this data in combination with population data from the World Bank.

Covid tracking crowdsourcing project

Covid tracking is a crowd-sourced dataset for US state-level data. It is updated by hand by an army of volunteers.

OpenData Zuerich

Swiss cantonal data collected by the Zürich statistical office. Some parts are updated manually; others are gradually being automated.

Case data for Italy

Detailed data compiled by the Civil Protection of Italy.

Covid-19 related tweet IDs

A collection of tweet-ids related to covid-19 from https://github.com/echen102/COVID-19-TweetIDs.

General

Derived Dataset Summary

Dataset                Location             Code
Case population rates  data/covid-19_rates  notebooks/process/ToRates.ipynb
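
As a rough illustration of what a case population rate is, the sketch below normalizes an absolute case count by population. It is not taken from notebooks/process/ToRates.ipynb: the function name, the per-100,000 scale, and the numbers are assumptions made for the example.

```python
def cases_per_100k(cases, population):
    """Convert an absolute case count into cases per 100,000 inhabitants.

    The per-100,000 scale is an assumption for this example; the derived
    dataset in this project may use a different normalization.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


# Made-up numbers for illustration only; not drawn from any dataset here.
regions = {
    "region_a": (250, 1_000_000),  # (cases, population)
    "region_b": (30, 60_000),
}

rates = {name: cases_per_100k(c, p) for name, (c, p) in regions.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} cases per 100k")
```

Normalizing this way makes regions of very different size comparable: the smaller hypothetical region has fewer cases in absolute terms but a higher rate.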

Contributing

If you are interested in working on this project, we would love to get contributions. We would really like to collect more data sources and make them available here! Please provide ideas for data sources that are relevant to understanding covid-19.

If you want to add a new data source yourself, see the section Adding a new data source.

Data Sources to Add

See the data sources issue.

Adding a new data source

Adding a new data source is easy! To do so, in your fork or branch of the project, do the following:

  • Create a renku dataset using renku dataset create [dataset name]
  • Add any files or folders using renku dataset add. Looking in the commit history will provide some examples.
  • Create a notebook that shows how to read and work with the dataset in the notebooks/examples folder
    • Protip: use a unique name for the notebook to avoid merge conflicts
  • Add an issue to the project for any suggestions on things to do with the data
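
The first two steps above can be sketched as a small Python helper that assembles the corresponding renku commands. This is only a sketch: the helper names, the dataset name, and the file path are hypothetical, and the exact renku CLI arguments may differ between renku versions, so check renku dataset --help in your environment. By default it only prints the commands instead of running them.

```python
import shlex
import subprocess


def renku_dataset_commands(name, paths):
    """Build the renku invocations that create a dataset and add files to it."""
    return [
        ["renku", "dataset", "create", name],
        ["renku", "dataset", "add", name, *paths],
    ]


def add_data_source(name, paths, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    commands = renku_dataset_commands(name, paths)
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops at the first command that fails
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical dataset name and file path, for illustration only.
    add_data_source("my-new-source", ["data/my-new-source/cases.csv"])
```

Calling add_data_source(..., dry_run=False) inside your fork would run the commands for real; the example notebook and the issue from the remaining steps still need to be added by hand.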