# Covid-19 Public Data Collaboration Project
This project aggregates data from various public data sources to better
understand the spread and effect of covid-19. The goal is to provide a central
place where data, analysis, and discussion can be conducted and shared by a
global community struggling to make sense of the current public health
emergency.

See the [dashboard](covid-19-public-data/files/blob/runs/Dashboard.run.ipynb)
for a summary of the global data.
## Getting started
The simplest way to start is to create an account or log in, and then fork the
project. Then, feel free to [start an interactive
environment](https://renkulab.io/projects/covid-19/covid-19-public-data/environments/new)
and use the hosted JupyterLab or RStudio to explore the data. A summary of the
data is given below. Please consider contributing cool results from your fork
back to the project! If you don't know how, or just need help with some of the
git-heavy aspects of this, drop us a line [on
Discourse](https://renku.discourse.group) or [open an
issue](https://renkulab.io/projects/covid-19/covid-19-public-data/collaboration/issues)
and someone will be able to help out.

The environment image allows you to work in Python or R in JupyterLab or RStudio/Shiny.
## Dataset Summary
<table class="table">
<thead>
<tr>
<th>Source</th>
<th>Dataset</th>
<th>Location</th>
<th>Example</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://github.com/CSSEGISandData/COVID-19">Covid-19 Data Repository at JHU CSSE</a></td>
<td><a href="https://renkulab.io/projects/covid-19/covid-19-public-data/datasets/f6726a5b-f973-45d5-b873-30fa0dff772f/">covid-19_jhu-csse</a></td>
<td><code>data/covid-19_jhu-csse</code></td>
<td><a href="https://renkulab.io/projects/covid-19/covid-19-public-data/files/blob/runs/Dashboard.run.ipynb">dashboard</a></td>
</tr>
<tr>
<td><a href="https://covidtracking.com/">covidtracking.com</a></td>
<td><a href="https://renkulab.io/projects/covid-19/covid-19-public-data/datasets/c8bec148-5332-4602-9dc3-e39bbe92ed67/">covidtracking</a></td>
<td><code>data/covidtracking</code></td>
<td><a href="https://renkulab.io/projects/covid-19/covid-19-public-data/files/blob/notebooks/covidtracking-dashboard.ipynb">dashboard</a></td>
</tr>
<tr>
<td><a href="https://github.com/openZH/covid_19">OpenData Zuerich</a></td>
<td><a href="https://renkulab.io/projects/covid-19/covid-19-public-data/datasets/c9295d7a-0380-4a1b-8731-5c36d76cb8e7/">openzh-covid-19</a></td>
<td><code>data/openzh-covid-19</code></td>
<td><a href="https://renkulab.io/projects/covid-19/covid-19-public-data/files/blob/notebooks/openzh-covid-19-dashboard.ipynb">dashboard</a></td>
</tr>
<tr>
<td><a href="https://github.com/pcm-dpc/COVID-19">Covid-19 data for Italy</a></td>
<td><a href="https://renkulab.io/projects/covid-19/covid-19-public-data/datasets/286c58b1-dbbc-4caa-a23a-fcb001d5ac51/">covid-19-italy</a></td>
<td><code>data/covid-19-italy</code></td>
<td><a href="https://renkulab.io/projects/covid-19/covid-19-public-data/files/blob/notebooks/examples/italy-examples/italy-notebook-example.ipynb">notebook</a>, 
    <a href="https://renkulab.io/projects/covid-19/covid-19-public-data/files/blob/notebooks/examples/italy-examples/italy-dashboard-example.ipynb">dashboard</a></td>
</tr>
<tr>
<td><a href="https://github.com/echen102/COVID-19-TweetIDs">Covid-19 tweet IDs</a></td>
<td><a href="https://renkulab.io/projects/covid-19/covid-19-public-data/datasets/0fc08252-cb39-4b59-bc82-9b213ec0bec6/">covid-19-tweet-ids</a></td>
<td><code>data/covid-19-tweet-ids</code></td>
<td>N/A</td>
</tr>
</tbody>
</table>

### Covid-19 Data Repository JHU CSSE

This is a global Covid-19 dataset, updated regularly, from [Johns Hopkins
University Center for Systems Science and Engineering (JHU
CSSE)](https://github.com/CSSEGISandData/COVID-19). The
[dashboard](covid-19-public-data/files/blob/runs/Dashboard.run.ipynb) summarizes
this data in combination with population data from the World Bank.

### Covid tracking crowdsourcing project

[Covid tracking](https://covidtracking.com) is a crowd-sourced dataset for US state-level data. It is updated by hand by an army of volunteers. 

### OpenData Zuerich

The [Swiss cantonal data](https://github.com/openZH/covid_19) is collected by the Zürich statistical office. Parts are updated manually; others are starting to be automated.

### Case data for Italy

Detailed data compiled by the [Civil Protection of Italy](https://github.com/pcm-dpc/COVID-19).

### Covid-19 related tweet IDs

A collection of tweet IDs related to covid-19 from https://github.com/echen102/COVID-19-TweetIDs.

### General
- [World Bank total population (SP.POP.TOTL)](https://data.worldbank.org/indicator/SP.POP.TOTL)
- [Country centroids (Harvard WorldMap)](https://worldmap.harvard.edu/data/geonode:country_centroids_az8)

## Derived Dataset Summary

<table class="table">
<thead>
<tr>
<th>Dataset</th>
<th>Location</th>
<th>Code</th>
</tr>
</thead>
<tbody>
<tr>
<td>Case population rates</td>
<td><code>data/covid-19_rates</code></td>
<td><a href="https://renkulab.io/projects/covid-19/covid-19-public-data/files/blob/notebooks/process/ToRates.ipynb">notebooks/process/ToRates.ipynb</a></td>
</tr>
</tbody>
</table>
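
As a rough illustration of what a "case population rate" is, the sketch below scales case counts to cases per 100,000 inhabitants. It is not taken from `notebooks/process/ToRates.ipynb`; the function name, the per-100,000 scaling, and the sample numbers are all assumptions made for this example.

```python
def cases_per_100k(cases_by_country, population_by_country):
    """Return cases per 100,000 inhabitants for each country.

    Hypothetical helper: countries without a positive population
    figure are skipped rather than guessed at.
    """
    rates = {}
    for country, cases in cases_by_country.items():
        population = population_by_country.get(country)
        if not population or population <= 0:
            continue
        # Multiply first so round numbers stay exact in floating point.
        rates[country] = cases * 100_000 / population
    return rates


# Made-up numbers, for illustration only (not real data).
example_cases = {"A": 500, "B": 1200, "C": 30}
example_population = {"A": 1_000_000, "B": 6_000_000}
example_rates = cases_per_100k(example_cases, example_population)
```

Country `C` is dropped here because it has no population entry, which is one possible way to handle names that do not match between the case data and the population data.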

## Contributing

If you are interested in working on this project, we would love to get
contributions. We would especially like to collect more data sources and make
them available here, so please suggest data sources that are relevant to
understanding covid-19.

If you want to add a new data source yourself, see the section [Adding a new data
source](#adding-a-new-data-source).

## Data Sources to Add
See the [data sources issue](https://renkulab.io/projects/covid-19/covid-19-public-data/collaboration/issues/1/).

## Adding a new data source

Adding a new data source is easy! To do so, in your fork or branch of the project, do the following:

* Create a renku dataset using `renku dataset create [dataset name]`
* Add any files or folders using `renku dataset add [dataset name] [file, folder or URL]`. [The commit history provides some examples](https://renkulab.io/gitlab/covid-19/covid-19-public-data/commits/master).
* Create a notebook that shows how to read and work with the dataset in the `notebooks/examples` folder
    * Protip: use a unique name for the notebook to avoid merge conflicts
* Add an issue to the project for any suggestions on things to do with the data
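
For the example notebook, a first cell might look something like the sketch below. It assumes the new dataset contains CSV files somewhere under `data/<dataset name>`; the helper names are hypothetical and not part of this project. It uses only the Python standard library so that it runs without extra packages, though `pandas` would be the more usual choice in a notebook.

```python
import csv
from itertools import islice
from pathlib import Path


def list_csv_files(dataset_dir):
    """Return all CSV files below dataset_dir, sorted by path."""
    return sorted(Path(dataset_dir).rglob("*.csv"))


def read_rows(csv_path, limit=5):
    """Read up to `limit` rows of a CSV file as dicts keyed by column header."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(islice(csv.DictReader(handle), limit))
```

In a notebook you could then call `list_csv_files("data/my-new-dataset")` (a placeholder path) and pass one of the returned paths to `read_rows` to preview the first few records.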