Commit 71e50792 authored by Chandrasekhar Ramakrishnan, committed by renku 0.9.1

renku update

parent 6021fe80
class: Workflow
cwlVersion: v1.0
hints: []
inputs:
  input_1:
    default: ts_folder
    streamable: false
    type: string
  input_10:
    default:
      class: Directory
      listing: []
      path: ../../data/covid-19_jhu-csse
    streamable: false
    type: Directory
  input_11:
    default: worldmap_path
    streamable: false
    type: string
  input_12:
    default:
      class: File
      path: ../../data/worldmap/country_centroids.csv
    streamable: false
    type: File
  input_13:
    default: out_folder
    streamable: false
    type: string
  input_14:
    default:
      class: Directory
      listing: []
      path: ../../data/geodata
    streamable: false
    type: Directory
  input_15:
    default:
      class: File
      path: ../../notebooks/process/CompileGeoData.ipynb
    streamable: false
    type: File
  input_16:
    default: runs/CompileGeoData.run.ipynb
    streamable: false
    type: string
  input_2:
    default:
      class: Directory
      listing: []
      path: ../../data/covid-19_jhu-csse
    streamable: false
    type: Directory
  input_3:
    default: rates_folder
    streamable: false
    type: string
  input_4:
    default:
      class: Directory
      listing: []
      path: ../../data/covid-19_rates
    streamable: false
    type: Directory
  input_5:
    default: geodata_path
    streamable: false
    type: string
  input_6:
    default:
      class: File
      path: ../../data/geodata/geo_data.csv
    streamable: false
    type: File
  input_7:
    default:
      class: File
      path: ../../notebooks/Dashboard.ipynb
    streamable: false
    type: File
  input_8:
    default: runs/Dashboard.run.ipynb
    streamable: false
    type: string
  input_9:
    default: ts_folder
    streamable: false
    type: string
outputs:
  output_0:
    outputSource: step_1/output_0
    streamable: false
    type: File
  output_1:
    outputSource: step_2/output_0
    streamable: false
    type: File
requirements: []
steps:
  step_1:
    in:
      input_1: input_1
      input_2: input_2
      input_3: input_3
      input_4: input_4
      input_5: input_5
      input_6: input_6
      input_7: input_7
      input_8: input_8
    out:
    - output_0
    run: 5ae9a9961e194e7795df04a9722452e8_papermill.cwl
  step_2:
    in:
      input_1: input_9
      input_2: input_10
      input_3: input_11
      input_4: input_12
      input_5: input_13
      input_6: input_14
      input_7: input_15
      input_8: input_16
    out:
    - output_0
    run: a8b2f47629164158a118963ae58eea3b_papermill.cwl
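The wiring in the workflow above can be traced by hand: each step input names a workflow-level input, whose default is either a plain string or a File/Directory object. The sketch below copies only the `step_2` portion into a plain Python dict (rather than parsing the file) and resolves each step input to its default. The helper `resolve_step_inputs` is hypothetical and is not part of renku or any CWL tool.

``` python
# Hand-copied subset of the workflow above (step_2 and the inputs it uses).
workflow = {
    "inputs": {
        "input_9": {"default": "ts_folder"},
        "input_10": {"default": {"class": "Directory",
                                 "path": "../../data/covid-19_jhu-csse"}},
        "input_11": {"default": "worldmap_path"},
        "input_12": {"default": {"class": "File",
                                 "path": "../../data/worldmap/country_centroids.csv"}},
        "input_13": {"default": "out_folder"},
        "input_14": {"default": {"class": "Directory",
                                 "path": "../../data/geodata"}},
        "input_15": {"default": {"class": "File",
                                 "path": "../../notebooks/process/CompileGeoData.ipynb"}},
        "input_16": {"default": "runs/CompileGeoData.run.ipynb"},
    },
    "steps": {
        "step_2": {
            "in": {
                "input_1": "input_9", "input_2": "input_10",
                "input_3": "input_11", "input_4": "input_12",
                "input_5": "input_13", "input_6": "input_14",
                "input_7": "input_15", "input_8": "input_16",
            },
        },
    },
}


def resolve_step_inputs(workflow, step_name):
    """Hypothetical helper: map each step input to the default value of the
    workflow input it is wired to (File/Directory defaults reduce to `path`)."""
    resolved = {}
    for step_input, source in workflow["steps"][step_name]["in"].items():
        default = workflow["inputs"][source]["default"]
        if isinstance(default, dict):
            default = default["path"]
        resolved[step_input] = default
    return resolved


step_2_inputs = resolve_step_inputs(workflow, "step_2")
for name, value in step_2_inputs.items():
    print(f"{name}: {value}")
```

Read this way, the string inputs and the File/Directory inputs alternate, which suggests (as an interpretation, not something the diff states) that each pair is a notebook parameter name followed by its value.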
%% Cell type:markdown id: tags:
# Extract the Geographic Info
Use the Harvard [country_centroids.csv](https://worldmap.harvard.edu/data/geonode:country_centroids_az8) data to extract the geographic info we need for the visualizations.
%% Cell type:code id: tags:
``` python
import pandas as pd
import os
```
%% Cell type:code id: tags:
``` python
ts_folder = "../data/covid-19_jhu-csse/"
worldmap_path = "../data/worldmap/country_centroids.csv"
out_folder = None
PAPERMILL_OUTPUT_PATH = None
```
%% Cell type:markdown id: tags:parameters
## Read in JHU CSSE data
%% Cell type:code id: tags:injected-parameters
``` python
# Parameters
PAPERMILL_INPUT_PATH = "notebooks/process/CompileGeoData.ipynb"
PAPERMILL_INPUT_PATH = "/tmp/ps76102a/notebooks/process/CompileGeoData.ipynb"
PAPERMILL_OUTPUT_PATH = "runs/CompileGeoData.run.ipynb"
ts_folder = "./data/covid-19_jhu-csse/"
worldmap_path = "./data/worldmap/country_centroids.csv"
out_folder = "./data/geodata/"
ts_folder = "/tmp/ps76102a/data/covid-19_jhu-csse"
worldmap_path = "/tmp/ps76102a/data/worldmap/country_centroids.csv"
out_folder = "/tmp/ps76102a/data/geodata"
```
%% Cell type:code id: tags:
``` python
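def read_jhu_covid_region_df(name):
    """Read one JHU CSSE time series and sum it per country/region."""
    filename = os.path.join(ts_folder, f"time_series_19-covid-{name}.csv")
    df = pd.read_csv(filename)
    # Move the identifying columns into the index so only date columns remain.
    df = df.set_index(['Country/Region', 'Province/State', 'Lat', 'Long'])
    df.columns = pd.to_datetime(df.columns)
    # Collapse provinces/states into one row per country/region.
    region_df = df.groupby(level='Country/Region').sum()
    return region_df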
```
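As a rough illustration of what the function above does to the table, the sketch below applies the same steps to a small in-memory CSV. The numbers are made up and the date headers are written in ISO form for simplicity; the real JHU CSSE files may format them differently.

``` python
import io

import pandas as pd

# Made-up data in a wide layout: one row per province, one column per date.
csv_text = """Province/State,Country/Region,Lat,Long,2020-03-16,2020-03-17
Hubei,China,30.97,112.27,10,12
Guangdong,China,23.34,113.42,3,4
Lombardy,Italy,45.47,9.19,5,7
"""

df = pd.read_csv(io.StringIO(csv_text))
df = df.set_index(['Country/Region', 'Province/State', 'Lat', 'Long'])
df.columns = pd.to_datetime(df.columns)

# One row per country/region, summed over its provinces.
region_df = df.groupby(level='Country/Region').sum()
print(region_df)
```

After the `groupby`, the two Chinese provinces collapse into a single `China` row, and the result is indexed by `Country/Region` only.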
%% Cell type:code id: tags:
``` python
confirmed_df = read_jhu_covid_region_df("Confirmed")
```
%% Cell type:markdown id: tags:
## Read in Harvard country centroids
%% Cell type:code id: tags:
``` python
country_centroids_df = pd.read_csv(worldmap_path)
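# .copy() makes the column selection independent, so adding 'name_jhu' below
# does not trigger a SettingWithCopyWarning.
country_centroids_df = country_centroids_df[['name', 'name_long', 'region_un', 'subregion', 'region_wb', 'pop_est', 'gdp_md_est', 'income_grp', 'Longitude', 'Latitude']].copy()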
country_centroids_df['name_jhu'] = country_centroids_df['name_long']
```
%% Cell type:code id: tags:
``` python
country_centroids_df.columns
```
%% Output
Index(['name', 'name_long', 'region_un', 'subregion', 'region_wb', 'pop_est',
'gdp_md_est', 'income_grp', 'Longitude', 'Latitude', 'name_jhu'],
dtype='object')
%% Cell type:markdown id: tags:
Fix names that differ between JHU CSSE and Harvard data
%% Cell type:code id: tags:
``` python
region_hd_jhu_map = {
    'Brunei Darussalam': 'Brunei',
    "Côte d'Ivoire": "Cote d'Ivoire",
    'Czech Republic': 'Czechia',
    'Hong Kong': 'Hong Kong SAR',
    'Republic of Korea': 'Korea, South',
    'Macao': 'Macao SAR',
    'Russian Federation': 'Russia',
    'Taiwan': 'Taiwan*',
    'United States': 'US'
}
country_centroids_df['name_jhu'] = country_centroids_df['name_jhu'].replace(region_hd_jhu_map)
```
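For reference, `Series.replace` with a dict swaps whole values only, so names missing from the map pass through unchanged. A minimal sketch with a toy series (the three names are picked for illustration):

``` python
import pandas as pd

names = pd.Series(['Czech Republic', 'United States', 'France'])
mapping = {'Czech Republic': 'Czechia', 'United States': 'US'}

# Exact-value replacement; 'France' is not in the map and stays as-is.
renamed = names.replace(mapping)
print(renamed.tolist())
```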
%% Cell type:code id: tags:
``` python
# Use this to find the name in the series
# country_centroids_df[country_centroids_df['name'].str.contains('Macao')]
```
%% Cell type:markdown id: tags:
Some JHU CSSE regions have no matching name in the Harvard data; we simply ignore these.
%% Cell type:code id: tags:
``` python
confirmed_df.loc[
    ~confirmed_df.index.isin(country_centroids_df['name_jhu'])
].iloc[:, -2:]
```
%% Output
2020-03-16 2020-03-17
Country/Region
Congo (Brazzaville) 1 1
Congo (Kinshasa) 2 3
Cruise Ship 696 696
Eswatini 1 1
Holy See 1 1
Martinique 15 16
North Macedonia 18 26
Republic of the Congo 1 1
The Bahamas 1 1
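The check above boils down to a negated `isin` mask on the index. A minimal sketch on toy data (the names below are chosen for illustration and are not the full data sets):

``` python
import pandas as pd

jhu_regions = pd.Index(['US', 'Czechia', 'Cruise Ship', 'Holy See'],
                       name='Country/Region')
centroid_names = pd.Series(['US', 'Czechia', 'France'])

# Keep only the JHU regions that have no counterpart among the centroid names.
unmatched = jhu_regions[~jhu_regions.isin(centroid_names)]
print(unmatched.tolist())
```

Any name listed this way is a candidate for a new entry in `region_hd_jhu_map`, provided a corresponding row exists in the Harvard data.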
%% Cell type:markdown id: tags:
## Save the result
%% Cell type:code id: tags:
``` python
if PAPERMILL_OUTPUT_PATH:
    # Only write the CSV when run through papermill (out_folder is None otherwise).
    out_path = os.path.join(out_folder, "geo_data.csv")
    country_centroids_df.to_csv(out_path)
```
......
source diff could not be displayed: it is too large. Options to address this: view the blob.