Commit 3d681f74 authored by CI-bot, committed by renku 0.10.4

renku rerun data/covidtracking/states-metadata.json data/covidtracking/states-daily.json

parent c70672e2
1 merge request: !386 Automatic update - auto-update_2020-12-01_08-14
Pipeline #117649 failed with stages in 47 minutes and 37 seconds
class: Workflow
cwlVersion: v1.0
hints: []
inputs:
  input_1:
    default: out_folder
    streamable: false
    type: string
  input_2:
    default: data/covidtracking
    streamable: false
    type: string
  input_3:
    default:
      class: File
      path: ../../notebooks/process/download-covidtracking-data.ipynb
    streamable: false
    type: File
  input_4:
    default: runs/download-covidtracking-data.runs.ipynb
    streamable: false
    type: string
  input_5:
    default: states-daily.json
    streamable: false
    type: string
  input_6:
    default: states-metadata.json
    streamable: false
    type: string
outputs:
  output_1:
    outputSource: step_1/output_1
    streamable: false
    type: Directory
  output_3:
    outputSource: step_1/output_0
    streamable: false
    type: File
requirements: []
steps:
  step_1:
    in:
      input_1: input_1
      input_2: input_2
      input_3: input_3
      input_4: input_4
    out:
    - output_1
    - output_0
    run: a17d560c41a54f5aa307ce5f3c5effe5_papermill.cwl
  step_2:
    in:
      filename: input_5
      input_directory: step_1/output_1
    out:
    - output_file
    run:
      arguments: []
      baseCommand:
      - 'true'
      class: CommandLineTool
      cwlVersion: v1.0
      hints: []
      inputs:
        filename:
          default: states-daily.json
          streamable: false
          type: string
        input_directory:
          streamable: false
          type: Directory
      outputs:
        output_file:
          outputBinding:
            glob: $(inputs.filename)
          streamable: false
          type: File
      permanentFailCodes: []
      requirements:
      - &id001
        class: InlineJavascriptRequirement
      - &id002
        class: InitialWorkDirRequirement
        listing: $(inputs.input_directory.listing)
      successCodes: []
      temporaryFailCodes: []
  step_3:
    in:
      filename: input_6
      input_directory: step_1/output_1
    out:
    - output_file
    run:
      arguments: []
      baseCommand:
      - 'true'
      class: CommandLineTool
      cwlVersion: v1.0
      hints: []
      inputs:
        filename:
          default: states-metadata.json
          streamable: false
          type: string
        input_directory:
          streamable: false
          type: Directory
      outputs:
        output_file:
          outputBinding:
            glob: $(inputs.filename)
          streamable: false
          type: File
      permanentFailCodes: []
      requirements:
      - *id001
      - *id002
      successCodes: []
      temporaryFailCodes: []
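Steps 2 and 3 of the workflow run `true`, which does no work; their only effect is to stage the directory produced by step 1 and pick one named file out of it through the output glob. The following is a rough Python illustration of that behaviour, not how a CWL runner implements it. The function name `extract_file` and the copy-based staging are assumptions made for this sketch.

```python
import glob
import os
import shutil
import tempfile


def extract_file(input_directory: str, filename: str) -> str:
    """Stage a directory's entries into a fresh working dir, then glob one file.

    Hypothetical helper: it only mirrors what steps 2 and 3 appear to do.
    """
    workdir = tempfile.mkdtemp()

    # Rough analogue of InitialWorkDirRequirement with
    # listing: $(inputs.input_directory.listing)
    for entry in os.listdir(input_directory):
        src = os.path.join(input_directory, entry)
        dst = os.path.join(workdir, entry)
        if os.path.isdir(src):
            shutil.copytree(src, dst)
        else:
            shutil.copy2(src, dst)

    # baseCommand 'true' produces nothing, so only the output glob matters.
    matches = glob.glob(os.path.join(workdir, filename))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one match for {filename!r}, got {len(matches)}"
        )
    return matches[0]
```

Called as `extract_file("data/covidtracking", "states-daily.json")`, this would return the path of the staged copy, which corresponds to `output_file` in step 2.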
source diff could not be displayed: it is stored in LFS. Options to address this: view the blob.
source diff could not be displayed: it is stored in LFS. Options to address this: view the blob.
%% Cell type:code id: tags:
``` python
import requests
import os
import pandas as pd
```
%% Cell type:code id: tags:parameters
``` python
out_folder = "../data/covidtracking/"
PAPERMILL_OUTPUT_PATH = None
```
%% Cell type:code id: tags:injected-parameters
``` python
# Parameters
-PAPERMILL_INPUT_PATH = "/tmp/nfpct0h_/notebooks/process/download-covidtracking-data.ipynb"
+PAPERMILL_INPUT_PATH = "/tmp/c1zx9gql/notebooks/process/download-covidtracking-data.ipynb"
PAPERMILL_OUTPUT_PATH = "runs/download-covidtracking-data.runs.ipynb"
out_folder = "data/covidtracking"
```
%% Cell type:markdown id: tags:
# Download state metadata
Download a dataset of URLs for data for each US state and several territories. See [Google Doc](https://docs.google.com/spreadsheets/d/18oVRrHj3c183mHmq3m89_163yuYltLNlOmPerQ18E8w/htmlview?sle=true).
%% Cell type:code id: tags:
``` python
url = 'http://covidtracking.com/api/states/info'
r = requests.get(url, allow_redirects=True)
states_metadata_json = r.content
```
%% Cell type:code id: tags:
``` python
# save the result
if PAPERMILL_OUTPUT_PATH:
    out_path = os.path.join(out_folder, 'states-metadata.json')
    with open(out_path, 'wb') as f:
        f.write(states_metadata_json)
```
%% Cell type:code id: tags:
``` python
metadata_df = pd.read_json(states_metadata_json)
print(len(metadata_df), "states and territories have metadata")
metadata_df.head(2)
```
%% Output
56 states and territories have metadata
state notes \
0 AK Alaska combines PCR and antigen tests in the t...
1 AL Alabama combines PCR and antigen tests in the ...
covid19Site \
0 http://dhss.alaska.gov/dph/Epi/id/Pages/COVID-...
1 https://alpublichealth.maps.arcgis.com/apps/op...
covid19SiteSecondary \
0 https://experience.arcgis.com/experience/ed1c8...
1 https://alpublichealth.maps.arcgis.com/apps/op...
covid19SiteTertiary \
0 https://alaska-dhss.maps.arcgis.com/apps/opsda...
1 https://services7.arcgis.com/4RQmZZ0yaZkGR1zy/...
covid19SiteQuaternary \
0 https://services1.arcgis.com/WzFsmainVTuD5KML/...
1 None
covid19SiteQuinary twitter \
0 https://services1.arcgis.com/WzFsmainVTuD5KML/... @Alaska_DHSS
1 None @alpublichealth
covid19SiteOld \
0 http://dhss.alaska.gov/dph/Epi/id/Pages/COVID-...
1 http://www.alabamapublichealth.gov/infectiousd...
covidTrackingProjectPreferredTotalTestUnits \
0 Specimens
1 Unclear units
covidTrackingProjectPreferredTotalTestField totalTestResultsField pui \
0 totalTestsViral Total Tests (PCR) All data
1 totalTestsViral Total Tests (PCR) No data
pum name fips
0 False Alaska 2
1 False Alabama 1
%% Cell type:markdown id: tags:
# Download daily state data
%% Cell type:code id: tags:
``` python
url = 'https://covidtracking.com/api/states/daily'
r = requests.get(url, allow_redirects=True)
states_daily_json = r.content
```
%% Cell type:code id: tags:
``` python
# save the result
if PAPERMILL_OUTPUT_PATH:
    out_path = os.path.join(out_folder, 'states-daily.json')
    with open(out_path, 'wb') as f:
        f.write(states_daily_json)
```
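Both "save the result" cells in this notebook repeat the same join-open-write pattern. As a minimal sketch, not part of this commit, the pattern could be factored into one helper. The name `save_bytes` is hypothetical, and creating the output folder when it is missing is an added assumption; the notebook itself expects the folder to exist.

```python
import os


def save_bytes(payload: bytes, out_folder: str, filename: str) -> str:
    """Write raw bytes to out_folder/filename and return the path written."""
    # Assumption: create the folder if needed (the notebook does not do this).
    os.makedirs(out_folder, exist_ok=True)
    out_path = os.path.join(out_folder, filename)
    with open(out_path, "wb") as f:
        f.write(payload)
    return out_path
```

In the notebook this would be called as `save_bytes(states_daily_json, out_folder, 'states-daily.json')`, still guarded by `if PAPERMILL_OUTPUT_PATH:`.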
%% Cell type:code id: tags:
``` python
data_df = pd.read_json(states_daily_json)
print(len(data_df), "data points")
data_df.head(2)
```
%% Output
-15241 data points
+15297 data points
date state positive probableCases negative pending \
-0 20201129 AK 30816.0 NaN 975364.0 NaN
-1 20201129 AL 247229.0 41286.0 1373770.0 NaN
+0 20201130 AK 31323.0 NaN 980073.0 NaN
+1 20201130 AL 249524.0 41501.0 1376324.0 NaN
totalTestResultsSource totalTestResults hospitalizedCurrently \
-0 totalTestsViral 1006180.0 159.0
-1 totalTestsViral 1579713.0 1609.0
+0 totalTestsViral 1011396.0 162.0
+1 totalTestsViral 1584347.0 1717.0
hospitalizedCumulative ... posNeg deathIncrease hospitalizedIncrease \
-0 722.0 ... 1006180 0 1
-1 24670.0 ... 1620999 5 0
+0 725.0 ... 1011396 0 3
+1 25338.0 ... 1625848 1 668
hash commercialScore \
-0 81a1922227c01f54d1d8cc7e718af55ee8b6804b 0
-1 c6fbe324336aa11a724d65ccedc8971e5509bb4e 0
+0 271d3c383c856d19d56e16fbd4efca0f728d6f5a 0
+1 1d4953d87fd314f6eb3ff436b379ce3b6ff4fab7 0
negativeRegularScore negativeScore positiveScore score grade
0 0 0 0 0
1 0 0 0 0
[2 rows x 55 columns]