{"id":727,"date":"2020-04-18T14:38:41","date_gmt":"2020-04-18T18:38:41","guid":{"rendered":"http:\/\/www.aibistin.com\/?p=727"},"modified":"2023-03-12T16:05:08","modified_gmt":"2023-03-12T20:05:08","slug":"nyc-covid-19-infections-by-zip-code","status":"publish","type":"post","link":"https:\/\/www.aibistin.com\/?p=727","title":{"rendered":"NYC Covid-19 Infections by Zip Code, with Perl"},"content":{"rendered":"\n<p>The NYC Department of Health started publishing their Covid-19 testing results on <a href=\"https:\/\/github.com\/nychealth\" target=\"_blank\" rel=\"noopener noreferrer\">GitHub<\/a>. One of their datasets, <a href=\"https:\/\/github.com\/nychealth\/coronavirus-data#tests-by-zctacsv\" target=\"_blank\" rel=\"noopener noreferrer\">tests-by-zcta.csv<\/a>, is described in their own words as follows:<\/p>\n<blockquote>\n<p>This file includes the cumulative count of New York City residents by ZIP code of residence who:<br>Were ever tested for COVID-19 (SARS-CoV-2)<br>Tested positive<br>The cumulative counts are as of the date of extraction from the NYC Health Department&#8217;s disease surveillance database.<\/p>\n<\/blockquote>\n<figure id=\"attachment_730\" aria-describedby=\"caption-attachment-730\" style=\"width: 801px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-nychealth-coronavirus-data.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-730 size-full\" src=\"http:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-nychealth-coronavirus-data.png\" alt=\"tests-by-zcta.csv\" width=\"801\" height=\"700\" srcset=\"https:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-nychealth-coronavirus-data.png 801w, https:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-nychealth-coronavirus-data-300x262.png 300w, 
https:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-nychealth-coronavirus-data-768x671.png 768w\" sizes=\"auto, (max-width: 801px) 100vw, 801px\" \/><\/a><figcaption id=\"caption-attachment-730\" class=\"wp-caption-text\">GitHub View of &#8220;tests-by-zcta.csv&#8221;<\/figcaption><\/figure>\n<p>This file is updated almost every day. It shows the number of people tested and the number found to have Covid-19 in each New York City zip code. It also shows the cumulative percentage of those tested who have the virus.&nbsp;<\/p>\n<p>What I would like to add is more detailed information for each zip code, so that the data makes more sense to me. For each zip code, I would like to add the borough and the town or district within that borough.&nbsp; To make things a little more complicated, NYC boroughs are divided up differently. Manhattan addresses are &#8220;New York City&#8221;, while Brooklyn, the Bronx and Staten Island are each their own city for mailing address purposes. 
Queens, however, is different.&nbsp; It is broken up into towns such as Flushing, Long Island City, Woodside and Jamaica.&nbsp;<\/p>\n<p>In a previous post, <a href=\"http:\/\/www.aibistin.com\/?p=673\">Creating A Simple JSON NYC Zip Code Database File With Perl and MooX::Options<\/a>, I created a small database file that matches each zip code with its neighbourhood.<\/p>\n<p>Now I have created a new script to download the <a href=\"https:\/\/raw.githubusercontent.com\/nychealth\/coronavirus-data\/master\/tests-by-zcta.csv\" target=\"_blank\" rel=\"noopener noreferrer\">raw CSV data<\/a>&nbsp;from the NYC Department Of Health GitHub page and merge it with my zip code database.<\/p>\n<p><a href=\"https:\/\/github.com\/aibistin\/covid\" target=\"_blank\" rel=\"noopener noreferrer\">See the code on GitHub<\/a><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: perl; gutter: false; title: ; notranslate\" title=\"\">\nsub get_raw_covid_data_by_zip {\n    my $self = shift;\n    # Download the CSV, split it into lines and convert each line to a hash ref\n    my @data =\n      map { _conv_zcta_rec_to_hash($_) }\n      split( \/\\r?\\n\/, get( $self-&gt;zcta_github_link ) );\n    shift @data\n      if ( $data&#x5B;0]-&gt;{cumulative_percent_of_those_tested} =~ \/zcta_cum\/ )\n      ;    # Don't need that header\n    say &quot;Got @{&#x5B; scalar @data ]} lines of covid data. Thanks Mr. Mayor&quot;;\n    return \\@data;\n}\n<\/pre><\/div>\n\n\n<p>The above function uses the CPAN module <a href=\"https:\/\/metacpan.org\/pod\/LWP::Simple\" target=\"_blank\" rel=\"noopener noreferrer\">LWP::Simple<\/a>, whose exported &#8216;get&#8217; function downloads the data from GitHub. 
The &#8216;split&#8217; function breaks the data up into individual lines. The &#8216;map&#8217; function then passes each line to &#8216;_conv_zcta_rec_to_hash&#8217;, which converts the line into a hash and adds a date stamp to it.<\/p>\n<p>&nbsp;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: perl; gutter: false; title: ; notranslate\" title=\"\">\nsub _conv_zcta_rec_to_hash {\n    my $str = shift;\n    state $date_h = _get_date_h();\n    my %h;\n    (\n        $h{zip}, $h{positive}, $h{total_tested},\n        $h{cumulative_percent_of_those_tested}\n    ) = split \/\\s*,\\s*\/, $str;\n\n    ( $h{zip} ) = $h{zip} =~ \/(\\d+)\/;\n    $h{zip} ||= $NA_ZIP;    # There is one undef zip in test data\n    $h{yyyymmdd} = $date_h-&gt;{yyyymmdd};\n    return \\%h;\n}\n<\/pre><\/div>\n\n\n<p>Here is a sample of one line of data after conversion to a hash.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; gutter: false; title: ; notranslate\" title=\"\">\n{\n    cumulative_percent_of_those_tested =&gt; &quot;42.44&quot;,\n    positive                           =&gt; &quot;337&quot;,\n    total_tested                       =&gt; &quot;794&quot;,\n    yyyymmdd                           =&gt; &quot;20200418&quot;,\n    zip                                =&gt; &quot;10003&quot;,\n},\n<\/pre><\/div>\n\n\n<p>The newly created array of hashes is then serialized to JSON format and written to a file using <a href=\"https:\/\/metacpan.org\/pod\/File::Serialize\" target=\"_blank\" rel=\"noopener noreferrer\">File::Serialize<\/a>. 
This file serves as my database, which I can use to provide other useful information.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; gutter: false; highlight: [6]; title: ; notranslate\" title=\"\">\nsub create_latest_tests_by_ztca_file {\n    my $self       = shift;\n\n    my $covid_data = $self-&gt;get_raw_covid_data_by_zip();\n\n    serialize_file $self-&gt;tests_by_zcta_db_json_file =&gt; $covid_data;\n\n    say &quot;Created a new &quot; . $self-&gt;tests_by_zcta_db_json_file;\n    1;\n}\n<\/pre><\/div>\n\n\n<h3 class=\"wp-block-heading\">Printing the test results to a CSV file<\/h3>\n\n\n\n<p>Writing this data to a CSV file is easy enough with Perl and <a href=\"https:\/\/metacpan.org\/pod\/Text::CSV_XS\" target=\"_blank\" rel=\"noopener noreferrer\">Text::CSV_XS<\/a>.<\/p>\n<p><a href=\"http:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/covid_csv.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-746 size-full\" src=\"http:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/covid_csv.png\" alt=\"\" width=\"906\" height=\"237\" srcset=\"https:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/covid_csv.png 906w, https:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/covid_csv-300x78.png 300w, https:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/covid_csv-768x201.png 768w\" sizes=\"auto, (max-width: 906px) 100vw, 906px\" \/><\/a><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: perl; gutter: false; title: ; notranslate\" title=\"\">\nsub write_latest_zcta_to_csv {\n    my ($self) = @_;\n    my @col_headers = (\n        qw\/Zip Date City District Borough\/,\n        'Total Tested', 'Positive', '% of Tested'\n    );\n    my @col_names = (\n        qw\/zip yyyymmdd city district borough total_tested positive cumulative_percent_of_those_tested \/\n    );\n    my $csv       = Text::CSV_XS-&gt;new( { binary =&gt; 1, eol =&gt; $\/ } );\n    my $zcta_file = 
$self-&gt;get_todays_csv_file($ALL_ZCTA_DATA_CSV);\n    my $z_fh      = $zcta_file-&gt;openw;\n    $csv-&gt;print( $z_fh, \\@col_headers ) or $csv-&gt;error_diag;\n\n    for my $one_day_zip_rec (\n        sort { $b-&gt;{positive} &lt;=&gt; $a-&gt;{positive} || $a-&gt;{zip} &lt;=&gt; $b-&gt;{zip} }\n        @{ $self-&gt;tests_by_zcta_today } )\n    {\n        my $location_rec =\n          $self-&gt;zip_db-&gt;zip_db_hash-&gt;{ $one_day_zip_rec-&gt;{zip} }\n          || _get_filler_location_rec( $one_day_zip_rec-&gt;{zip} );\n        $self-&gt;zip_db-&gt;zip_db_hash-&gt;{ $one_day_zip_rec-&gt;{zip} } ||=\n          $location_rec;\n        my %csv_rec = ( %$one_day_zip_rec, %$location_rec );\n        $csv-&gt;print( $z_fh, &#x5B; @csv_rec{@col_names} ] );\n    }\n    close($z_fh) or warn &quot;Failed to close $zcta_file&quot;;\n    say &quot;Created a new $zcta_file&quot;;\n}\n<\/pre><\/div>\n\n\n<p><code>my $zcta_file = $self-&gt;get_todays_csv_file($ALL_ZCTA_DATA_CSV);<\/code><\/p>\n<p>This method call returns a CSV file path stamped with the current day&#8217;s date.<\/p>\n<p><code>for my $one_day_zip_rec (<\/code><br><code>sort { $b-&gt;{positive} &lt;=&gt; $a-&gt;{positive} || $a-&gt;{zip} &lt;=&gt; $b-&gt;{zip} }<\/code><br><code>@{ $self-&gt;tests_by_zcta_today } )<\/code><br><code>{<\/code><code>...<\/code><\/p>\n<p>The current day&#8217;s test results are sorted by the number of positive results, highest first, with ties broken by zip code in ascending order. Each record is then combined with the location data for its zip code, and printed. 
<\/p>\n<p><code> my %csv_rec = ( %$one_day_zip_rec, %$location_rec );<\/code><br><code>$csv-&gt;print( $z_fh, [ @csv_rec{@col_names} ] );<\/code><\/p>\n<p>Below is a sample CSV file for April 17 2020.<\/p>\n\n\n\n<div class=\"wp-block-file\"><a id=\"wp-block-file--media-e705b707-af98-45e7-942b-9a93f81f4a37\" href=\"http:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/20200417_all_zcta_data.csv\">20200417_all_zcta_data<\/a><a href=\"http:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/20200417_all_zcta_data.csv\" class=\"wp-block-file__button wp-element-button\" download aria-describedby=\"wp-block-file--media-e705b707-af98-45e7-942b-9a93f81f4a37\">Download<\/a><\/div>\n\n\n\n<p>Next we can create nice Plotly charts to display the test results.<\/p>\n\n\n\n<!--nextpage-->\n\n\n\n<h3 class=\"wp-block-heading\">Display The Data With Plotly<\/h3>\n\n\n\n<p>We can use the command line interface to the Perl script to create nice HTML charts to display the results.&nbsp;<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; gutter: false; title: ; notranslate\" title=\"\">\nperl bin\\city_covid_data.pl -h\nUSAGE: city_covid_data.pl &#x5B;-h] &#x5B;long options ...]\n\n    -n --create_new_zcta_db     Create a new NYC Zip Cumulative Test 'A'\n                                JSON db for todays result\n    --show_zip_stats=&#x5B;Strings]  Get the available statistics of a given zip\n                                code or codes\n    -v --verbose                Print details\n    -c --write_zcta_to_csv      Print latest ZCTA data to csv, 'output\/\n                                all_zcta_data.csv'\n\n    --usage                     show a short help message\n    -h                          show a compact help message\n    --help                      show a long help message\n    --man                       show the manual\n<\/pre><\/div>\n\n\n<p><code> perl bin\\city_covid_data.pl --show_zip_stats 
11368,10467,11373,11219<\/code><\/p>\n<p>The example here displays the test results for four New York City zip codes that happen to have the highest numbers of positive test results.&nbsp; To help display the results in a graphical way, I found two great Perl modules, <a href=\"https:\/\/metacpan.org\/pod\/Chart::Plotly\" target=\"_blank\" rel=\"noopener noreferrer\">Chart::Plotly<\/a>&nbsp;and <a href=\"https:\/\/metacpan.org\/pod\/Chart::Plotly::Trace::Bar\" target=\"_blank\" rel=\"noopener noreferrer\">Chart::Plotly::Trace::Bar<\/a>, which provide a Perl interface to the <a href=\"https:\/\/plotly.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">Plotly<\/a>&nbsp;JavaScript library.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: perl; gutter: false; title: ; notranslate\" title=\"\">\nsub show_stats_for_zips {\n    my ( $self, $zip_codes ) = @_;\n\n    my @chart_names = ref($zip_codes) eq 'ARRAY' ? @{$zip_codes} : ($zip_codes);\n    my $date_conv_func   = $self-&gt;date_to_str_func();\n    my $stats_cache_func = _get_zip_chart_stats_cache_func();\n\n    # One bar trace per zip code: dates on the x axis, positive counts on the y axis\n    my @charts;\n    for my $zip_code (@chart_names) {\n        my $zip_code_stats = $stats_cache_func-&gt;( $self, $zip_code );\n        my $chart = Chart::Plotly::Trace::Bar-&gt;new(\n            x =&gt; &#x5B;\n                map { $date_conv_func-&gt;($_) }\n                  @{ $zip_code_stats-&gt;{dates} || &#x5B;] }\n            ],\n            y =&gt; &#x5B; @{ $zip_code_stats-&gt;{positive} || &#x5B;] } ],\n            name =&gt; $self-&gt;city_district($zip_code),\n            text =&gt; $zip_code,\n        );\n        push @charts, $chart;\n    }\n\n    # Show all the traces side by side in a single grouped bar chart\n    my $bar_chart = Chart::Plotly::Plot-&gt;new(\n        traces =&gt; &#x5B;@charts],\n        layout =&gt; { barmode =&gt; 'group' }\n    );\n\n    Chart::Plotly::show_plot($bar_chart);\n}\n<\/pre><\/div>\n\n\n<p>You can download the HTML chart below.<\/p>\n\n\n\n<div class=\"wp-block-file\"><a 
id=\"wp-block-file--media-87d9f4ed-6071-475f-8e94-ce50e77b99e4\" href=\"http:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/razHLMp7kO.html\">razHLMp7kO<\/a><a href=\"http:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/razHLMp7kO.html\" class=\"wp-block-file__button wp-element-button\" download aria-describedby=\"wp-block-file--media-87d9f4ed-6071-475f-8e94-ce50e77b99e4\">Download<\/a><\/div>\n\n\n\n\n\n<p>The results are displayed in a nice HTML file with the Plotly chart.<\/p>\n<p><a href=\"http:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-Screenshot.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-762 size-full\" src=\"http:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-Screenshot.png\" alt=\"\" width=\"1904\" height=\"450\" srcset=\"https:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-Screenshot.png 1904w, https:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-Screenshot-300x71.png 300w, https:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-Screenshot-1024x242.png 1024w, https:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-Screenshot-768x182.png 768w, https:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-Screenshot-1536x363.png 1536w, https:\/\/www.aibistin.com\/wp-content\/uploads\/2020\/04\/Screenshot_2020-04-18-Screenshot-1200x284.png 1200w\" sizes=\"auto, (max-width: 1904px) 100vw, 1904px\" \/><\/a><\/p>\n<p>The code: <a href=\"https:\/\/github.com\/aibistin\/covid\">https:\/\/github.com\/aibistin\/covid<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The NYC Department of Health started publishing their Covid-19 test testing results on GitHub . One of their datasets tests-by-zctascv is, in their own words. 
This file includes the cumulative count of New York City residents by ZIP code of residence who:Were ever tested for COVID-19 (SARS-CoV-2)Tested positive The cumulative counts are as of the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[105,41,103,101,99],"tags":[54,59,47,61,57,95,58,20,55,60,56,50],"class_list":["post-727","post","type-post","status-publish","format-standard","hentry","category-csv","category-mooxoptions-perl","category-new-york-city","category-perl","category-programming","tag-chartplotly","tag-covid-19","tag-fileserialize","tag-lwpsimple","tag-moo-mooxoptions","tag-new-york-city","tag-nyc","tag-perl","tag-plotly","tag-queens","tag-textcsv_xs","tag-zipcodes"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.aibistin.com\/index.php?rest_route=\/wp\/v2\/posts\/727","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.aibistin.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aibistin.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aibistin.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aibistin.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=727"}],"version-history":[{"count":25,"href":"https:\/\/www.aibistin.com\/index.php?rest_route=\/wp\/v2\/posts\/727\/revisions"}],"predecessor-version":[{"id":902,"href":"https:\/\/www.aibistin.com\/index.php?rest_route=\/wp\/v2\/posts\/727\/revisions\/902"}],"wp:attachment":[{"href":"https:\/\/www.aibistin.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=727"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aibistin.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=727
"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aibistin.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=727"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}