Display COVID-19 numbers relative to population

Dec 13, 2020 · 14 min read · python covid-19 plotly ·

The total number of cases of COVID-19 and deaths from COVID-19 in a country will vary based on a number of factors. It does not seem right to compare absolute numbers in different countries, where the populations of those countries varies greatly. This article will show how to take the absolute numbers and express the confirmed cases and deaths per 100,000 of the population and thus allow comparison between countries.

Data sources:
The data used in this article is retrieved from the Johns Hopkins University who have made the data available on GitHub. More information about COVID-19 and the coronavirus is available from Coronavirus disease (COVID-19) advice for the public.
Population data is obtained from the World Bank data

Load COVID-19 data into dataframes

There are two sets of data to load the global confirmed cases and the global deaths.

time_series_covid19_deaths_global.csv
time_series_covid19_confirmed_global.csv

The data can be loaded directly from the GitHub page as in the code below. The data is cleaned to remove unwanted fields and group the data by country with the following function.

 1def load_clean_covid_data(csv_path):
 2    df = pd.read_csv(csv_path)
 3
 4    # 1. Drop unwanted columns
 5    df.drop(['Province/State', 'Lat', 'Long'], axis = 1, inplace = True)
 6
 7    # 2. Group by Country
 8    df = df.groupby('Country/Region').sum()
 9
10    # 3. Convert column headings to datetime
11    df.columns = pd.to_datetime(df.columns)
12
13    # 4. Remove the column name
14    df.rename_axis(None, axis = 0, inplace = True)
15
16    return df

Load the data for Confirmed cases and the data for Deaths from COVID-19 into dataframes.

 1
 2confirmed_df = load_clean_data('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')
 3confirmed_df.shape
 4"""
 5(191, 326)
 6"""
 7
 8confirmed_df.iloc[[0,1,2,-3,-2,-1], [0,1,2,-3,-2,-1]]
 9"""
10             2020-01-22  2020-01-23  2020-01-24  2020-12-10  2020-12-11  2020-12-12
11Afghanistan           0           0           0       48053       48116       48229
12Albania               0           0           0       46061       46863       47742
13Algeria               0           0           0       90579       91121       91638
14Yemen                 0           0           0        2081        2082        2083
15Zambia                0           0           0       18091       18161       18217
16Zimbabwe              0           0           0       11081       11162       11219
17"""
18
19
20deaths_df = load_clean_data('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv')
21deaths_df.shape
22"""
23(191, 326)
24"""
25
26deaths_df.iloc[[0,1,2,-3,-2,-1], [0,1,2,-3,-2,-1]]
27"""
28             2020-01-22  2020-01-23  2020-01-24  2020-12-10  2020-12-11  2020-12-12  
29Afghanistan           0           0           0        1935        1945        1956  
30Albania               0           0           0         965         977         989  
31Algeria               0           0           0        2564        2575        2584  
32Yemen                 0           0           0         606         606         606  
33Zambia                0           0           0         364         365         366  
34Zimbabwe              0           0           0         305         306         307  
35"""

Load global population data into dataframe

World population data is available from the World Bank data. This is available to download in csv, xml or Excel format. The Excel file contains three worksheets with the population data in the first sheet Data. This is loaded into a dataframe.

1pop_xl = pd.ExcelFile("/tmp/API_SP.POP.TOTL_DS2_en_excel_v2_1740169.xls", engine = "xlrd")
2pop_xl.sheet_names
3"""
4['Data', 'Metadata - Countries', 'Metadata - Indicators']
5"""

 1# Load the excel worksheet into a dataframe
 2pop_df = pd.read_excel(
 3    "/tmp/API_SP.POP.TOTL_DS2_en_excel_v2_1740169.xls",
 4    engine = "xlrd",
 5    sheet_name = 'Data',
 6    header = 3)
 7
 8pop_df.shape
 9"""
10(264, 65)
11"""
12
13pop_df.iloc[[0,1,2,-3,-2,-1], [0,1,2,-3,-2,-1]]
14"""
15     Country Name Country Code     Indicator Name        2018        2019  2020  
160           Aruba          ABW  Population, total    105845.0    106314.0   NaN  
171     Afghanistan          AFG  Population, total  37172386.0  38041754.0   NaN  
182          Angola          AGO  Population, total  30809762.0  31825295.0   NaN  
19261  South Africa          ZAF  Population, total  57779622.0  58558270.0   NaN  
20262        Zambia          ZMB  Population, total  17351822.0  17861030.0   NaN  
21263      Zimbabwe          ZWE  Population, total  14439018.0  14645468.0   NaN  
22"""

Extract population data for 2019 into a dataframe.

 1# Get population for countries for 2019
 2pop_2019_df = pop_df[['Country Name', '2019']].copy()
 3
 4pop_2019_df.shape
 5"""
 6(264, 2)
 7"""
 8
 9pop_2019_df.iloc[[0,1,2,3,4,-5,-4-3,-2,-1], :]
10"""
11    Country Name          2019
120          Aruba  1.063140e+05
131    Afghanistan  3.804175e+07
142         Angola  3.182530e+07
153        Albania  2.854191e+06
164        Andorra  7.714200e+04
17259       Kosovo  1.794248e+06
18257        World  7.673534e+09
19262       Zambia  1.786103e+07
20263     Zimbabwe  1.464547e+07
21"""

Load population data for a single year

After exploring the population dataset, the code above can be simplified to just load data for a single year. The population data for the year 2019 can be loaded directly from the Excel file. The Country Code cannot be used, as it is not present in the COVID-19 data.

 1# Load country and data for 2019 from the excel worksheet into a dataframe
 2pop_2019_df = pd.read_excel(
 3    "/tmp/API_SP.POP.TOTL_DS2_en_excel_v2_1740169.xls",
 4    engine = "xlrd",
 5    sheet_name = 'Data',
 6    usecols = ['Country Name', '2019'],
 7    header = 3)
 8
 9pop_2019_df.shape
10"""
11(264, 2)
12"""
13
14pop_2019_df.iloc[[0,1,2,3,4,-5,-4-3,-2,-1], :]
15"""
16    Country Name          2019
170          Aruba  1.063140e+05
181    Afghanistan  3.804175e+07
192         Angola  3.182530e+07
203        Albania  2.854191e+06
214        Andorra  7.714200e+04
22259       Kosovo  1.794248e+06
23257        World  7.673534e+09
24262       Zambia  1.786103e+07
25263     Zimbabwe  1.464547e+07
26"""

Compare countries in COVID-19 datasets with population dataset

Examine the matching of country names between population data and COVID-19 data as country codes are not used. This shows that there are 25 "Country/Region" that are not present in the population data.

 1# Countries in confirmed_df dataframe that are not in pop_2019_df
 2unmatched_covid_countries = sorted(
 3    list(
 4        confirmed_df[
 5            ~confirmed_df.index
 6            .isin(list(pop_2019_df['Country Name']))
 7        ].index))
 8
 9len(unmatched_covid_countries)
10"""
1125
12"""
13
14unmatched_covid_countries
15"""
16Bahamas                               Brunei                                Burma                                 
17Congo (Brazzaville)                   Congo (Kinshasa)                      Czechia                               
18Diamond Princess                      Egypt                                 Gambia                                
19Holy See                              Iran                                  Korea, South                          
20Kyrgyzstan                            Laos                                  MS Zaandam                            
21Russia                                Saint Kitts and Nevis                 Saint Lucia                           
22Saint Vincent and the Grenadines      Slovakia                              Syria                                 
23Taiwan*                               US                                    Venezuela                             
24Yemen                                 
25"""

Four of these "Country/Region" cannot be mapped to a country in the population data. Two of these are cruise ships, Taiwan* is a disputed territory and the Holy See may be included in smal states. These are removed from the list of unmatched countries.

 1unmatched_covid_countries.remove('Diamond Princess')
 2unmatched_covid_countries.remove('MS Zaandam')
 3unmatched_covid_countries.remove('Holy See')
 4unmatched_covid_countries.remove('Taiwan*')
 5
 6len(unmatched_covid_countries)
 7"""
 821
 9"""
10
11unmatched_covid_countries
12"""
13Bahamas                               Brunei                                Burma                                 
14Congo (Brazzaville)                   Congo (Kinshasa)                      Czechia                               
15Egypt                                 Gambia                                Iran                                  
16Korea, South                          Kyrgyzstan                            Laos                                  
17Russia                                Saint Kitts and Nevis                 Saint Lucia                           
18Saint Vincent and the Grenadines      Slovakia                              Syria                                 
19US                                    Venezuela                             Yemen                                 
20"""

The remaining "Country/Region" are manually mapped to Countries in the population dataset.

COVID-19 Country Name	Population Country Name
Bahamas	Bahamas, The
Brunei	Brunei Darussalam
Burma	Myanmar
Congo (Brazzaville)	Congo, Rep.
Congo (Kinshasa)	Congo, Dem. Rep.
Czechia	Czech Republic
Egypt	Egypt, Arab Rep.
Gambia	Gambia, The
Iran	Iran, Islamic Rep.
Korea, South	Korea, Rep.
Kyrgyzstan	Kyrgyz Republic
Laos	Lao PDR
Russia	Russian Federation
Saint Kitts and Nevis	St. Kitts and Nevis
Saint Lucia	St. Lucia
Saint Vincent and the Grenadines	St. Vincent and the Grenadines
Slovakia	Slovak Republic
Syria	Syrian Arab Republic
US	United States
Venezuela	Venezuela, RB
Yemen	Yemen, Rep.

Update country names in population dataset

A new column is added to the population dataset to contain the country name from the COVID-19 dataset if it is different or the original country if there is no difference. This could update the country name in place, but it is better to have the two names for verification.

 1map_pop = ['Bahamas, The', 'Brunei Darussalam', 'Myanmar', 'Congo, Rep.', 'Congo, Dem. Rep.',
 2           'Czech Republic', 'Egypt, Arab Rep.', 'Gambia, The', 'Iran, Islamic Rep.',
 3           'Korea, Rep.', 'Kyrgyz Republic', 'Lao PDR', 'Russian Federation', 'St. Kitts and Nevis',
 4           'St. Lucia', 'St. Vincent and the Grenadines', 'Slovak Republic', 'Syrian Arab Republic',
 5           'United States', 'Venezuela, RB', 'Yemen, Rep.'
 6          ]
 7
 8country_map = {}
 9for c,p in zip(unmatched_covid_countries, map_pop):
10    country_map[p] = c
11
12pop_2019_df['Covid_Country'] = pop_2019_df.apply(
13    lambda x: country_map[x['Country Name']]
14        if x['Country Name'] in country_map
15        else x['Country Name'],
16    axis = 1)
17
18pop_2019_df.iloc[[1,2,3,21,29,249], :]
19"""
20          Country Name         2019 Covid_Country
211          Afghanistan   38041754.0   Afghanistan
222               Angola   31825295.0        Angola
233              Albania    2854191.0       Albania
2421        Bahamas, The     389482.0       Bahamas
2529   Brunei Darussalam     433285.0        Brunei
26249      United States  328239523.0            US
27"""

Confirm that there are no unexpected countries in covid data that are not in population data.

 1list(confirmed_df[
 2    ~confirmed_df.index
 3    .isin(list(pop_2019_df['Covid_Country']))
 4].index)
 5"""
 6['Diamond Princess', 'Holy See', 'MS Zaandam', 'Taiwan*']
 7"""
 8
 9list(deaths_df[
10    ~deaths_df.index
11    .isin(list(pop_2019_df['Covid_Country']))
12].index)
13"""
14['Diamond Princess', 'Holy See', 'MS Zaandam', 'Taiwan*']
15"""

Calculate cases and deaths per 100,000 of population

Merge the population numbers for 2019 into the COVID datasets. Rename the column name for 2019 to Population. This is done for the Confirmed cases and the Deaths datasets.

 1# Merge Population data
 2confirmed_pop_df = confirmed_df.merge(
 3    pop_2019_df,
 4    left_index = True,
 5    left_on = confirmed_df.index,
 6    right_on = 'Covid_Country',
 7)
 8
 9# Rename '2019' column name to 'population'
10confirmed_pop_df = confirmed_pop_df.rename(columns={'2019': 'Population'})
11
12confirmed_pop_df.shape
13"""(187, 329)
14"""
15
16confirmed_pop_df.iloc[[0,1,2,-3,-2,-1], [0,1,2,-4,-3,-2,-1]]
17"""
18(187, 329)
19     2020-01-22 00:00:00  2020-01-23 00:00:00  2020-01-24 00:00:00  2020-12-12 00:00:00 Country Name  Population Covid_Country  
201                      0                    0                    0                48229  Afghanistan  38041754.0   Afghanistan  
213                      0                    0                    0                47742      Albania   2854191.0       Albania  
2258                     0                    0                    0                91638      Algeria  43053054.0       Algeria  
23260                    0                    0                    0                 2083  Yemen, Rep.  29161922.0         Yemen  
24262                    0                    0                    0                18217       Zambia  17861030.0        Zambia  
25263                    0                    0                    0                11219     Zimbabwe  14645468.0      Zimbabwe  
26"""

 1# Merge Population data
 2deaths_pop_df = deaths_df.merge(
 3    pop_2019_df,
 4    left_index = True,
 5    left_on = deaths_df.index,
 6    right_on = 'Covid_Country',
 7)
 8
 9# Rename '2019' column name to 'population'
10deaths_pop_df = deaths_pop_df.rename(columns={'2019': 'Population'})
11
12deaths_pop_df.shape
13"""(187, 329)
14"""
15
16deaths_pop_df.iloc[[0,1,2,-3,-2,-1], [0,1,2,-4,-3,-2,-1]]
17"""
18     2020-01-22 00:00:00  2020-01-23 00:00:00  2020-01-24 00:00:00  2020-12-12 00:00:00 Country Name  Population Covid_Country  
191                      0                    0                    0                 1956  Afghanistan  38041754.0   Afghanistan  
203                      0                    0                    0                  989      Albania   2854191.0       Albania  
2158                     0                    0                    0                 2584      Algeria  43053054.0       Algeria  
22260                    0                    0                    0                  606  Yemen, Rep.  29161922.0         Yemen  
23262                    0                    0                    0                  366       Zambia  17861030.0        Zambia  
24263                    0                    0                    0                  307     Zimbabwe  14645468.0      Zimbabwe  
25"""

To adjust the numbers of confirmed cases and deaths to 100,000 of the population, divide the number by the population and multiply by 100,000.

 1# Confirmed cases per 100,000 of population
 2confirmed_per_100k_df = confirmed_pop_df.copy()
 3col_names = list(confirmed_per_100k_df.columns)
 4[(col_names.remove(x)) for x in ['Country Name', 'Population', 'Covid_Country']]
 5confirmed_per_100k_df[col_names] = (
 6    confirmed_per_100k_df[col_names]
 7    .div(confirmed_per_100k_df['Population'], axis = 0)
 8    .mul(100000, axis = 0)
 9)
10
11confirmed_per_100k_df.shape
12"""
13(187, 329)
14"""
15
16confirmed_per_100k_df.iloc[[0,1,2,-3,-2,-1], [0,50,100,150,-4,-3,-2,-1]]
17"""
18     2020-01-22 00:00:00  2020-03-12 00:00:00  2020-05-01 00:00:00  2020-06-20 00:00:00  2020-12-12 00:00:00 Country Name  Population  Covid_Country  
191                    0.0             0.031544             6.022330            74.728416           126.779117  Afghanistan  38041754.0    Afghanistan  
203                    0.0             0.805833            27.398307            66.253450          1672.698148      Albania   2854191.0        Albania  
2158                   0.0             0.055745             9.648561            27.015505           212.849012      Algeria  43053054.0        Algeria  
22260                  0.0             0.000000             0.024004             3.161657             7.142876  Yemen, Rep.  29161922.0          Yemen  
23262                  0.0             0.000000             0.610267             8.006257           101.992998       Zambia  17861030.0         Zambia  
24263                  0.0             0.000000             0.273122             3.270636            76.603902     Zimbabwe  14645468.0       Zimbabwe  
25"""

1# Deaths per 100,000 of population
2deaths_per_100k_df = deaths_pop_df.copy()
3col_names = list(deaths_per_100k_df.columns)
4[(col_names.remove(x)) for x in ['Country Name', 'Population', 'Covid_Country']]
5deaths_per_100k_df[col_names] = (
6    deaths_per_100k_df[col_names]
7    .div(deaths_per_100k_df['Population'], axis = 0)
8    .mul(100000, axis = 0)
9)

Countries with highest confirmed cases per 100,000 of population

Countries with populations of less than one million were removed from the dataset. The data is sorted by the numbers of confirmed cases per 100,000 on the latest date and the data for the top 20 countries displayed in a table and bar chart. The changes over time for the top 10 countries is displayed in a line chart.

 1confirmed_per_100k_df = confirmed_per_100k_df[confirmed_per_100k_df['Population'] > 1000000]
 2
 3top_20_confirmed_df = (confirmed_per_100k_df.sort_values(
 4    by = confirmed_per_100k_df.columns[-4], ascending = False)
 5    .head(20)[['Covid_Country',
 6               'Population',
 7               pd.to_datetime('2020-12-12 00:00:00') ]])
 8
 9# Merge Confirmed data
10top_20_confirmed_df = top_20_confirmed_df.merge(
11    confirmed_df.iloc[:, [-1]],
12    right_index = True,
13    left_on = top_20_confirmed_df['Covid_Country'],
14    right_on = confirmed_df.index,
15)
16
17# Reset the index
18top_20_confirmed_df.reset_index(drop = True, inplace = True)
19
20# Drop the merged key
21top_20_confirmed_df = top_20_confirmed_df.drop(['key_0'], axis = 1)
22
23# Rename the column names
24top_20_confirmed_df = top_20_confirmed_df.rename(
25    columns = {'Covid_Country': 'Country',
26               '2020-12-12 00:00:00_x': 'Cases per 100k',
27               '2020-12-12 00:00:00_y': 'Total cases'})

Table: The top 20 countries with highest cases of COVID-19 on November 12 2020

Rank	Country	Population	Cases per 100k	Total cases
1	Bahrain	1,641,172	5,421	88,965
2	Czechia	10,669,709	5,393	575,422
3	Belgium	11,484,055	5,252	603,159
4	Georgia	3,720,382	5,027	187,006
5	Armenia	2,957,731	4,981	147,312
6	Qatar	2,832,067	4,973	140,827
7	US	328,239,523	4,893	16,062,299
8	Moldova	2,657,637	4,731	125,723
9	Slovenia	2,087,946	4,573	95,481
10	Panama	4,246,439	4,488	190,585
11	Switzerland	8,574,832	4,360	373,831
12	Croatia	4,067,500	4,241	172,523
13	Israel	9,053,300	3,930	355,786
14	Serbia	6,944,975	3,764	261,437
15	Spain	47,076,781	3,676	1,730,575
16	Austria	8,877,067	3,603	319,822
17	France	67,059,887	3,587	2,405,255
18	Netherlands	17,332,850	3,540	613,630
19	North Macedonia	2,083,459	3,505	73,025
20	Kuwait	4,207,083	3,471	146,044

 1top_10 = top_20_confirmed_df.head(10)
 2bg_color = 'rgba(208, 225, 242, 1.0)'
 3
 4fig = go.Figure(
 5    data = [go.Bar(x = list(top_10['Cases per 100k']),
 6                   y = list(top_10['Country']),
 7                   hovertemplate = [f"""<b>{r['Country']}</b>: Confirmed Cases<br>
 8<br>Per 100k: {r['Cases per 100k']:,.0F}
 9<br>Total: {r['Total cases']:,.0f}
10<br>Population: {r['Population']:,.0f}
11<extra></extra>
12""" for i,r in top_10.iterrows()],
13                   orientation = 'h')]
14)
15
16fig.update_layout(
17    # Set default font
18        font = dict(
19        family = "Droid Sans",
20        size = 16
21    ),
22    # Set figure title
23    title = dict(
24        text = f"""<b>Countries with highest confirmed cases of COVID-19 per 100,000
25<BR>{pd.to_datetime('2020-12-12 00:00:00').strftime('%b %d %Y')}</b>""",
26        xref = 'container',
27        yref = 'container',
28        x = 0.5,
29        y = 0.91,
30        xanchor = 'center',
31        yanchor = 'middle',
32        font = dict(family = 'Droid Sans', size = 24)
33    ),
34    # set x-axis
35    xaxis = dict(
36        title = dict(
37            text = 'Number of confirmed cases per 100,000 of population',
38            font = dict(family = 'Droid Sans', size = 18)
39        ),
40        showgrid = False,
41        linecolor = bg_color,
42        linewidth = 2,
43        showticklabels = True,
44    ),
45    # set y-axis
46    yaxis = dict(
47        showgrid = False,
48        linecolor = bg_color,
49        linewidth = 4,
50    ),
51    # set the plot bacground color
52    plot_bgcolor = bg_color,
53    # set the hover background color
54    hoverlabel = dict(
55        bgcolor = 'rgba(75, 152, 201, 0.2)',
56        font_size = 16
57    ),
58    paper_bgcolor = bg_color,
59)
60
61fig.update_traces(marker_color = 'rgb(75, 152, 201)')
62fig['layout']['yaxis']['autorange'] = "reversed"
63
64fig.show()

Highest confirmed cases of COVID-19 per 100,000 of population

 1top_10_confirmed_df = (confirmed_per_100k_df.sort_values(
 2    by = confirmed_per_100k_df.columns[-4], ascending = False)
 3    .head(10))
 4
 5top_10 = (top_10_confirmed_df
 6          .drop(['Country Name', 'Population'], axis = 1)
 7          .set_index('Covid_Country')
 8          .T)
 9
10bg_color = 'rgba(208, 225, 242, 1.0)'
11line_color = 'rgba(75, 152, 201, 1.0)'
12grid_color = 'rgba(75, 152, 201, 0.3)'
13
14fig = go.Figure()
15for c in top_10.columns:
16    fig.add_trace(go.Scatter(
17        x = top_10.index,
18        y = top_10[c],
19        name = c,
20        hovertemplate = [f"""<b>{c}</b><br>
21<br>Cases per 100k:<br>{r[c]:,.0F}
22<extra></extra>
23""" for i,r in top_10.iterrows()]
24    ))
25
26fig.update_traces(
27     hoverinfo = 'text+name',
28    mode = 'lines'
29)
30
31fig.update_layout(
32    # Set figure title
33    title = dict(
34        text = f"""<b>Highest confirmed cases of COVID-19 per 100,000</b>\
35<br>{top_10.index[-1].strftime('%b %d %Y')}""",
36        xref = 'container',
37        yref = 'container',
38        x = 0.5,
39        y = 0.9,
40        xanchor = 'center',
41        yanchor = 'middle',
42        font = dict(family = 'Droid Sans', size = 28)
43    ),
44    # set legend
45    legend = dict(
46        orientation = 'v',
47        traceorder = 'normal',
48        font_size = 12,
49        x = 0.05,
50        y = 1.0,
51        xanchor = "left",
52        yanchor = "top"
53    ),
54    # set x-axis
55    xaxis = dict(
56        title = 'Date',
57        linecolor = line_color,
58        linewidth = 2,
59        gridcolor = grid_color,
60        showticklabels = True,
61        ticks = 'outside',
62    ),
63    # set y-axis
64    yaxis = dict(
65        title = 'Cases of COVID-19 per 100.000 of population',
66        rangemode = 'tozero',
67        linecolor = line_color,
68        linewidth = 2,
69        gridcolor = grid_color,
70        showticklabels = True,
71        ticks = 'outside',
72    ),
73    # set the plot background color
74    plot_bgcolor = bg_color,
75    paper_bgcolor = bg_color,
76)
77
78fig.show()

Changes in countries with highest confirmed cases of COVID-19 per 100,000 of population

Countries with highest deaths per 100,000 of population

Similar table and charts are created for the Deaths from COVID-19 data.

Table: The top 20 countries with highest deaths per 100,000 from COVID-19 on November 12 2020

Rank	Country	Population	Deaths per 100k	Total deaths
1	Belgium	11,484,055	154.9	17,792
2	Peru	32,510,453	112.4	36,544
3	Italy	60,297,396	106.2	64,036
4	Spain	47,076,781	101.2	47,624
5	North Macedonia	2,083,459	100.6	2,096
6	Bosnia and Herzegovina	3,301,000	99.9	3,298
7	Slovenia	2,087,946	97.8	2,041
8	United Kingdom	66,834,405	95.9	64,123
9	Moldova	2,657,637	95.8	2,547
10	US	328,239,523	90.7	297,818
11	Argentina	44,938,712	90.5	40,668
12	Mexico	127,575,529	89.1	113,704
13	Czechia	10,669,709	88.6	9,450
14	France	67,059,887	86.0	57,671
15	Brazil	211,049,527	85.8	181,123
16	Chile	18,952,038	83.6	15,846
17	Armenia	2,957,731	83.2	2,462
18	Bulgaria	6,975,761	80.7	5,626
19	Ecuador	17,373,662	79.9	13,874
20	Panama	4,246,439	78.4	3,331

Highest deaths from COVID-19 per 100,000 of population

Changes in countries with highest deaths from COVID-19 per 100,000 of population

Conclusion

Calculations can be applied to a whole dataset easily with Pandas, such as converting the total numbers of COVID-19 cases to numbers relative to a standard number. The challenge here was getting the population, which is available from the World Data Bank, but the names for countries are inconsistent. Once these are mapped to the countries, the population is merged in for each country. The div and mul dataframe functions are used to adjust the numbers relative to 100,000 of the population. The countries with highest relative numbers are not necessarily the same countries with the highest absolute numbers.

Plotly is an open-source graphing library for Python that produces interactive charts.