Exploring data with Swift TabularData and DataFrame

Exploring data with Swift TabularData and DataFrame

TabularData was introduced in iOS 15 and can be used to load data from csv or json format into a DataFrame to explore and manipulate. This is not Python Pandas, but it does provide some basic functionality for exploring data and is fast and efficient. In this article, the data on the deaths from Covid is loaded, grouped and sorted to see the countries with the highest death toll from Covid.



Load data into a Swift DataFrame from a CSV file

The csv file on the deaths from Covid-19 will be used to work with the Swift DataFrame. As shown previously in Countries with highest confirmed cases of Covid-19 with Plotly and Countries with highest deaths from covid the data on the global deaths from Covid-19 is available from Johns Hopkins University - who have made the data available on GitHub. More information about COVID-19 and the coronavirus is available from Coronavirus disease (COVID-19) advice for the public.

Swift DataFrame can load the latest data on deaths from Covid-19 directly from the csv file in github. DataFrame provides an initialisation to create the DataFame with the contents of a csv file - contentsOfCSVFile.

 1import TabularData
 2import Foundation
 3
 4let deaths_path = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv"
 5
 6let options = CSVReadingOptions(hasHeaderRow: true, delimiter: ",")
 7guard let fileUrl = URL(string: deaths_path) else {
 8    fatalError("Error creating Url")
 9}
10
11var covidDeathsDf = try! DataFrame(
12    contentsOfCSVFile: fileUrl,
13    options: options)
14
15print("\(covidDeathsDf)")

Loading data into a Swift DataFrame from a csv file



Use FormattingOptions to see more or less data

Use FormattingOptions to configure how to display the TabularData.

1// Use FormattingOptions to see more or less data
2let formatingOptions = FormattingOptions(maximumLineWidth: 150,
3                                         maximumCellWidth: 10,
4                                         maximumRowCount: 30,
5                                         includesColumnTypes: false)
6print(covidDeathsDf.description(options: formatingOptions))

Use FormattingOptions to print out more or less data from the DataFrame



List out all of the Column Names

It can be useful to list the column names from the loaded DataFrame.

  1print(covidDeathsDf.columns.map { col in col.name } )
  2
  3["Province/State", "Country/Region", "Lat", "Long", "1/22/20", "1/23/20", "1/24/20",
  4"1/25/20", "1/26/20", "1/27/20", "1/28/20", "1/29/20", "1/30/20", "1/31/20",
  5"2/1/20", "2/2/20", "2/3/20", "2/4/20", "2/5/20", "2/6/20", "2/7/20", "2/8/20",
  6"2/9/20", "2/10/20", "2/11/20", "2/12/20", "2/13/20", "2/14/20", "2/15/20",
  7"2/16/20", "2/17/20", "2/18/20", "2/19/20", "2/20/20", "2/21/20", "2/22/20",
  8"2/23/20", "2/24/20", "2/25/20", "2/26/20", "2/27/20", "2/28/20", "2/29/20",
  9"3/1/20", "3/2/20", "3/3/20", "3/4/20", "3/5/20", "3/6/20", "3/7/20", "3/8/20",
 10"3/9/20", "3/10/20", "3/11/20", "3/12/20", "3/13/20", "3/14/20", "3/15/20",
 11"3/16/20", "3/17/20", "3/18/20", "3/19/20", "3/20/20", "3/21/20", "3/22/20",
 12"3/23/20", "3/24/20", "3/25/20", "3/26/20", "3/27/20", "3/28/20", "3/29/20",
 13"3/30/20", "3/31/20", "4/1/20", "4/2/20", "4/3/20", "4/4/20", "4/5/20", "4/6/20",
 14"4/7/20", "4/8/20", "4/9/20", "4/10/20", "4/11/20", "4/12/20", "4/13/20", "4/14/20",
 15"4/15/20", "4/16/20", "4/17/20", "4/18/20", "4/19/20", "4/20/20", "4/21/20",
 16"4/22/20", "4/23/20", "4/24/20", "4/25/20", "4/26/20", "4/27/20", "4/28/20",
 17"4/29/20", "4/30/20", "5/1/20", "5/2/20", "5/3/20", "5/4/20", "5/5/20", "5/6/20",
 18"5/7/20", "5/8/20", "5/9/20", "5/10/20", "5/11/20", "5/12/20", "5/13/20", "5/14/20",
 19"5/15/20", "5/16/20", "5/17/20", "5/18/20", "5/19/20", "5/20/20", "5/21/20",
 20"5/22/20", "5/23/20", "5/24/20", "5/25/20", "5/26/20", "5/27/20", "5/28/20",
 21"5/29/20", "5/30/20", "5/31/20", "6/1/20", "6/2/20", "6/3/20", "6/4/20", "6/5/20",
 22"6/6/20", "6/7/20", "6/8/20", "6/9/20", "6/10/20", "6/11/20", "6/12/20", "6/13/20",
 23"6/14/20", "6/15/20", "6/16/20", "6/17/20", "6/18/20", "6/19/20", "6/20/20",
 24"6/21/20", "6/22/20", "6/23/20", "6/24/20", "6/25/20", "6/26/20", "6/27/20",
 25"6/28/20", "6/29/20", "6/30/20", "7/1/20", "7/2/20", "7/3/20", "7/4/20", "7/5/20",
 26"7/6/20", "7/7/20", "7/8/20", "7/9/20", "7/10/20", "7/11/20", "7/12/20", "7/13/20",
 27"7/14/20", "7/15/20", "7/16/20", "7/17/20", "7/18/20", "7/19/20", "7/20/20",
 28"7/21/20", "7/22/20", "7/23/20", "7/24/20", "7/25/20", "7/26/20", "7/27/20",
 29"7/28/20", "7/29/20", "7/30/20", "7/31/20", "8/1/20", "8/2/20", "8/3/20", "8/4/20",
 30"8/5/20", "8/6/20", "8/7/20", "8/8/20", "8/9/20", "8/10/20", "8/11/20", "8/12/20",
 31"8/13/20", "8/14/20", "8/15/20", "8/16/20", "8/17/20", "8/18/20", "8/19/20",
 32"8/20/20", "8/21/20", "8/22/20", "8/23/20", "8/24/20", "8/25/20", "8/26/20",
 33"8/27/20", "8/28/20", "8/29/20", "8/30/20", "8/31/20", "9/1/20", "9/2/20", "9/3/20",
 34"9/4/20", "9/5/20", "9/6/20", "9/7/20", "9/8/20", "9/9/20", "9/10/20", "9/11/20",
 35"9/12/20", "9/13/20", "9/14/20", "9/15/20", "9/16/20", "9/17/20", "9/18/20",
 36"9/19/20", "9/20/20", "9/21/20", "9/22/20", "9/23/20", "9/24/20", "9/25/20",
 37"9/26/20", "9/27/20", "9/28/20", "9/29/20", "9/30/20", "10/1/20", "10/2/20",
 38"10/3/20", "10/4/20", "10/5/20", "10/6/20", "10/7/20", "10/8/20", "10/9/20",
 39"10/10/20", "10/11/20", "10/12/20", "10/13/20", "10/14/20", "10/15/20", "10/16/20",
 40"10/17/20", "10/18/20", "10/19/20", "10/20/20", "10/21/20", "10/22/20", "10/23/20",
 41"10/24/20", "10/25/20", "10/26/20", "10/27/20", "10/28/20", "10/29/20", "10/30/20",
 42"10/31/20", "11/1/20", "11/2/20", "11/3/20", "11/4/20", "11/5/20", "11/6/20",
 43"11/7/20", "11/8/20", "11/9/20", "11/10/20", "11/11/20", "11/12/20", "11/13/20",
 44"11/14/20", "11/15/20", "11/16/20", "11/17/20", "11/18/20", "11/19/20", "11/20/20",
 45"11/21/20", "11/22/20", "11/23/20", "11/24/20", "11/25/20", "11/26/20", "11/27/20",
 46"11/28/20", "11/29/20", "11/30/20", "12/1/20", "12/2/20", "12/3/20", "12/4/20",
 47"12/5/20", "12/6/20", "12/7/20", "12/8/20", "12/9/20", "12/10/20", "12/11/20",
 48"12/12/20", "12/13/20", "12/14/20", "12/15/20", "12/16/20", "12/17/20", "12/18/20",
 49"12/19/20", "12/20/20", "12/21/20", "12/22/20", "12/23/20", "12/24/20", "12/25/20",
 50"12/26/20", "12/27/20", "12/28/20", "12/29/20", "12/30/20", "12/31/20", "1/1/21",
 51"1/2/21", "1/3/21", "1/4/21", "1/5/21", "1/6/21", "1/7/21", "1/8/21", "1/9/21",
 52"1/10/21", "1/11/21", "1/12/21", "1/13/21", "1/14/21", "1/15/21", "1/16/21",
 53"1/17/21", "1/18/21", "1/19/21", "1/20/21", "1/21/21", "1/22/21", "1/23/21",
 54"1/24/21", "1/25/21", "1/26/21", "1/27/21", "1/28/21", "1/29/21", "1/30/21",
 55"1/31/21", "2/1/21", "2/2/21", "2/3/21", "2/4/21", "2/5/21", "2/6/21", "2/7/21",
 56"2/8/21", "2/9/21", "2/10/21", "2/11/21", "2/12/21", "2/13/21", "2/14/21", "2/15/21",
 57"2/16/21", "2/17/21", "2/18/21", "2/19/21", "2/20/21", "2/21/21", "2/22/21",
 58"2/23/21", "2/24/21", "2/25/21", "2/26/21", "2/27/21", "2/28/21", "3/1/21", "3/2/21",
 59"3/3/21", "3/4/21", "3/5/21", "3/6/21", "3/7/21", "3/8/21", "3/9/21", "3/10/21",
 60"3/11/21", "3/12/21", "3/13/21", "3/14/21", "3/15/21", "3/16/21", "3/17/21",
 61"3/18/21", "3/19/21", "3/20/21", "3/21/21", "3/22/21", "3/23/21", "3/24/21",
 62"3/25/21", "3/26/21", "3/27/21", "3/28/21", "3/29/21", "3/30/21", "3/31/21",
 63"4/1/21", "4/2/21", "4/3/21", "4/4/21", "4/5/21", "4/6/21", "4/7/21", "4/8/21",
 64"4/9/21", "4/10/21", "4/11/21", "4/12/21", "4/13/21", "4/14/21", "4/15/21",
 65"4/16/21", "4/17/21", "4/18/21", "4/19/21", "4/20/21", "4/21/21", "4/22/21",
 66"4/23/21", "4/24/21", "4/25/21", "4/26/21", "4/27/21", "4/28/21", "4/29/21",
 67"4/30/21", "5/1/21", "5/2/21", "5/3/21", "5/4/21", "5/5/21", "5/6/21", "5/7/21",
 68"5/8/21", "5/9/21", "5/10/21", "5/11/21", "5/12/21", "5/13/21", "5/14/21", "5/15/21",
 69"5/16/21", "5/17/21", "5/18/21", "5/19/21", "5/20/21", "5/21/21", "5/22/21",
 70"5/23/21", "5/24/21", "5/25/21", "5/26/21", "5/27/21", "5/28/21", "5/29/21",
 71"5/30/21", "5/31/21", "6/1/21", "6/2/21", "6/3/21", "6/4/21", "6/5/21", "6/6/21",
 72"6/7/21", "6/8/21", "6/9/21", "6/10/21", "6/11/21", "6/12/21", "6/13/21", "6/14/21",
 73"6/15/21", "6/16/21", "6/17/21", "6/18/21", "6/19/21", "6/20/21", "6/21/21",
 74"6/22/21", "6/23/21", "6/24/21", "6/25/21", "6/26/21", "6/27/21", "6/28/21",
 75"6/29/21", "6/30/21", "7/1/21", "7/2/21", "7/3/21", "7/4/21", "7/5/21", "7/6/21",
 76"7/7/21", "7/8/21", "7/9/21", "7/10/21", "7/11/21", "7/12/21", "7/13/21", "7/14/21",
 77"7/15/21", "7/16/21", "7/17/21", "7/18/21", "7/19/21", "7/20/21", "7/21/21",
 78"7/22/21", "7/23/21", "7/24/21", "7/25/21", "7/26/21", "7/27/21", "7/28/21",
 79"7/29/21", "7/30/21", "7/31/21", "8/1/21", "8/2/21", "8/3/21", "8/4/21", "8/5/21",
 80"8/6/21", "8/7/21", "8/8/21", "8/9/21", "8/10/21", "8/11/21", "8/12/21", "8/13/21",
 81"8/14/21", "8/15/21", "8/16/21", "8/17/21", "8/18/21", "8/19/21", "8/20/21",
 82"8/21/21", "8/22/21", "8/23/21", "8/24/21", "8/25/21", "8/26/21", "8/27/21",
 83"8/28/21", "8/29/21", "8/30/21", "8/31/21", "9/1/21", "9/2/21", "9/3/21", "9/4/21",
 84"9/5/21", "9/6/21", "9/7/21", "9/8/21", "9/9/21", "9/10/21", "9/11/21", "9/12/21",
 85"9/13/21", "9/14/21", "9/15/21", "9/16/21", "9/17/21", "9/18/21", "9/19/21",
 86"9/20/21", "9/21/21", "9/22/21", "9/23/21", "9/24/21", "9/25/21", "9/26/21",
 87"9/27/21", "9/28/21", "9/29/21", "9/30/21", "10/1/21", "10/2/21", "10/3/21",
 88"10/4/21", "10/5/21", "10/6/21", "10/7/21", "10/8/21", "10/9/21", "10/10/21",
 89"10/11/21", "10/12/21", "10/13/21", "10/14/21", "10/15/21", "10/16/21", "10/17/21",
 90"10/18/21", "10/19/21", "10/20/21", "10/21/21", "10/22/21", "10/23/21", "10/24/21",
 91"10/25/21", "10/26/21", "10/27/21", "10/28/21", "10/29/21", "10/30/21", "10/31/21",
 92"11/1/21", "11/2/21", "11/3/21", "11/4/21", "11/5/21", "11/6/21", "11/7/21",
 93"11/8/21", "11/9/21", "11/10/21", "11/11/21", "11/12/21", "11/13/21", "11/14/21",
 94"11/15/21", "11/16/21", "11/17/21", "11/18/21", "11/19/21", "11/20/21", "11/21/21",
 95"11/22/21", "11/23/21", "11/24/21", "11/25/21", "11/26/21", "11/27/21", "11/28/21",
 96"11/29/21", "11/30/21", "12/1/21", "12/2/21", "12/3/21", "12/4/21", "12/5/21",
 97"12/6/21", "12/7/21", "12/8/21", "12/9/21", "12/10/21", "12/11/21", "12/12/21",
 98"12/13/21", "12/14/21", "12/15/21", "12/16/21", "12/17/21", "12/18/21", "12/19/21",
 99"12/20/21", "12/21/21", "12/22/21", "12/23/21", "12/24/21", "12/25/21", "12/26/21",
100"12/27/21", "12/28/21", "12/29/21", "12/30/21", "12/31/21", "1/1/22", "1/2/22",
101"1/3/22", "1/4/22", "1/5/22", "1/6/22", "1/7/22", "1/8/22", "1/9/22", "1/10/22",
102"1/11/22", "1/12/22", "1/13/22", "1/14/22", "1/15/22", "1/16/22", "1/17/22",
103"1/18/22", "1/19/22", "1/20/22", "1/21/22", "1/22/22", "1/23/22", "1/24/22",
104"1/25/22", "1/26/22", "1/27/22", "1/28/22", "1/29/22", "1/30/22", "1/31/22",
105"2/1/22", "2/2/22", "2/3/22", "2/4/22", "2/5/22", "2/6/22", "2/7/22", "2/8/22",
106"2/9/22", "2/10/22", "2/11/22", "2/12/22", "2/13/22", "2/14/22", "2/15/22",
107"2/16/22", "2/17/22", "2/18/22", "2/19/22", "2/20/22", "2/21/22", "2/22/22",
108"2/23/22", "2/24/22", "2/25/22", "2/26/22", "2/27/22", "2/28/22", "3/1/22", "3/2/22",
109"3/3/22", "3/4/22", "3/5/22", "3/6/22", "3/7/22", "3/8/22", "3/9/22", "3/10/22",
110"3/11/22", "3/12/22", "3/13/22", "3/14/22", "3/15/22", "3/16/22", "3/17/22",
111"3/18/22", "3/19/22", "3/20/22", "3/21/22", "3/22/22", "3/23/22", "3/24/22",
112"3/25/22", "3/26/22", "3/27/22", "3/28/22", "3/29/22", "3/30/22", "3/31/22",
113"4/1/22", "4/2/22", "4/3/22", "4/4/22", "4/5/22", "4/6/22", "4/7/22", "4/8/22",
114"4/9/22", "4/10/22", "4/11/22", "4/12/22", "4/13/22", "4/14/22", "4/15/22",
115"4/16/22", "4/17/22", "4/18/22", "4/19/22", "4/20/22", "4/21/22", "4/22/22",
116"4/23/22", "4/24/22", "4/25/22", "4/26/22", "4/27/22", "4/28/22", "4/29/22",
117"4/30/22"]
118


Load selected columns

Not all of the data in the csv file needs to be loaded. This is very useful when the data can contain hundreds of thousands of rows and columns, which can take some time to load. A lot of exploration and experimentation can be done quickly on a subset of the data. The column names can be specified using an array of strings and the rows can be be specified by row index. This code specifies the most recent columns and just loads the first 15 rows of data.

 1import TabularData
 2import Foundation
 3
 4let deaths_path = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv"
 5
 6guard let fileUrl = URL(string: deaths_path) else {
 7    fatalError("Error creating Url")
 8}
 9
10let columnNames = ["Province/State",
11                   "Country/Region",
12                   "Lat",
13                   "Long",
14                   "4/25/22",
15                   "4/26/22",
16                   "4/27/22",
17                   "4/28/22",
18                   "4/29/22",
19                   "4/30/22"]
20
21let options = CSVReadingOptions(hasHeaderRow: true, delimiter: ",")
22let covidDeathsDf = try! DataFrame(
23    contentsOfCSVFile: fileUrl,
24    columns: columnNames,
25    rows: 0..<20,
26    options: options)
27
28print("\(covidDeathsDf)")

The data type can also be specified for each column when loading data from a csv file. This has the advantage of throwing an error if the data does not match the expected type.

 1let columnNames = ["Province/State",
 2                   "Country/Region",
 3                   "Lat",
 4                   "Long",
 5                   "4/25/22",
 6                   "4/26/22",
 7                   "4/27/22",
 8                   "4/28/22",
 9                   "4/29/22",
10                   "4/30/22"]
11
12// Specify types for the data in the columns
13let columnTypes: [String : CSVType] = ["Province/State": .string,
14                                       "Country/Region": .string,
15                                       "Lat": .double,
16                                       "Long": .double,
17                                       "4/25/22": .integer,
18                                       "4/26/22": .integer,
19                                       "4/27/22": .integer,
20                                       "4/28/22": .integer,
21                                       "4/29/22": .integer,
22                                       "4/30/22": .integer]
23
24let options = CSVReadingOptions(hasHeaderRow: true, delimiter: ",")
25let covidDeathsDf =  try! DataFrame(
26    contentsOfCSVFile: fileUrl,
27    columns: columnNames,
28    rows: 0..<20,
29    types: columnTypes,
30    options: options)
31
32print("\(covidDeathsDf)")
33
34
35"""
36┏━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
37┃    ┃ Province/State               ┃ Country/Region      ┃ Lat       ┃ Long       ┃ 4/25/22 ┃ 4/26/22 ┃ 4/27/22 ┃ 4/28/22 ┃ 4/29/22 ┃ 4/30/22 ┃
38┃    ┃ <String>                     ┃ <String>            ┃ <Double>  ┃ <Double>   ┃ <Int>   ┃ <Int>   ┃ <Int>   ┃ <Int>   ┃ <Int>   ┃ <Int>   ┃
39┡━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩
40│ 0  │ nil                          │ Afghanistan         │  33.93911 │  67.709953 │   7,683 │   7,683 │   7,683 │   7,683 │   7,683 │   7,683 │
41│ 1  │ nil                          │ Albania             │   41.1533 │    20.1683 │   3,496 │   3,496 │   3,496 │   3,496 │   3,496 │   3,496 │
42│ 2  │ nil                          │ Algeria             │   28.0339 │     1.6596 │   6,875 │   6,875 │   6,875 │   6,875 │   6,875 │   6,875 │
43│ 3  │ nil                          │ Andorra             │   42.5063 │     1.5218 │     153 │     153 │     153 │     153 │     153 │     153 │
44│ 4  │ nil                          │ Angola              │  -11.2027 │    17.8739 │   1,900 │   1,900 │   1,900 │   1,900 │   1,900 │   1,900 │
45│ 5  │ nil                          │ Antarctica          │  -71.9499 │     23.347 │       0 │       0 │       0 │       0 │       0 │       0 │
46│ 6  │ nil                          │ Antigua and Barbuda │   17.0608 │   -61.7964 │     135 │     135 │     135 │     136 │     136 │     136 │
47│ 7  │ nil                          │ Argentina           │  -38.4161 │   -63.6167 │ 128,542 │ 128,542 │ 128,542 │ 128,542 │ 128,542 │ 128,542 │
48│ 8  │ nil                          │ Armenia             │   40.0691 │    45.0382 │   8,622 │   8,622 │   8,622 │   8,622 │   8,622 │   8,622 │
49│ 9  │ Australian Capital Territory │ Australia           │  -35.4735 │   149.0124 │      50 │      50 │      50 │      50 │      50 │      51 │
50│ 10 │ New South Wales              │ Australia           │  -33.8688 │   151.2093 │   2,714 │   2,724 │   2,743 │   2,749 │   2,769 │   2,774 │
51│ 11 │ Northern Territory           │ Australia           │  -12.4634 │   130.8456 │      40 │      40 │      42 │      43 │      43 │      43 │
52│ 12 │ Queensland                   │ Australia           │  -27.4698 │   153.0251 │     841 │     850 │     857 │     863 │     863 │     868 │
53│ 13 │ South Australia              │ Australia           │  -34.9285 │   138.6007 │     332 │     332 │     339 │     342 │     346 │     348 │
54│ 14 │ Tasmania                     │ Australia           │  -42.8821 │   147.3272 │      48 │      49 │      49 │      50 │      50 │      51 │
55│ 15 │ Victoria                     │ Australia           │  -37.8136 │   144.9631 │   2,902 │   2,932 │   2,942 │   2,948 │   2,967 │   2,967 │
56│ 16 │ Western Australia            │ Australia           │  -31.9505 │   115.8605 │     125 │     135 │     142 │     142 │     148 │     148 │
57│ 17 │ nil                          │ Austria             │   47.5162 │    14.5501 │  18,054 │  18,076 │  18,100 │  18,117 │  18,138 │  18,156 │
58│ 18 │ nil                          │ Azerbaijan          │   40.1431 │    47.5769 │   9,707 │   9,707 │   9,707 │   9,707 │   9,709 │   9,709 │
59│ 19 │ nil                          │ Bahamas             │ 25.025885 │ -78.035889 │     789 │     789 │     789 │     789 │     789 │     798 │
60└────┴──────────────────────────────┴─────────────────────┴───────────┴────────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘
6120 rows, 10 columns
62"""

Use array of strings to specify the columns to load and array of numbers to limit to first 20 rows



Group data by a row

The data for Covid-19 contains fields for Country and State and it can be seen that there are many entries for some countries such as Australia. It is possible to combine all the data by a certain column using the grouped(by:) method. The following code groups the data by country and sums up the values for the latest column in the dataframe.

 1import TabularData
 2import Foundation
 3
 4let deaths_path = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv"
 5
 6guard let fileUrl = URL(string: deaths_path) else {
 7    fatalError("Error creating Url")
 8}
 9
10let columnNames = ["Country/Region",
11                   "4/30/22"]
12
13let columnTypes: [String : CSVType] = ["Country/Region": .string,
14                                       "4/30/22": .integer]
15
16let options = CSVReadingOptions(hasHeaderRow: true, delimiter: ",")
17let covidDeathsDf =  try! DataFrame(
18    contentsOfCSVFile: fileUrl,
19    columns: columnNames,
20    types: columnTypes,
21    options: options)
22
23print("\(covidDeathsDf)")
24
25let countryTotals = covidDeathsDf
26    .grouped(by: "Country/Region")
27    .sums("4/30/22", Int.self, order: .descending)
28
29print("\(countryTotals)")
30
31"""
32┏━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
33┃    ┃ Country/Region ┃ sum(4/30/22) ┃
34┃    ┃ <String>       ┃ <Int>        ┃
35┡━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
36│ 0  │ US             │      993,712 │
37│ 1  │ Brazil         │      663,736 │
38│ 2  │ India          │      523,843 │
39│ 3  │ Russia         │      368,319 │
40│ 4  │ Mexico         │      324,334 │
41│ 5  │ Peru           │      212,810 │
42│ 6  │ United Kingdom │      175,552 │
43│ 7  │ Italy          │      163,507 │
44│ 8  │ Indonesia      │      156,257 │
45│ 9  │ France         │      146,967 │
46│ 10 │ Iran           │      141,083 │
47│ 11 │ Colombia       │      139,797 │
48│ 12 │ Germany        │      135,461 │
49│ 13 │ Argentina      │      128,542 │
50│ 14 │ Poland         │      116,059 │
51│ 15 │ Ukraine        │      112,459 │
52│ 16 │ Spain          │      104,456 │
53│ 17 │ South Africa   │      100,363 │
54│ 18 │ Turkey         │       98,771 │
55│ 19 │ Romania        │       65,486 │
56┢╍╍╍╍┷╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍┷╍╍╍╍╍╍╍╍╍╍╍╍╍╍┪
57┇ ...                                ┇
58┗╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍┛
59198 rows, 2 columns
60"""

Grouping data by country and getting the total for each country



Rename columns in the Dataframe

Finally, it is possible to rename the column names in the dataframe.

 1import TabularData
 2import Foundation
 3
 4let deaths_path = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv"
 5
 6guard let fileUrl = URL(string: deaths_path) else {
 7    fatalError("Error creating Url")
 8}
 9
10let columnNames = ["Country/Region",
11                   "4/30/22"]
12
13let columnTypes: [String : CSVType] = ["Country/Region": .string,
14                                       "4/30/22": .integer]
15
16let options = CSVReadingOptions(hasHeaderRow: true, delimiter: ",")
17let covidDeathsDf =  try! DataFrame(
18    contentsOfCSVFile: fileUrl,
19    columns: columnNames,
20    types: columnTypes,
21    options: options)
22
23var countryTotals = covidDeathsDf
24    .grouped(by: "Country/Region")
25    .sums("4/30/22", Int.self, order: .descending)
26
27countryTotals.renameColumn("Country/Region", to: "Country")
28countryTotals.renameColumn("sum(4/30/22)", to: "Total Deaths")
29
30let formatingOptions = FormattingOptions(maximumLineWidth: 100, includesColumnTypes: false)
31print("\(countryTotals.prefix(20).description(options: formatingOptions))")

Grouping data by country and displaying countries with the highest deaths from Covid-19


Countries with the highest number of deaths from Covid-19 as of April 30 2022

Country Total Deaths
0 US 993,712
1 Brazil 663,736
2 India 523,843
3 Russia 368,319
4 Mexico 324,334
5 Peru 212,810
6 United Kingdom 175,552
7 Italy 163,507
8 Indonesia 156,257
9 France 146,967
10 Iran 141,083
11 Colombia 139,797
12 Germany 135,461
13 Argentina 128,542
14 Poland 116,059
15 Ukraine 112,459
16 Spain 104,456
17 South Africa 100,363
18 Turkey 98,771
19 Romania 65,486



Conclusion

The TabularData framework provides functionality to easily load data into table format and manipulate the data to provide useful information from the raw data. This article loaded data from a csv file containing daily global deaths from Covid-19 for each country. The DataFrame is great for exploring data and it is easy to group, filter and sort data. The documentation on TabularData is rather sparse with function definitions and it would benefit from more (or any) sample code and a more detailed explanation of the available functionality.