Worldwide COVID-19 Analysis

2 minute read

COVID-19 Data Analysis

Introduction

This is an overview of a brief look that I had at the COVID-19 data as a way of showcasing some techniques I had learnt using python and qlikview.

Background

Towards the end of December 2019, Chinese public health authorities reported some cases of an acute respiratory disease in the Hubei province in Wuhan. They soon identified a novel coronavirus to be the main cause of these cases. The disease is now, worldwide, referred to as Coronavirus Disease 2019 (COVID-19), and the virus has been named as a severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Which is a new strain of coronavirus not identified in humans before, making it very dangerous. Current research suggests this originated from Bats, in a food market, in Wuhan City. source: link

As of writing this (23/03/20) there was 345290 confirmed cases, 14925 deaths and 99541 recovered. source: link

Contents

The goal of this project was to visualise the data surrounding the SARS-CoV-2 virus, whilst showcasing qlikview as a visualisation platform. As github pages is a static site, I will be publishing graphs and data that will be out of date, the date of when the data ends will be made clear.

  • Data
  • Figures
  • Summary

Data

I took 3 data files (csv) from here. This enabled me to track the location of the cases (by longitude and latitude) and also the number of deaths, cases and recovered patients. In order to get this data, I used python to fetch and clean the dataset and then save locally so accessing on qlikview would be quick and easy. The python script can run each day, updating when there has been a change to the source files. The code for the data cleaning can be found on my GitHub here.

Figures

Figure 1 - Accumulative summary of cases

This figure shows, how over time, the different types of case have developed (Confirmed, Death and Recovered). linearly separable data

Figure 2 - Accumulative cases in the UK

This figure shows over time, the different cases for UK. linearly separable data

Figure 3 - Animated Bubble chart world map distribution of cases

The figure shows the spread of cumulative cases up to the date of this post.

Sorry, at the moment this link will only work on chrome, i’m looking into fixing this

Summary

Even though there is a clear uprising trend shown by the data, the questions which are important to ask are:

  1. The confirmed cases is purely based off how many people are tested and for most countries, even if people are suspected to have the virus they are told to stay home unless their symptoms are worsened. This results in the actual number of confirmed cases being much higher than what we have plotted in any of the graphs online.
  2. There is information such as age of the case individual, which would give a lot of important information to us. However, this has not yet been made publicly accessible, to my knowledge.