
Fate Of Covid-19 In Pakistan (The Truth vs Media Reports)

[ This Article is written on 1st May 2020. Updates will be made as data becomes available ]

Eugene Wigner, a Nobel Laureate in Physics and one of the first-rate mathematicians of the last century, once said, “The optimist regards the future as uncertain.” A pessimist, on the contrary, thinks of the future as certain. And as in the adage we have all heard about the glass being half empty or half full, a realist is concerned with drinking the water rather than being romantic about it.

The majority of reports and predictions on Covid-19 in Pakistan are either pure guesswork, copy-paste projections from other countries, or a mixture of both. Even some of the mathematical models being used are inappropriate for a community like ours.

After a week spent sifting through papers on predicting epidemic trajectories, I have come across a class of models called ‘phenomenological models’, which have been exceptionally successful at predicting the course of outbreaks. I have fitted such a model to Pakistan’s data, introduced statistical uncertainty into it, and run 1000 simulations from it. The following predictions are the result of these simulations.

Pakistan is not a country with abundant economic resources. Our health care system is far from the top, and we have very little margin for error. To overcome this crisis, we need to tread carefully between certainty and uncertainty and ask direct questions:

  1. Is it growing? Or Declining?

  2. How fast is it growing? Or Declining?

  3. How many cases can we hope to see in future?

  4. When will/should the lock-down end?

  5. When will this epidemic end?

  6. How bad is it going to be?

These questions may seem simple, but they are in no way trivial. They are what concerns the common man, and they are what is required to make national-level policy. There is no easy way around answering them, either. There is one tool we have that has proved “unreasonably effective” (a phrase also coined by Wigner) in understanding the world around us, and that is mathematics. Over more than 4000 years of recorded history we have learned that the world behaves according to the rules of mathematics. That is the good part. The bad part is that we often don’t know which mathematical model fits which situation best. And because mathematics is abstract, the underlying assumptions of a model often drift away from the real world, and often differ from one instance to another. Hence, this requires careful analysis. Let’s try to answer the questions in light of the model:

Is it growing? Or Declining?

[Figure: distribution of the estimated growth rate ‘r’]

A 95% confidence interval means that if Pakistan were infected with Covid-19 100 times over, then 95 of those times the estimated ‘r’ would fall in the red region.


Simple Answer: It is growing.

Explanation: How do we know this? We know it because our equation has a parameter ‘r’ which tells us so. If r > 0, the infection is growing; if r < 0, it is declining. Our estimate of ‘r’ falls roughly between 0.52 and 0.67. This also matches reports from other countries. For example, a recent research study on Covid-19 in South Korea found an ‘r’ roughly between 0.6 and 0.7 (Shim et al. 2020).

How fast is it growing? Or Declining?

Simple Answer: It is growing fast enough to be scary.

[Figure: distribution of the estimated deceleration parameter ‘p’]

Remember, 1 is a Ferrari and 0 is walking. The 95% confidence interval means the same as for ‘r’ above.


Explanation: This answer needs some unpacking, because we interpret speed in context. For example, if you ask me how fast I can reach my office from my home by bicycle, I can give you a number, say 20 km/hr. But that number isn’t useful until you compare it with something. 20 km/hr is fast for me if I usually walk to my office; it would be painfully slow if I went in a Ferrari.

Just like that speed, we have a number for the growth. This number is ‘p’ in our equations, and our ‘p’ is roughly 0.7. To put it in context: the Ferrari in the world of ‘p’ drives at a speed of 1, which is ‘exponential’ speed, and a ‘p’ of 0 is just walking. So we are driving considerably fast. To add even more context, if we were driving at the speed of our exponential Ferrari, i.e. if our p were equal to 1, the number of infections would double every log 2 ÷ r units of time, which for our estimated range of ‘r’ works out to roughly 1.0 to 1.3 days. In other words, if we had 20,000 total confirmed cases today, we would have 20,000 × 2 = 40,000 in little more than a day. But we are far slower than this, and because of the non-linearity we cannot calculate a doubling time this way for our actual speed.
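As a quick check on that arithmetic, here is a minimal sketch that evaluates log 2 ÷ r at both ends of our estimated range for ‘r’ (0.52 and 0.67, as quoted above):

import numpy as np

# Doubling time under pure exponential growth (p = 1): t_double = ln(2) / r.
# 0.52 and 0.67 are the rough interval limits for 'r' quoted above.
for r in (0.52, 0.67):
    print(f"r = {r:.2f}  ->  doubling time = {np.log(2) / r:.2f} days")
# prints roughly 1.33 and 1.03 days

This brings us to the next question.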

How many cases can we hope to see in future?

Simple Answer: This needs little explanation. Projecting two weeks ahead, we calculate that we will have roughly 35,000 cumulative cases within the next two weeks.

[Figure: model calibration]
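For readers who want to reproduce this kind of projection, here is a minimal, self-contained sketch. It simply evaluates the closed-form GGM solution (derived in the code section further down) over a horizon extended by 14 days; the values r = 0.6, p = 0.7 and the 60-day observation window are illustrative stand-ins, not the exact fitted values, so the printed number will differ from the 35,000 quoted above.

import numpy as np

def ggm(t, r, p, c0=2.0):
    '''Closed-form solution of the GGM dC/dt = r * C**p for p < 1.'''
    v = 1.0 - p
    return (c0**v + r * v * t) ** (1.0 / v)

days_observed = 60                                      # illustrative length of the observed series
t = np.arange(1, days_observed + 14 + 1, dtype=float)   # extend 14 days past the data
projection = ggm(t, r=0.6, p=0.7)                       # illustrative parameter values
print(f"Projected cumulative cases 14 days ahead: {projection[-1]:.0f}")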

When will/should the lock-down end?

Simple Answer: It should either continue, or a series of staged ban-lift strategies should be adopted.

Explanation: To answer this we will use a number called ‘Rt’. Rt, the reproductive number at a given time, refers to the average number of people to whom an infected person passes on the virus in a population. It is the virus’s actual transmission rate at a given moment, and it depends on the measures taken to control the epidemic: quarantine and isolation, lockdown, social distancing, and so on.

An Rt = 1 means that the epidemic is holding steady: For every person who is infected, another one becomes infected, and as the first one either recovers or dies, the second one replaces it; the size of the total pool of infected people remains the same. At a rate below 1, the epidemic will fade out. At Rt > 1, it will grow.

An Rt of 1 might also not be acceptable in some situations; it depends on the context.

[Figure: estimated Rt over time]

Unfortunately, our Rt values have risen to date despite the lockdown efforts.


Our country should determine the real-time effective reproductive number it can accept given its own circumstances. This will go hand in hand with economic modelling and with figuring out the number of new daily infections that our hospitals can handle. For starters, that capacity is remarkably low. Projecting our R(t) values, we need a strict lockdown to curb this number, either continuously or through a series of ban-lift strategies.

When will this epidemic end?

Simple Answer: When 50% to 70% of people have been infected by the virus.

Explanation: There are only 3 ways for such an epidemic to end.

  1. We find a vaccine.

  2. Social distancing makes the growth rate drop, and the infection is contained and dies out.

  3. 50% to 70% of people get infected by the virus; those who recover gain immunity, and this provides herd immunity to the rest.

We know that a vaccine is out of sight for many months to come (at least).

This brings us down to the third option. Unfortunately, as divorced from optimism as it sounds, this seems to be the case for us (for now): 50% to 70% of our population will be affected. The important question is how fast we get there. This matters because if we get there fast, more people will need specialized hospital care at the same time, which may overwhelm current capacity and make the outcome worse than it needs to be.

As an example, this is how we can make policies: if 1,297 cases are reported in a day, an Rt of 3 means each of them can potentially infect 3 people. That makes 3,891 new cases; if 20% of these need an I.C.U., that makes about 778 cases. If we have only 300 available I.C.U. beds, what will happen to the rest?
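Written out in code, with every assumption visible (all the numbers below are the illustrative figures from the paragraph above), the same back-of-envelope check looks like this:

reported_today = 1297            # cases reported on a given day
Rt = 3                           # assumed effective reproductive number
icu_fraction = 0.20              # assumed fraction of new cases needing an I.C.U.
icu_beds = 300                   # assumed available I.C.U. beds

new_cases = reported_today * Rt              # 3891 potential new infections
icu_cases = new_cases * icu_fraction         # about 778 cases needing an I.C.U.
shortfall = max(icu_cases - icu_beds, 0)
print(f"New cases: {new_cases}, needing I.C.U.: {icu_cases:.0f}, beds short: {shortfall:.0f}")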

We cannot predict the end time with this model yet. For that we will need more data in future.

Computational Model & Testing

Following are the details of the model and my code. As with all models, parameters change and even new candidate models become promising as new data appears; I will be updating this model as the days progress.

To answer the key questions, I have employed a simple model with only 2 parameters: the Generalized Growth Model (GGM), a differential equation that has been used to study growth and epidemics and has shown great success, especially at the start of an epidemic.


Denoting C(t) as the cumulative number of cases at time t, r as the growth rate and p as the deceleration factor, the model is

$$\frac{dC}{dt} = r\,C(t)^{p},$$

where p = 1 gives exponential growth, p = 0 gives constant growth, and 0 < p < 1 gives sub-exponential growth. For the initial condition $C(0) = C_0$ and $p < 1$ this has the solution

$$C(t) = \left(C_0^{\,1-p} + r\,(1-p)\,t\right)^{\frac{1}{1-p}},$$

and for p = 1 the solution is

$$C(t) = C_0\, e^{r t}.$$

In our case $C_0 = 2$, as the first report was of 2 cases, both with positive travel history.


import numpy as np
from scipy.optimize import least_squares
import scipy.optimize as opt
import pandas as pd
import matplotlib.pyplot as plt
from lmfit import Parameters, fit_report, minimize
#####################################################################
'''READING DATA'''
#####################################################################
COVID_dataset = pd.read_excel('C:/......Code/data-sir/COVID_Pakistan_25-April.xlsx')
#COVID_dataset.columns.values returns an array containing all column names
total_active = np.array(COVID_dataset["Total Active Cases"])
total_confirmed = np.array(COVID_dataset["Total Confirmed Cases"]) #has some missing vals
daily_cases = np.array(COVID_dataset["Daily New Cases"])
#
time = []
for i in range(len(daily_cases) + 1):
    time.append(i)
time.remove(0)
time = np.array(time, dtype=np.float64) #placeholder for time (day 1, 2, ...)

Writing the growth function and the objective function for the optimization.


#############################################################################################
'''GROWTH AND OBJECTIVE FUNCTIONS'''
#############################################################################################
def g(t, r, v, i_0 = 2):
    '''
    Growth Function
    ----------------------------------
    t = input data in the form of days
    r = growth rate
    v = 1 - p
    0 <= p <= 1 is a constant which denotes exponential growth when p = 1,
    constant growth when p = 0, sub-exponential growth otherwise.
    i_0 is our initial condition, which is not of use to us for now but was written
    for when city-wise analysis will be done (in future) and the initial cases
    for each city will be different.
    '''
    j = np.power(i_0, v)
    return np.power(j + (r*v*t), 1.0 / v)

def og(params, t, data, i_0 = 2):
    '''
    Objective Function
    -------------------------------------------
    For optimizing the growth function w.r.t data and t.
    params = parameters (r, v) to be optimized
    '''
    r = params['r'].value
    v = params['v'].value

    j = np.power(i_0, v)
    model = np.power(j + (r*v*t), 1.0 / v)
    return data - model

############################################################################################
'''OPTIMIZATION'''
############################################################################################
#defining parameters
params = Parameters()
params.add('r', value= 1.0)
params.add('v', value= 0.74, min = 0.0) #giving starting points to our parameters

# Optimizing the objective function w.r.t the data using least squares
out_g = minimize(og, params, args=(time, total_confirmed), nan_policy = 'propagate') #fit to the cumulative confirmed case series

#fitting the model to data after parameters are optimized
fitted_model = g(time, out_g.params['r'].value, out_g.params['v'].value)
fitted_params = [out_g.params['r'].value, out_g.params['v'].value]
#############################################################################################
'''MODEL DIAGNOSTICS'''
#############################################################################################
print(fit_report(out_g))

Running diagnostics and visual inspection shows that the model is a good fit on the given data.

[Figure: reported data and fitted model]
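The comparison plot above can be reproduced along these lines; this is a plotting sketch only, assuming the model was fitted to the cumulative confirmed series (the variable names match the code above, the styling is arbitrary):

# Visual check of the fit: reported cumulative cases vs. the fitted GGM curve.
plt.figure(figsize=(8, 5))
plt.scatter(time, total_confirmed, s=12, label="Reported cumulative cases")
plt.plot(time, fitted_model, color="red", label="Fitted GGM")
plt.xlabel("Days since first reported case")
plt.ylabel("Cumulative confirmed cases")
plt.legend()
plt.show()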

Making bootstrap samples from the fitted model. Given that $(\hat r, \hat p)$ are the parameters of the fitted model, cumulative incidence is obtained from incidence as

$$C(t_j) = \sum_{i \le j} I(t_i),$$

and a Poisson error structure is imputed on the daily increments,

$$I^{*}(t_j) \sim \mathrm{Poisson}\!\left(C(t_j) - C(t_{j-1})\right).$$

Bootstrap realizations of incidence and cumulative incidence, $I^{*}(t_j)$ and $C^{*}(t_j) = \sum_{i \le j} I^{*}(t_i)$ respectively, are then generated and each is refitted with its own parameters. A distribution of parameters can be drawn from the models fitted on these realizations, and we will use this later in the code.

#########################################################################################
'''MAKE BOOTSTRAP'''
#############################################################################
def boot_sample(size, fitted_model_params, CASE = 'C' ):
    '''takes a bootstrap sample of a given size from a fitted model after imputing poisson error

    size = length of the bootstrap sample required
    fitted_model_params = parameters from the fitted model (previously optimized)
    CASE = 'C' or 'I'. i.e If our model was fitted on Cumulative incidence vs. daily incidence data, respectively
    -----------------------------------------------------------------------------------
    return => bootstrap sample of type 'list'.
    '''

    #make x-axes of length 'size'.
    x_axes = []
    for i in range(size + 1):
        x_axes.append(i)
    x_axes.remove(0) #because our first day is '1' and not day '0'
    x_axes = np.array(x_axes, dtype = np.float64)

    #fit the model to our x_axes
    fitted = g(x_axes, fitted_model_params[0], fitted_model_params[1])

    #If I have modeled on incidence data I will first need to make a cumulative data from it to make poisson error structure
    if CASE == 'I':
        Ftj_modeled = []
        for i in range(len(fitted)):
            if i < 1:
                Ftj_modeled.append(fitted[0])
            else:
                Ftj_modeled.append(sum(fitted[:i]))

    #meaning if we already have cumulative data we will continue and make the poisson error structure from it
    if CASE == 'C':
        Ftj_modeled = fitted

    #introduce poisson errors
    pois = []
    for i in range(len(Ftj_modeled)):
        if i < 1:
            pois.append(np.random.poisson(Ftj_modeled[i] - 0))
        else:
            pois.append(np.random.poisson(Ftj_modeled[i] - Ftj_modeled[i-1]))

    #make data from the poisson errors
    boot_sample = []
    if CASE == 'C':
        for i in range(len(pois)):
            if i < 1:
                boot_sample.append(pois[0])
            else:
                boot_sample.append(sum(pois[:i]))

    if CASE == 'I':
        boot_sample = pois

    return boot_sample

def boot_model_optimize(y_sample):
    '''takes a bootstrap sample and fits a curve to it
    y_sample = bootstrap sample
    ----------------------------------------------------
    1) returns fitted bootstrap model
    2) returns fitted bootstrap parameters
    '''

    #make x-axes of length 'size'.
    x_sample = []
    for i in range(len(y_sample) + 1):
        x_sample.append(i)
    x_sample.remove(0) #because our first day is '1' and not day '0'
    x_sample = np.array(x_sample, dtype = np.float64)

    boot_params = Parameters()
    boot_params.add('r', value= fitted_params[0])
    boot_params.add('v', value= fitted_params[1],  min=0.0)

    out_boot = minimize(og, boot_params, args=(x_sample, y_sample), nan_policy = 'propagate' )#for the LM algorithm
    fitted_boot_model = g(x_sample, out_boot.params['r'].value, out_boot.params['v'].value)

    #We will make the model and parameters a python 'list', to 'maybe' ease turning them into other data structures later.
    fitted_boot_model = np.array(fitted_boot_model).tolist()
    fitted_boot_params = [out_boot.params['r'].value, out_boot.params['v'].value]

    return fitted_boot_params, fitted_boot_model

#######################################################################################
def bootstrap(instances, size, fitted_model_params, case = 'C'):
    '''Makes a given number of bootstrap samples and fit curves to them.
    Returns Parameters and Models of the Bootstrapped data
    instances = number of bootstrap samples to create.
    size = size of each sample.
    fitted_model_params = parameters of model fitted on the original data.
    case = 'I' or 'C'; Whether the data is daily incidence or Cumulative incidence.
    --------------------------------------------------------------------
    1) return BOOT_SAMPLES = Nested list of bootstrap samples built off of the model fitted on the original data.
    2) return BOOT_MODELS = Nested list of model data obtained by fitting curves to these bootstrap samples.
    3) return BOOT_PARAMS = Nested list of parameters of the bootstrap models.
    '''
    BOOT_SAMPLES = []
    BOOT_MODELS = []
    BOOT_PARAMS = []
    for i in range(instances):
        Boot_Sample = boot_sample(size, fitted_model_params, case)
        Boot_Param, Boot_Model = boot_model_optimize(Boot_Sample)
        BOOT_SAMPLES.append(Boot_Sample)
        BOOT_MODELS.append(Boot_Model)
        BOOT_PARAMS.append(Boot_Param)
    return BOOT_SAMPLES, BOOT_PARAMS, BOOT_MODELS
[Figure 1]

The R(t) values are based on modelling the same GGM on incidence data; the model fits well and the R(t) values are calculated from it.

def reff(it):
    '''Rough effective reproductive number R(t) from a daily incidence series 'it'.
    Weights are drawn from a gamma distribution (shape 1.53502, scale 2.2801),
    presumably the serial-interval distribution.'''
    sx = np.random.gamma(1.53502, 2.2801, len(it))
    tx = it
    tx.insert(0, 1)   # prepend 1 and drop the last value (note: this mutates the caller's list)
    tx.pop()
    SX = []
    for i in range(len(sx)):
        if i < 1:
            SX.append(sx[0])
        else:
            SX.append(sum(sx[:i]))
    r = [x / y for x, y in zip(it, SX)]
    return r
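A quick usage sketch for reff: the incidence numbers below are made up purely to illustrate the call, and since reff mutates the list passed to it, a copy is passed.

daily_incidence = [2, 5, 9, 16, 30, 55, 90, 140, 210, 300]   # hypothetical daily counts
rt_series = reff(list(daily_incidence))                      # pass a copy; reff mutates its argument
print([round(x, 2) for x in rt_series])                      # rough R(t) estimate per day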

def rearrange(data):
    '''
    Transposes a nested list: groups the i-th element of every inner list together
    '''
    arranged_data = []
    for i in range(len(data[0])):
        arranged_data.append([j[i] for j in data])
    return arranged_data

def CI_upper_nested(data, limit = 97.5):
    '''
    Returns upper Limit of CI of a nested list
    '''
    ci_model = []
    arranged_data = rearrange(data)
    for i in range(len(arranged_data)):
        ci_upper = np.percentile(arranged_data[i], limit)
        ci_model.append(ci_upper)
    return ci_model

def CI_lower_nested(data, limit = 2.5):
    '''
    Returns lower Limit of CI of a nested list
    '''
    ci_model = []
    arranged_data = rearrange(data)
    for i in range(len(arranged_data)):
        ci_upper = np.percentile(arranged_data[i], limit)
        ci_model.append(ci_upper)
    return ci_model

def CI_upper(data, limit = 97.5):
    '''
    Returns upper Limit of CI of a list
    '''
    ci_upper = np.percentile(data, limit)
    return ci_upper
############################################################################
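Finally, a usage sketch that ties the pieces together: generate the bootstrap realizations, pull out the parameter distributions (which is what the ‘r’ and ‘p’ figures at the top of the article summarize), and build a point-wise band around the fitted curve. The instance count of 1000 matches the simulations mentioned above; everything else reuses objects already defined in the code.

# 1000 bootstrap realizations of the fitted curve, each refitted with its own parameters.
samples, boot_params, boot_models = bootstrap(
    instances=1000, size=len(time), fitted_model_params=fitted_params, case='C')

# Distribution of the growth rate r and the deceleration factor p = 1 - v across bootstrap fits.
r_values = [bp[0] for bp in boot_params]
p_values = [1.0 - bp[1] for bp in boot_params]
print("r 95% CI:", np.percentile(r_values, 2.5), "-", np.percentile(r_values, 97.5))
print("p 95% CI:", np.percentile(p_values, 2.5), "-", np.percentile(p_values, 97.5))

# Point-wise 95% band around the fitted trajectory.
upper_band = CI_upper_nested(boot_models)
lower_band = CI_lower_nested(boot_models)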
