# Some examples of bootstrapping and stochastic sampling methods in statistical analyses

In statistical analyses, the term “bootstrapping” refers to methodology that involves randomly sampling a population with replacement.  In this module, we’ll discuss some examples of bootstrapping and other stochastic sampling methods I frequently apply in my own analyses.

Visits: 1930

Visits: 1268

# Monte Carlo methods for assessing uncertainty in model estimates due to uncertainties in parameters

[This module presents Monte Carlo stochastic methods that can be used to assess uncertainty in model estimates due to uncertainty in one or more parameters]

Visits: 1239

# Agent based disease modelling with networked populations

[This module presents a simple introduction to agent based modelling in R of contagion spreading on a network]

Visits: 1310

Visits: 513

# Stochasticity due to sampling in biological or social science data

[In the Spring AML612 course at ASU, we have discussed stochastic modelling methods, including Markov Chain Monte Carlo, Stochastic Differential Equations, and agent based models.  Here we discuss how random sampling can contribute stochasticity to observed data]

Visits: 1245

# Combining stochastic modelling methods: MCMC and SDE’s

[In past modules, we have discussed Markov Chain Monte Carlo methods and Stochastic Differential Equations with Gaussian noise for stochastic modelling of compartmental models. In this module, we will describe how combing the two has the potential to simultaneously optimize computational efficiency and accuracy]

Visits: 1935

# Stochastic compartmental modelling with Stochastic Differential Equations

[In this module, students will become familiar with stochastic modelling with Stochastic Differential Equations for compartmental models and how they compare and contrast with Markov Chain Monte Carlo Methods.]

A good, but much more mathematical, introduction to the material discussed here can be found in the paper Construction of Equivalent Stochastic Differential Equation Models by Allen et al (2008).

Visits: 2989

# Stochastic compartmental modelling with Markov Chain Monte Carlo: Part II

[In this module, students will become familiar with Markov Chain Monte Carlo compartmental modelling by sampling from the Exponential distribution to simulate the time between state transitions.  Several examples will be given, including modelling of Aedes aegypti populations, the vector for Zika virus.]

In this past module, we discussed the Exponential and Poisson distributions.  In particular we noted the duality of these two distributions; the Exponential distribution describes the time between events in a Poisson process.

In this past module, we discussed how to perform simple Markov Chain Monte Carlo stochastic modelling of a compartmental model, by sampling at each small time step the change in compartments from the Poisson distribution.

Because of the duality of the Exponential and Poisson distributions we can just as easily perform MCMC stochastic compartmental modelling randomly Exponentially sampling the time between state transitions.

In order to do this, we make use of this special property of the Exponential distribution: If X~Exp(gamma) and Y~Exp(rho), then min(X,Y) is distributed as Exp(lambda) with lambda=gamma+rho
The proof of this property can be found here.

Thus, for example, given a model with two possible state transitions, with expected rate gamma and rho, respectively, we expect the rate for at least one of the transitions to take place to be gamma+rho.  Thus, we would randomly sample X~Exp(gamma) and Y~Exp(rho), and whichever is smallest is the state transition we allow to take place, and we increment the time by delta_t = min(X,Y).

Example: SIR model

Take, for example an SIR model:

The list of state transitions is:

Thus, infection is expected to occur at rate beta*S*I/N, and recovery is expected to occur at rate gamma*I.  Thus, we would randomly sample X~Exp(beta*S*I/N) and Y~Exp(gamma*I); if X is less than Y then an infection takes place, and we increment time by delta_t=X.  If Y is less than X, then a recovery takes place, and delta_t=Y.

The R script sir_markov_chain_iterated_exponential.R performs an MCMC stochastic simulation of this system, with options to sample from the Exponential distribution (lexponential=1 in the script), or the Poisson distribution (lexponential=0).  The script produces the plot:

Try running the script again with lexponential=0 to see how the plot changes.

Example: SEIR Model

The Susceptible, Exposed, Infected, Recovered model is described by the set of differential equations:

The table of possible transitions is

In this case, at each step of the simulation we randomly sample X~Exp(beta*S*I/N), Y~Exp(kappa*E), and W~Exp(gamma*I), and find the minimum, and select that as the transition that occurred.

The script seir_markov_chain_iterated_exponential.R performs 50 realisations of a MCMC stochastic SEIR model, sampling from the Exponential probability distribution to simulate the time between state transitions. The script produces the following plot:

Another example: Wolbachia in mosquito populations as a means to control Dengue, Chikungunya, and Zika virus outbreaks

Dengue virus is a mosquito borne arbovirus, primarily spread between humans by Aedes aegypti, a tropical/semi-tropical mosquito.  Because of the wide range of the mosquito across the world, approximately one third to half of the world’s population is at risk for DENV.

DENV typically causes a flu-like illness, but in some cases can cause haemorrhagic fever and death.  There are four serotypes of the virus, and infection with one will confer immunity to it, but not the other three.  Thus it is possible to get DENV several times.  There are currently no vaccines or specific treatment for DENV.

DENV does not spread directly from human-to-human.  Instead, a mosquito bites an infected person, and after a period of time (known as the extrinsic incubation period) the virus replicates in the mosquito’s gut, such that when it takes another blood meal it infects another person.

Chikungunya virus, CHIKV, is also spread Aedes aegypti mosquitoes.  Unlike DENV it only has one known serotype, so recovery from CHIKV confers lasting immunity.  However, new susceptible babies are always being born into the population, thus periodic outbreaks can occur when enough susceptibles build up, and an infected person is introduced to the population.  CHIKV rarely kills, but can cause crippling sequelae, including severe arthritis.  There is currently no vaccine or specific treatment for CHIKV.  In 2014/15 there was an explosive outbreak of CHIKV in the Americas; the entire population was susceptible at the beginning of the outbreak.

Zika virus is also spread by Aedes.  Like CHIKV, there is only one serotype, so recovery results in lasting immunity. There is currently no vaccine or specific treatment for Zika. Compared to DENV and CHIKV, Zika results in the most mild infection, and almost never kills.  In fact, it has been estimated that 80% of cases are asymptomatic.  However, there is recent evidence from Brazil and other countries that strongly suggests that infection with Zika virus early in pregnancy can cause significant birth defects in the fetus.  Thus, the most telling evidence of a large Zika outbreak is not the burden on the healthcare system of those who are actively infected, but the babies born with defects 6 to 9 months later.

It is for this reason that the WHO has declared a global health emergency due to Zika at the beginning of February, 2016.  Particularly worrisome is the fact that the 2016 summer Olympics are to be held in Brazil, meaning that there will likely be a huge influx of largely susceptible people to the region, and highly elevated probabilities of worldwide dispersal of the virus.  The only bright side is that the peak season for Aedes aegypti in Brazil is January to May...

It is worth noting here that Zika comes under the category of “neglected tropical diseases”, which are diseases primarily affecting low-income countries, and get very little research funding.  Zika has been so neglected, that very little study has been done of its epidemiology, including its incubation and infectious period, and the extrinsic incubation period in mosquitoes.

Currently, the only known means to control the spread of these viruses is to either prevent the mosquitoes from biting by wearing long sleeves or repellents (bed nets are not much use because Aedes primarily bite during the day), or to control the mosquito populations through the use of larvacides, insecticides, or removing breeding containers.  Almost complete eradication of Aedes aegypti in the Americas was achieved in the 1960’s with the latter methods by a man named Fred Soper.  Between the 1930’s to the 1960’s using DDT, and exhaustive house breeding-site searches, he managed to virtually eliminate Aedes aegypti throughout the Americas, and with it Dengue and Yellow Fever.

Aedes aegypti are “domestic” mosquitoes.  They had evolved along with humans, and prefer to breed in the nice clean water we store in tanks in our houses rather than in water in rivers or ponds.  We are also their favourite snack.  A tank of water can only hold so many larvae, however, because they are cannabilistic.  Perhaps largely due to this, the number of Aedes aegypti per household is usually quite modest… usually just a few per person. Note that only female mosquitoes bite… they need a blood meal in order to lay eggs.

It is believed that the spread of diseases like DENV, CHIKV and Zika is less due to dispersion of mosquitoes, and more due to the movement of humans between patches of mosquitoes.

As an aside, and related to dispersal distances; why do you think Aedes are more prevalent in cities?

A novel means of Aedes control involves Wolbachia bacteria. Wolbachia are bacteria that live within insect cells and are passed from one generation to the next through the insect’s eggs. Wolbachia is present in up to 60% of all the different species of insects around us including some mosquitoes that bite people, but not Aedes aegypti – the primary mosquito species involved in the transmission of DENV, CHIKV, and Zika.

We can produce Wolbachia-infected Aedes aegypti mosquitoes artificially in the laboratory (let’s refer to them as Wolbachia mosquitoes).  It turns out that Wolbachia mosquitoes have some interesting properties…. infection with Wolbachia impedes the the ability of DENV (and presumably CHIKV, and Zika) to replicate in the tissues of the mosquitoes.

Also, when a male mosquito infected with Wolbachia mates with a wild female mosquito, the wild female will not produce viable offspring.  When Wolbachia female mosquitoes mate with either Wolbachia males or wild males, they produce Wolbachia mosquitoes.

There is a hitch though; infection with Wolbachia appears to make Aedes mosquitoes live shorter lives. Also, it appears that their larvae aren’t quite as cannabilistic as wild larvaeAlso, their eggs are more likely to die if dessicated. This reduced fitness is almost certainly why we don’t see Wolbachia mosquitoes naturally in the wild.

However, the two very nice properties of Wolbachia mosquitoes (suppression of DENV and reduction in viable progeny of wild females) are so hopeful that the Bill and Melinda Gates Foundation has invested large amounts of money to examine the potential further.  The effect on wild populations of releases of all male Wolbachia mosquitoes have been studied, as well as the effect of releases of both male and female Wolbachia mosquitoes.  Note that the problem with releasing females is that even though they aren’t good transmitters of DENV, they still bite, and that annoys the populace.

Researchers in Australia did a test project where they did 10 releases, over 10 weeks, of male and female Wolbachia mosquitoes in the areas of about 1200 houses in Cairns, Australia.  Each week, approximately 20 to 25 mosquitoes per house were released.  Before the releases began, the researchers went door to door and got people to empty all sources of water in their yards.

As an aside here, Wolbachia isn’t always a good thing when it comes to human diseases; the disease filariasis is spread by a parastic worm that depend on Wolbachia to grow.  Controlling that disease requires reducing Wolbachia in the worms.

If we wish to use mathematical models to predict what will happen when Wolbachia mosquitoes are released, we need to take into account the inherently patchy nature of mosquito populations.  There may be, for instance, 50,000 houses in a city, with on average 10 wild Aedes per house.  But that doesn’t mean that every house has Aedes, and it certainly doesn’t mean that all 500,000 mosquitoes mix together and mate.  They are much more likely to mate with mosquitoes in near geographic proximity.  And things like roads, walls around houses, or industrial areas, can form boundaries of their patch that are not impossible to cross, but definitely represent barriers that impede their dispersal.

Because there are so few mosquitoes per house, and geographic patches like a city block or clusters of houses aren’t that large, stochastic models are appropriate. In the R file wolbach.R, I provide an example of a stochastic MCMC model for populations of mixed wild and Wolbachia mosquitoes.  The model I use has to take into account that there may be unequal numbers of males and females, particularly in the Wolbachia compartment, and that Wolbachia males and wild females cannot produce viable offspring.  The model I use in the script is (note the X’s are the wild mosquitoes, and the W’s are the Wolbachia mosquitoes):

Where A_Xf is the number of babies in the X compartment due to pairings of Xf, A_Xm is the number of babies in the X compartment due to pairings of Xm, and so on….
And where B_Xf is the number of babies in the W compartment due to pairings of Xf, B_Xm is the number of babies in the W compartment due to pairings of Xm, and so on….
Thus:

Where sigma is the fecundity of the X females, and rho is the fecundity of the W females.

A model similar to this one can be found here (warning: there are many typos in their model formulation).

In the wolbach.R script, I run the simulation for 90 days (about the length of a rainy season), and see how many X’s and W’s remain in the population.  With this model, we can examine the effect of releasing just male Wolbachia into the wild population, or a mix of male and female.  Running 100 realisations with an initial wild population of 100, and a Wolbachia population of 100 equally mixed between male and female, produces the following plot:

Questions: what are the assumptions of the model? How can we make it more realistic?

Visits: 4935

# Compartmental modelling with stochastic differential equations

[In this module, students will become familiar compartmental modelling with stochastic differential equations]

A good, but more mathematical, introduction to the material discussed here can be found in the paper Construction of Equivalent Stochastic Differential Equation Models by Allen et al (2008).

The content a previous module, formalism for preparing to do stochastic compartmental modelling, is imperative as a precursor to the content of this module.

Stochastic differential equation models can potentially be used for any dynamical system whose evolution can be described by a system of compartmental ODE’s, under certain provisos…

Visits: 1546

# Formalism for preparing to do stochastic compartmental modelling

[In this module, students will become familiar with a formalism for structuring information related to compartmental models. This formalism is very helpful when developing either Markov Chain Monte Carlo or Stochastic Differential Equations for stochastic simulation of the model]

A good, but much more mathematical, introduction to the material discussed here can be found in the paper Construction of Equivalent Stochastic Differential Equation Models by Allen et al (2008).

Visits: 1735

# Stochastic compartmental modelling with Markov Chain Monte Carlo: Part I

[In this module, students will become familiar with Markov Chain Monte Carlo, and what processes can and cannot be modelled with MCMC. Simple examples of SIR and SEIR disease models will be presented.]

Markov Chains are a random process by which a system transitions from one state to another within a state space.  They are memoryless, in the sense that the possible states the system can transition to at time t depend only on the current state of the system, not on how it got there. Continue reading

Visits: 5310

# Memoryless property of the exponential distribution

For a compartmental disease model, like the SIR model, the sojourn times in the compartments are inherently Exponentially distributed.  An interesting property of the Exponential distribution is that the probability of an individual leaving a compartment in a small time step is completely independent of the time the individual has spent in the compartment.

Visits: 3346

# Probability distributions important to modelling in the life and social sciences

[In this module, we discuss several probability distributions that are important to statistical, mathematical, and computational modeling in the life and social sciences.

Note that the names of probability distributions are proper nouns.  Thus, when you write about them, you should always capitalize them.]

Visits: 3118

# Difference between Markov Chain Monte Carlo, Stochastic Differential Equations, and Agent Based Models

[After reading this module, you will be aware of the limitations of deterministic epidemic models, like the SIR model, and understand when stochastic models are important.  You will be introduced to three different methods of stochastic modelling, and have a basic understanding of the appropriate applications of each.] Contents:

Visits: 3662

# How to give a good presentation

Throughout your academic career, from graduate studies onward, you will regularly need to give presentations of your work to various audiences. Everything from in-class presentations, conference presentations, to your thesis defense, seminars and colloquia.

I’ve sat through too many bad talks to count.  In fact, I would say that only one out of every four or five talks that I’ve sat through has been a presentation I would consider “good” (and not just “good” because the topic was directly related to my own research, but good because it was engaging and informative).

There are various qualities that make a “good” presentation, some of which are easy to quantify. Below I give a list of qualities that I’ve found tend to make good (or, conversely, bad) presentations.

Visits: 4967