**[In this module, students will become familiar with time series analysis methods, including lagged regression methods, Fourier spectral analysis, harmonic linear regression, and Lomb-Scargle spectral analysis]**

# Category Archives: R code

# Review of Probability Distributions, Basic Statistics, and Hypothesis Testing

**[In this module, students will learn about probability distributions important to statistical modelling, focussing primarily on probability distributions that underlie the stochasticity in time series data.**

**In addition, in this course we will be learning how to formulate figure-of-merit statistics that can help to answer research questions like “Is quantity A significantly greater/less than quantity B?”, or “Does quantity X appear to be significantly related to quantity Y?”. As we are about to discuss, statistics that can be used to answer these types of questions do so using the underlying probability distribution to the statistic. Every statistic used for hypothesis testing has an underlying probability distribution.]**

# The difference between mathematical and statistical modelling (plus some more basics of R)

**[In this module, we will discuss the difference between mathematical and statistical modelling, using pandemic influenza as an example. Example R code that solves the differential equations of a compartmental SIR model with seasonal transmission (ie; a mathematical model) is presented. Also provided are an example of how to download add-on library packages in R, plus more examples of reading data sets into R, and aggregating the data sets by some quantity (in this case, a time series of influenza data in Geneva in 1918, aggregated by weekday).**

**Delving into how to write the R code to solve systems of ODE’s related to a compartmental mathematical model is perhaps slightly off the topic of a statistical modelling course, but worthwhile to examine; as mathematical and computational modellers, usually your aim in performing statistical analyses will be to uncover potential relationships that can be included in a mathematical model to make that model better describe the underlying dynamics of the system]**

- Introduction
- Susceptible Infected Recovered (SIR) influenza model with periodic transmission rate
- The time-of-introduction of the virus is a parameter of the harmonic SIR model
- R code for SIR model simulation with harmonic transmission rate
- Example of aggregating a data set by some quantity Continue reading

# AML610 module XI: practical problems when connecting deterministic models to data

**Some (potentially) useful utilities for random number generation and manipulating vectors in C++**

I’ve written some C++ code mainly related to vectors; calculating the weighted mean, running sum, extracting every nth element, etc). There are also utilities related to random number generation from various probability distributions, and methods to calculate the CDF of various probability distributions.

The file UsefulUtils.h and UsefulUtils.cpp contain source code of a class that contains these utilities that can be useful when performing compartmental modelling in C++. These utilities will be used extensively in the examples that will be presented in this, and later, modules. The file example_useful_utils.cpp gives examples of the use of the class. It can be compiled with the makefile makefile_use with the command

make -f makefile_use example_useful_utils

**Homework #4, due April 3rd, 2013 at 6pm. The data for the homework can be found here.**

# Numerical methods to solve ordinary differential equations

**After going through this module, students will be familiar with the Euler and Runge-Kutta methods for numerical solution of systems of ordinary differential equations. Examples are provided to show students how complementary R scripts can be written to help debug Runge-Kutta methods implemented in C++.**

**Contents**

- Euler’s method (finite difference method)
- Using the Euler method to solve dN/dt = rho*N
- Implementing the Euler method in C++
- Comparison of the output of the C++ and R programs implementing the Euler method to solve dN/dt=rho*N
- Dynamic time step calculation in the Euler method
- Runge-Kutta method
- Example of implementation of Runge Kutta in C++: Lotka-Volterra predator prey model

# ASU AML 610: probability distributions important to modelling in the life and social sciences

**[After reading this module, students should be familiar with probability distributions most important to modelling in the life and social sciences; Uniform, Normal, Poisson, Exponential, Gamma, Negative Binomial, and Binomial.]**

Contents:

Introduction

Probability distributions in general

Probability density functions

Mean, variance, and moments of probability density functions

Mean, variance, and moments of a sample of random numbers

Uncertainty on sample mean and variance, and hypothesis testing

The Poisson distribution

The Exponential distribution

The memory-less property of the Exponential distribution

The relationship between the Exponential and Poisson distributions

The Gamma and Erlang distributions

The Negative Binomial distribution

The Binomial distribution

There are various probability distributions that are important to be familiar with if one wants to model the spread of disease or biological populations (especially with stochastic models). In addition, a good understanding of these various probability distributions is needed if one wants to fit model parameters to data, because the data always have underlying stochasticity, and that stochasticity feeds into uncertainties in the model parameters. It is important to understand what kind of probability distributions typically underlie the stochasticity in epidemic or biological data.

Continue reading