AML 612 Fall 2015/Spring 2016: Stochastic Methods

[In these modules, students will become familiar with basic computational methods for stochastic modeling.  We will cover stochastic modeling of epidemics and biological processes using Markov Chain Monte Carlo (MCMC), Stochastic Differential Equations (SDEs), and Agent Based Models (ABMs, also known as Individual Based Models, IBMs).  Stochastic methods are useful in many other settings, so we’ll also discuss other applications throughout the course. Along the way, we will discuss many other things that are critical to your future success as a researcher: how to do literature searches and build an annotated bibliography, how to organize your work, good coding practices, how to write a good research paper, and how to give a good presentation. There is no required textbook for this course.  However, I *highly* recommend Modeling Infectious Diseases in Humans and Animals by Keeling and Rohani.  It is a great introductory- to medium-level book on modeling methods (including stochastic modeling).  Another good book, at a medium to advanced level, is An Introduction to Stochastic Processes with Applications to Biology, by Linda Allen. NIMBioS also has a good web page with information related to stochastic modeling. In addition, a really nice introductory exposition on stochastic modeling by Priscilla Greenwood and Luis Gordillo can be found here]

Basics of LaTeX and BibTeX for students in mathematical biology/epidemiology

Most journals in the field of mathematical epidemiology/biology require that you submit manuscripts in LaTeX format.

LaTeX is a document preparation system in which you write a source file containing directives to the LaTeX compiler, which then produces the final document.  LaTeX provides many capabilities not available in Word; perhaps most important to mathematicians, LaTeX enables typesetting of complex equations that would be difficult, if not impossible, in Word.

In addition, LaTeX works with a reference management tool called BibTeX that makes it easy to cite references within your document.

In this module, we will discuss some simple examples of document formatting in LaTeX, and describe how to include figures and tables in the document, and cite references.


Good work habits towards being a more effective researcher

Over the years I have developed some work habits that help make me more efficient as a researcher.  The students I’ve seen struggle in their doctoral studies have always lacked several of the habits in this list.  The students who excel have all (or nearly all) of these habits, either because they were mentored in them or because they somehow figured them out themselves.

In assigned homework, students will be expected to conform to good coding and plotting practices, and to submit an annotated bibliography in BibTeX format if literature searches are required.

Many of these habits overlap with the list of good work habits in jobs in the private sector. Start using these practices now, and reap many benefits as you go along 🙂


MTBI 2015 summer lectures

[In these modules, students will become familiar with basic concepts that will enable them to fit the parameters of a mathematical model so that the model gives a good description of a data set.  Sources of data useful to mathematical epidemiology will be discussed, including online sources of data, and how to extract data from the literature using programs like DataThief.

Goodness-of-fit statistics will be discussed, as will computational methods for finding model parameters that optimize the goodness-of-fit statistic.  In particular, in the examples we will focus on fitting the parameters of compartmental models of disease dynamics to epidemic data.

Along the way, we will discuss how to do literature searches, how to build an annotated bibliography in BibTeX, how to come up with a solid research question, how to organize your work, and how to write a good research paper (essentially, many of the skills needed to excel at research!)]

Part I

Homework #1


Example using Negative Binomial likelihood for model parameter optimization

In a past module, we discussed using the Pearson chi-squared statistic to determine the best-fit parameters of an SIR model to influenza B data from the 2007-08 Midwest flu season.  In this module, we will discuss how to find the best-fit parameters using the Negative Binomial likelihood instead.
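
As a rough illustration of the two statistics mentioned here, the following R sketch computes the Pearson chi-squared statistic and the Negative Binomial negative log-likelihood for a vector of observed counts and a vector of model predictions. The function names, the toy numbers, and the over-dispersion parameterization (variance = mu + alpha*mu^2, which corresponds to size = 1/alpha in dnbinom) are assumptions made for illustration, and are not necessarily the choices made in the module itself.

```r
# Pearson chi-squared statistic: sum of (observed - predicted)^2 / predicted
pearson_chisq <- function(observed, predicted) {
  sum((observed - predicted)^2 / predicted)
}

# Negative Binomial negative log-likelihood.
# alpha is an over-dispersion parameter (assumed parameterization:
# variance = mu + alpha * mu^2, i.e. size = 1 / alpha in dnbinom).
negbinom_negloglike <- function(observed, predicted, alpha) {
  -sum(dnbinom(observed, mu = predicted, size = 1 / alpha, log = TRUE))
}

# Toy data (made up for illustration only)
observed  <- c(3, 7, 12, 20, 15, 9, 4)
predicted <- c(4, 8, 13, 18, 14, 8, 5)

print(pearson_chisq(observed, predicted))
print(negbinom_negloglike(observed, predicted, alpha = 0.1))
```

In a fit, the predicted values would come from the model (for example, the incidence predicted by an SIR model for a given set of parameters), and the best-fit parameters are the ones that minimize the chosen statistic.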


Correcting for over-dispersion when using Pearson chi-squared

In a past module, we discussed the relative merits and applicability of the Least Squares, Pearson chi-squared, Poisson likelihood, and Negative Binomial likelihood statistics.

In another past module, we discussed how to use the graphical Monte Carlo method (also known as the fmin plus a half method) to determine the one-standard-deviation confidence interval on our parameter estimates when using a likelihood statistic, and how the Least Squares and Pearson chi-squared statistics can be converted to likelihood statistics.
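
The idea behind the fmin plus a half method can be sketched in a few lines of R. The example below is a minimal illustration under simplifying assumptions (simulated Poisson counts, a single parameter, uniform random sampling of parameter hypotheses); the variable names and the sampling range are made up for this sketch.

```r
set.seed(42)

# Simulated count data with a known mean (illustration only)
y <- rpois(50, lambda = 10)

# Poisson negative log-likelihood for a hypothesized mean
negloglike <- function(lambda) -sum(dpois(y, lambda, log = TRUE))

# Monte Carlo step: randomly sample many parameter hypotheses
lambda_try <- runif(10000, min = 5, max = 15)
nll <- sapply(lambda_try, negloglike)

# Best fit, and the range of hypotheses whose negative log-likelihood
# lies within 0.5 of the minimum
best <- lambda_try[which.min(nll)]
in_interval <- nll <= min(nll) + 0.5
ci <- range(lambda_try[in_interval])

cat("best-fit lambda:", best, "\n")
cat("approximate one-standard-deviation interval:", ci, "\n")
```

Plotting nll against lambda_try, with a horizontal line at min(nll) + 0.5, gives the graphical version of the same procedure.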


A C++ class for numerically solving ODEs

In previous modules, we have described how to use methods in the R deSolve library to numerically solve systems of ordinary differential equations, like the SIR model.  One of the algorithms available in the deSolve library is the 4th order Runge-Kutta method (note that the ode() function itself uses the lsoda method unless told otherwise), which involves an iterative process to obtain approximate numerical solutions to the differential equations.  Euler’s method is an even simpler method that can be used to estimate solutions to ODEs, but 4th order Runge-Kutta is a higher order method that is more accurate for a given step size.
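
To make the comparison between the two methods concrete, here is a minimal sketch in base R (not the C++ class discussed in the module, and not the deSolve implementation). It applies one fixed-step Euler integrator and one fixed-step 4th order Runge-Kutta integrator to a test equation with a known solution; the function names and the choice of test problem are assumptions made for illustration.

```r
# One step of Euler's method for dy/dt = f(t, y)
euler_step <- function(f, t, y, h) {
  y + h * f(t, y)
}

# One step of the classical 4th order Runge-Kutta method
rk4_step <- function(f, t, y, h) {
  k1 <- f(t, y)
  k2 <- f(t + h / 2, y + h / 2 * k1)
  k3 <- f(t + h / 2, y + h / 2 * k2)
  k4 <- f(t + h, y + h * k3)
  y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
}

# Integrate from t0 to t_end with a fixed step size h
integrate_ode <- function(step, f, y0, t0, t_end, h) {
  n_steps <- round((t_end - t0) / h)
  y <- y0
  t <- t0
  for (i in seq_len(n_steps)) {
    y <- step(f, t, y, h)
    t <- t + h
  }
  y
}

# Test problem with a known solution: dy/dt = -y, y(0) = 1, so y(t) = exp(-t)
decay <- function(t, y) -y

exact <- exp(-1)
err_euler <- abs(integrate_ode(euler_step, decay, 1, 0, 1, 0.1) - exact)
err_rk4   <- abs(integrate_ode(rk4_step,   decay, 1, 0, 1, 0.1) - exact)

cat("Euler error:", err_euler, "\n")
cat("RK4 error:  ", err_rk4, "\n")
```

With the same step size, the Runge-Kutta error should come out several orders of magnitude smaller than the Euler error, at the cost of four evaluations of the derivative per step instead of one.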

Estimating parameter confidence intervals when using the Monte Carlo optimization method

[In this module, students will become familiar with estimating parameter confidence intervals when using the Monte Carlo method to estimate the best-fit parameters of a mathematical model to data.]


How to download an R script from the internet and run it

While you can input commands interactively into the R window, it is often more convenient to create a file (usually with a .R extension) that contains all the R code, and then ask R to run (aka: source) the commands in the file.

In the file short_test.R, I have put the R code to do a loop and print out the numbers one through ten.  To run this script, you first need to create a folder on your computer that we will refer to as your working directory.  Create this folder directly under your root directory on Windows (the C: directory), or directly under your home directory on other platforms, and call it short_test_dir.

Now, in R, you can use the setwd() command to change to that working directory (which tells R to look in that directory, by default, for any files you ask it to read or write).

On Windows, type at the R command prompt

setwd("C:/short_test_dir")

and on Linux or Mac OS X type

setwd("~/short_test_dir")

(the twiddle ~ means your home directory).  If you get an error at this point, it is because you did not properly create the folder short_test_dir under your home folder (or the C: directory in Windows), or you made a spelling mistake, or you forgot one or both quotes.
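
If you would rather have R check for the folder (and create it when it is missing) before calling setwd(), something along the following lines should work. This is only a sketch: enter_working_dir is a hypothetical helper name, and the demonstration uses a temporary directory so that nothing in your home directory is touched.

```r
# Hypothetical helper: change to a working directory, creating it if needed
enter_working_dir <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  setwd(path)
  invisible(getwd())
}

# Demonstration in a temporary location; in practice you would pass
# "~/short_test_dir" (or "C:/short_test_dir" on Windows)
old_dir <- getwd()
demo_dir <- file.path(tempdir(), "short_test_dir")
enter_working_dir(demo_dir)
print(basename(getwd()))

# Return to the directory we started in
setwd(old_dir)
```

You can always type getwd() at the R prompt to see which directory R is currently using.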

Now, a problem with Windows is that .R and data files downloaded from the internet tend to be saved with a .txt extension, and it is annoying to constantly have to remove it.  In addition, web browsers running on Windows often seem to assume that any text file you are looking at on the internet must be HTML, so when you download such files they put HTML headers at the top, which prevent the files from being run in R.  To get around these problems, it is usually easiest to download the files using the R function download.file().

Thus, on Windows, Linux, or Mac OS X, at the R command line type

download.file("http://www.sherrytowers.com/short_test.R","short_test.R")

If you get an error at this point, you’ve made a spelling mistake in the URL or in the local file name, or you’ve forgotten one or more quotes.
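
If you suspect that a file you obtained some other way (for instance, through a web browser) has had HTML added to the top, a quick check along these lines may help. The helper name looks_like_html and the two small demonstration files are made up for this sketch, and the check is a simple heuristic rather than a complete HTML detector.

```r
# Hypothetical helper: TRUE if the first few lines of a file look like HTML
looks_like_html <- function(path, n = 5) {
  first_lines <- readLines(path, n = n, warn = FALSE)
  any(grepl("<!DOCTYPE|<html", first_lines, ignore.case = TRUE))
}

# Demonstration with two small temporary files (made up for illustration)
good_file <- tempfile(fileext = ".R")
bad_file  <- tempfile(fileext = ".R")
writeLines(c("for (i in 1:3) {", "  print(i)", "}"), good_file)
writeLines(c("<!DOCTYPE html>", "<html><body>not R code</body></html>"), bad_file)

print(looks_like_html(good_file))
print(looks_like_html(bad_file))
```

After a successful download, looks_like_html("short_test.R") should return FALSE.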

Now, to run the file, type at the R command prompt

source("short_test.R")

You should see the numbers 1 through 10 being printed to the screen.
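
The contents of short_test.R are not reproduced here. A script that loops and prints the numbers one through ten could look something like the following sketch, which is an assumption about the file and not a copy of it.

```r
# Assumed, illustrative contents; the actual short_test.R may differ
for (i in 1:10) {
  print(i)
}
```

Note that print() puts each number on its own line, preceded by the index marker R uses when printing vectors.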

If you are creating your own .R file, you need to make sure (particularly on Windows) that a .txt extension is not appended to the end of the file name.

Now, using a text editor like Notepad (Windows) or TextEdit (Mac OS X), or whatever text editor you feel comfortable with, change the short_test.R file to print the numbers 11 to 100 on a single line, with each number separated by a space (rather than in a column).  Save the file, then make sure you can run the new file from the R command line.
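
If a stray .txt does end up on the file name, it can be removed from within R. The sketch below uses a hypothetical helper, strip_txt_extension, and demonstrates it on a temporary file with a made-up name.

```r
# Hypothetical helper: if a file name ends in ".txt", rename the file
# so that the ".txt" ending is removed
strip_txt_extension <- function(path) {
  if (grepl("\\.txt$", path)) {
    new_path <- sub("\\.txt$", "", path)
    file.rename(path, new_path)
    return(new_path)
  }
  path
}

# Demonstration on a temporary file (file name made up for illustration)
tmp <- file.path(tempdir(), "my_script.R.txt")
writeLines("print('hello')", tmp)
fixed <- strip_txt_extension(tmp)
print(basename(fixed))
print(file.exists(fixed))
```

Typing list.files() at the R prompt shows the names of the files in your working directory, which is a quick way to spot an unwanted .txt ending.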

You will be expected to have completed this exercise on your own before class, and to be adept at downloading, editing, and running R scripts.