AML 610 Fall seminar series: Module IV linear regression

In this module, students will become familiar with linear regression methods. Note that before proceeding with any regression analysis, it is important to first perform initial data exploration, both with visualization analysis with histograms, boxplots, and scatter plots, and numerical summaries of the variables like the mean, standard deviations, maxima and minima, and correlations between variables.  In this way, you can determine if there are any unexpected “quirks” or problems with the data (and, more often than not, there are).

Continue reading

Exploratory Data Analysis: examples

Exploratory data analysis essentially is the process of getting to know your data by making plots and perhaps doing some simple statistical hypothesis tests.  Getting to know your data is important before starting the process of regression analysis or any kind of more advanced hypothesis testing, because, more often than not, real data will have “issues” that complicate statistical analyses.

Continue reading

Optimizing model parameters to data (aka inverse problems)

[This presentation discusses methods commonly used to optimize the parameters of a mathematical model for population or disease dynamics to pertinent data.  Parameter optimization of such models is complicated by the fact that usually they have no analytic solution, but instead must be solved numerically. Choice of an appropriate "goodness of fit" statistic will be discussed, as will the benefits and drawbacks of various fitting methods such as gradient descent, Markov Chain Monte Carlo, and Latin Hypercube and random sampling.  An example of the application of the some of the methods using simulated data from a simple model for the incidence of a disease in a human population will be presented]

Continue reading