Example problems involving hypothesis testing

- Read in the chicago_pollution.txt file (pollution data for Chicago from 2001 to 2012).
- Aggregate NO2 by weekday, taking the mean, and again taking the standard error on the mean (note that fall_course_libs.R has an se() function that calculates the standard error on the mean of a sample). Calculate the p-value to test the null hypothesis that the average NO2 concentration does not depend on weekday (what is the statistic you will use to do this?). What is the peak day for NO2?
- Do the same for SO2, particulate matter, and ozone.
- Aggregate ozone by day-of-year for data from 2005 onwards and calculate the mean and standard error on the mean. Test the null hypothesis that the mean ozone concentration on day 359 (Dec 25th) is the same as the mean ozone concentration on day 357 (what is the statistic you will use to do this?).
- Aggregate ozone by day-of-year for all years and calculate the mean and standard error on the mean. Test the null hypothesis that the mean ozone concentration on day 359 (Dec 25th) is the same as the mean ozone concentration on day 357 (what is the statistic you will use to do this?).
- Read in the file chicago_weather_summary.txt and calculate the correlation between the daily ozone data and the daily average temperature. Are they significantly correlated? Calculate the correlation between the daily particulate matter data and the daily average temperature, and determine if they are significantly correlated.
- Read in the file chicago_crime_summary_b.txt and aggregate the number of robberies by weekday. Calculate the p-value for testing the null hypothesis that the number of robberies is the same throughout the week (what statistic would you use to do this?).
- Now aggregate the total number of crimes by weekday. Calculate the p-value for testing the null hypothesis that the fraction of crimes that are robberies is constant throughout the week (what statistic would you use to do this?).
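
As a starting point for the "what statistic?" questions above, here is a minimal sketch in Python (standard library only) of three statistics that could be used. It is one possible approach, not necessarily the intended solution, and it does not use the se() function from fall_course_libs.R. All function names are hypothetical and the numbers in the example are invented for illustration; they are not taken from the Chicago files. The sketch assumes the group means are independent and approximately Normally distributed (large samples), and that counts are large enough for the chi-square approximation.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!,
    which is exact when df is a positive even integer (e.g. df = 6
    for seven weekdays with one fitted parameter).
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this helper only handles positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def constant_mean_test(means, ses):
    """Chi-square statistic for the null hypothesis that all group means
    share one common value (estimated by the inverse-variance weighted mean)."""
    weights = [1.0 / s ** 2 for s in ses]
    pooled = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    chi2 = sum(((m - pooled) / s) ** 2 for m, s in zip(means, ses))
    df = len(means) - 1  # one degree of freedom used to fit the common mean
    return pooled, chi2, df


def two_mean_z_test(m1, se1, m2, se2):
    """Z statistic and two-sided p-value for the null hypothesis that
    two independent means are equal."""
    z = (m1 - m2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


def equal_counts_test(counts):
    """Pearson chi-square statistic for the null hypothesis that counts
    are spread evenly over the categories."""
    expected = sum(counts) / len(counts)
    chi2 = sum((c - expected) ** 2 / expected for c in counts)
    return chi2, len(counts) - 1


if __name__ == "__main__":
    # Invented weekday means and standard errors (Sunday..Saturday),
    # for illustration only.
    means = [21.5, 24.1, 25.0, 25.3, 25.2, 24.9, 22.0]
    ses = [0.3] * 7
    pooled, chi2, df = constant_mean_test(means, ses)
    print("pooled mean:", pooled)
    print("chi2:", chi2, "df:", df, "p:", chi2_sf_even_df(chi2, df))

    # Invented means and standard errors for two days of the year.
    z, p = two_mean_z_test(18.2, 1.1, 20.4, 1.3)
    print("z:", z, "two-sided p:", p)

    # Invented weekday event counts.
    chi2_counts, df_counts = equal_counts_test([510, 480, 495, 502, 530, 560, 523])
    print("chi2:", chi2_counts, "p:", chi2_sf_even_df(chi2_counts, df_counts))
```

The closed-form tail probability is used only to avoid a dependency on a statistics package; it covers the seven-weekday case (6 degrees of freedom) but not odd degrees of freedom. With small samples per group, a Student-t based comparison would be a more cautious choice than the Normal approximation used in two_mean_z_test.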
