Introduction to regression modeling of longitudinal and clustered data from epidemiology and health sciences. Interpretation and familiarity with software gained by analysis of data and critiques of published analyses. Prerequisite: Either BIOST 513, BIOST 515, BIOST 518, BIOST 536, or permission of instructor. Offered: Sp.
The aims of this course are 1) to introduce the concepts of correlated data, to describe the basic structures of correlated data, and to explain how correlation arises in common study designs; 2) to contrast the behavior of correlated data with uncorrelated data and to show how the behavior of correlated data influences design and statistical analysis; 3) to show how to analyse correlated data arising from several common correlated data structures using statistical computing packages such as STATA; and 4) to introduce more advanced topics in the analysis of correlated data.
As suggested by these aims, the course seeks to develop an understanding of correlated data, including how it arises, its implications for statistical inference, and how to accommodate it in statistical analysis. At the end of this course, the student should: 1) be able to recognize correlated data and explain how it arises; 2) understand the impact of correlated data on design and statistical analysis; 3) know the basic structures of correlated data; 4) be able to formulate models for real-life correlated data and correctly interpret the parameters of the model; 5) be able to choose appropriate analysis methods for correlated data and explain them to a non-statistical audience; 6) know how to perform several methods of analysis of correlated data using statistical packages and be able to recognize situations that cannot be addressed by these techniques and that require expert assistance; and 7) be familiar with some of the key references on correlated data and be prepared for the study of more advanced correlated data methods.
Student learning goals
General method of instruction
Class time will consist of lectures and class discussion of assigned readings and homework problems.
The intended audience for this course is graduate students who have had an introduction to biostatistics and who a) understand basic probability concepts such as random variable, expectation, variance, and correlation; b) understand basic statistical concepts such as the distinction between populations and samples from a population, parameter estimates, standard error, hypothesis test, and confidence interval; and c) are able to carry out statistical analyses such as analysis of variance, linear regression, and logistic regression, and explain them to an epidemiological audience. Familiarity with standard epidemiologic study designs and their analysis is beneficial, as is previous experience with the statistical package STATA. BIOST 512 & 513 should be adequate preparation for this course, but BIOST 517/518 or BIOST 536 are preferred.
Class assignments and grading
Homework will be assigned weekly. Homework problems will include assigned readings, critiques of journal articles, data analyses, and detailed presentation and interpretation of results. Some assignments will require the use of the statistical package STATA (available in the Health Sciences Library computing lab); other assignments can be done using a package of the student's choice (SAS and Splus are recommended in addition to STATA).
In addition to homework assignments, there will be two exams: an in-class mid-term exam on May 8, and a final exam on June 13. The exams will include questions that test knowledge of definitions and concepts covered in the notes, understanding of these concepts, knowledge of appropriate methods of analysis in a given situation, and ability to correctly interpret results of a data analysis based on computer output. The final course grades will be based on the following components: Homework Assignments (40%), Mid-Term Exam (30%), Final Exam (30%).