Document Type


Degree Name



Dept. of Public Health & Preventive Medicine


Oregon Health & Science University


The U.S. government Behavioral Risk Factor Surveillance System (BRFSS) survey is an important source of demographic and health data. As with many surveys, BRFSS has missing data resulting from non-response. Because the true value of a missing observation cannot be known, the accuracy of imputation methods on real missing data cannot be measured directly. To address this problem, I created artificially missing data for two demographic variables for which the originally missing amounts were relatively small: age and race/ethnicity. Proportion estimates from each imputation method at 5%, 10%, and 20% artificially missing were compared against proportion estimates for the same variables from other governmental surveys and against the baseline imputation estimates made at the originally missing amounts, which were between 1% and 3%. I compared four approaches: no imputation, BRFSS imputation methods, multiply imputed hotdeck, and multiply imputed model-based imputation. At each level, missing data were artificially created under three conditions: where the missingness depended on the missing value itself, where it depended on the value of covariates, and where it did not depend on anything measured by the survey. I found that no imputation was, by some measures, no worse and even marginally better than any imputation method compared. This thesis has limited scope, however, and caution is recommended before researchers using BRFSS or other survey data forgo any attempt at using an imputation method.
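The three ways of creating artificially missing data described above can be illustrated with a minimal sketch. This is not the code used in the thesis: the function `make_missing`, the field names `age` and `sex`, the cutoff of 65, and the weights 2.0 and 3.0 are all hypothetical choices made for illustration. Each record receives a weight, and a fixed share of records is then selected for deletion by weighted sampling without replacement.

```python
import random

def make_missing(records, target, rate, weight_fn, seed=0):
    """Return copies of `records` with round(rate * n) values of `target` set to None.

    `weight_fn(record)` returns a positive weight; records with larger
    weights are more likely to be selected. All names here are hypothetical.
    """
    rng = random.Random(seed)
    k = round(rate * len(records))
    # Weighted sampling without replacement: give each record the key
    # u ** (1 / w) with u uniform on [0, 1), then keep the k largest keys.
    keys = [(rng.random() ** (1.0 / weight_fn(r)), i) for i, r in enumerate(records)]
    chosen = {i for _, i in sorted(keys, reverse=True)[:k]}
    return [
        {**r, target: None} if i in chosen else dict(r)
        for i, r in enumerate(records)
    ]

# Missingness does not depend on anything measured: equal weights.
def mcar(record):
    return 1.0

# Missingness depends on a covariate (assumed field "sex"; weight is illustrative).
def mar(record):
    return 2.0 if record["sex"] == "M" else 1.0

# Missingness depends on the value being removed (assumed cutoff and weight).
def mnar(record):
    return 3.0 if record["age"] >= 65 else 1.0
```

Calling `make_missing(records, "age", 0.10, mnar)` on a list of dictionaries would blank the age of 10% of the records, favoring respondents aged 65 or older. The same call with `mcar` or `mar` gives the other two conditions at the same missing rate.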




School of Medicine


