In many health services applications, the effectiveness of a particular treatment cannot be evaluated in a controlled clinical trial, and observational studies must be used instead. Propensity score methods are useful tools for balancing the distribution of covariates between treatment groups and hence reducing the potential bias in treatment effect estimates from observational studies. A challenge in many health services research studies is the presence of missing data among the covariates that need to be balanced. In this paper, we compare three simple propensity score models using data on the effectiveness of self-monitoring of blood glucose (SMBG) in reducing hemoglobin A1c in a cohort of 10,566 patients with type 2 diabetes. The first propensity score model uses only subjects with complete covariate data (n=6,687), the second incorporates missing value indicators into the model, and the third, a pattern mixture model, fits a separate propensity score model for each pattern of missing data. We compare the results of these methods and find that incorporating missing data into the propensity score model reduces the estimated effect of SMBG on hemoglobin A1c by more than 10%, although this reduction is not clinically significant. In addition, beginning with the complete data, we artificially introduce missing data using a nonignorable missing data mechanism and compare treatment effect estimates from the three propensity score methods and a simple analysis of covariance (ANCOVA) method. In these analyses, the complete case analysis and the ANCOVA method both perform poorly, the missing value indicator model performs moderately well, and the pattern mixture model performs better still in recovering the original treatment effect observed in the complete data before the artificial missing data were introduced.
We conclude that in observational studies one must not only adjust for potentially confounding variables using methods such as propensity scores, but should also account for missing data in these models so that causal inferences can be drawn more appropriately.
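To make the three ways of handling missing covariates concrete, the following is a minimal Python sketch and not the implementation used in the study. The toy data, all function names, the choice to fill missing values with 0.0 before adding indicator columns, and the plain gradient-ascent logistic regression are assumptions made purely for illustration; real pattern-specific models would also need enough treated and untreated subjects within each missingness pattern.

```python
import math
from collections import defaultdict

def predict(w, row):
    """Estimated probability of treatment given weights [intercept, coefs...]."""
    z = w[0] + sum(wj * v for wj, v in zip(w[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, t, lr=0.1, steps=2000):
    """Plain gradient-ascent logistic regression (illustrative stand-in)."""
    p = len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for row, ti in zip(X, t):
            err = ti - predict(w, row)
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def complete_cases(X, t):
    """Approach 1: keep only subjects with no missing covariates."""
    keep = [i for i, row in enumerate(X) if None not in row]
    return [X[i] for i in keep], [t[i] for i in keep]

def with_missing_indicators(X):
    """Approach 2: fill missing values with 0.0 (an assumption) and
    append one 0/1 missingness flag per covariate."""
    out = []
    for row in X:
        filled = [0.0 if v is None else v for v in row]
        flags = [1.0 if v is None else 0.0 for v in row]
        out.append(filled + flags)
    return out

def split_by_pattern(X, t):
    """Approach 3: group subjects by which covariates are missing.
    Returns {pattern: (row indices, observed covariates, treatments)}."""
    groups = defaultdict(lambda: ([], [], []))
    for i, row in enumerate(X):
        pattern = tuple(v is None for v in row)
        idx, Xg, tg = groups[pattern]
        idx.append(i)
        Xg.append([v for v in row if v is not None])
        tg.append(t[i])
    return dict(groups)

def pattern_scores(X, t):
    """Fit one propensity model per missingness pattern and return the
    scores in the original subject order."""
    scores = [None] * len(X)
    for idx, Xg, tg in split_by_pattern(X, t).values():
        w = fit_logistic(Xg, tg)
        for i, row in zip(idx, Xg):
            scores[i] = predict(w, row)
    return scores

# Hypothetical toy data: two standardized covariates, None marks a missing
# value; t is the treatment indicator (1 = treated, 0 = untreated).
X = [[0.5, 1.2], [-0.3, None], [1.1, 0.4], [None, -0.7],
     [-1.0, -0.2], [0.2, None], [-0.6, 0.9], [None, 0.3]]
t = [1, 0, 1, 0, 0, 1, 1, 1]

# Approach 1: complete cases only (scores exist for 4 of 8 subjects).
Xc, tc = complete_cases(X, t)
w_cc = fit_logistic(Xc, tc)
scores_cc = [predict(w_cc, row) for row in Xc]

# Approach 2: one model on all subjects with missing value indicators.
Xi = with_missing_indicators(X)
w_ind = fit_logistic(Xi, t)
scores_ind = [predict(w_ind, row) for row in Xi]

# Approach 3: separate model per missingness pattern.
scores_pat = pattern_scores(X, t)
```

In this sketch the complete case approach discards half of the toy subjects, whereas the other two approaches return a propensity score for every subject; the resulting scores would then be used for matching, stratification, or weighting in the usual way.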