The authors aimed to use health records data to examine how the accuracy of statistical models predicting self-harm or suicide changed between 2015 and 2019, as health systems implemented suicide prevention programs. Data from four large health systems were used to identify specialty mental health visits by patients aged ≥11 years, assess 311 potential predictors of self-harm (including demographic characteristics, historical risk factors, and index visit characteristics), and ascertain fatal or nonfatal self-harm events in the 90 days after each visit. New prediction models were developed using logistic regression with LASSO (least absolute shrinkage and selection operator) penalization in random samples of visits (65%) from each calendar year and were validated in the remaining 35% of visits. A model developed for visits from 2009 to mid-2015 showed similar classification performance and calibration in a new sample of about 13.1 million visits from late 2015 to 2019. Area under the receiver operating characteristic curve (AUC) ranged from 0.840 to 0.849 in the new sample, compared with 0.851 in the original sample. New models developed for each year from 2015 to 2019 had classification performance (AUC range, 0.790-0.853), sensitivity, and positive predictive value similar to those of the previously developed model. Models selected similar predictors from 2015 to 2019, except for more frequent selection of depression questionnaire data in later years, when questionnaires were more frequently recorded. In summary, a self-harm prediction model developed with 2009-2015 visit data performed similarly when applied to 2015-2019 visits, and new models neither yielded superior performance nor identified different predictors.
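The modeling workflow described above (a random 65%/35% development/validation split, LASSO-penalized logistic regression, and AUC on the held-out visits) can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, six simulated predictors stand in for the 311 used in the study, the model is fit with a simple proximal gradient solver chosen here for self-containment, and all function names and tuning values (`lam`, `lr`, `epochs`) are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 (LASSO) penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_lasso_logistic(X, y, lam=0.03, lr=0.5, epochs=150):
    # Proximal gradient descent on mean log-loss + lam * sum(|w_j|).
    # The intercept b is not penalized. lam, lr, epochs are assumed values.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

def auc(scores, labels):
    # Rank-based (Mann-Whitney) AUC; tied scores receive average ranks.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, lab in zip(ranks, labels) if lab == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def simulate_visits(n, rng):
    # Synthetic stand-in for visit records (NOT study data): six
    # standard-normal predictors, only the first two affect the outcome.
    true_w = [1.5, -1.0, 0.0, 0.0, 0.0, 0.0]
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in true_w]
        p = sigmoid(-2.0 + sum(wj * xj for wj, xj in zip(true_w, x)))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y

def run_demo(seed=0, n=1500):
    rng = random.Random(seed)
    X, y = simulate_visits(n, rng)
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(0.65 * n)  # 65% development, 35% validation
    train, valid = idx[:cut], idx[cut:]
    w, b = fit_lasso_logistic([X[i] for i in train], [y[i] for i in train])
    scores = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, X[i])))
              for i in valid]
    return w, b, auc(scores, [y[i] for i in valid])

if __name__ == "__main__":
    w, b, val_auc = run_demo()
    print("coefficients:", [round(wj, 3) for wj in w])
    print("validation AUC:", round(val_auc, 3))
```

In this sketch the L1 penalty tends to shrink coefficients of uninformative predictors to exactly zero, which is the sense in which a LASSO model "selects" predictors. Repeating the fit separately on each calendar year's visits and comparing the nonzero coefficients and validation AUCs would mirror, in miniature, the year-by-year comparison the abstract reports.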