Browse the corpus

Walk the Even Hospital Database by book and chapter — the raw source passages that ground Ask, DDx, and the rest.

46 passages

abstractpubmed· Abstract· item 39869633

Longitudinal Reliability of Milestones Learning Trajectories during Anesthesiology Residency. BACKGROUND: Longitudinal Milestones data reported to the Accreditation Council for Graduate Medical Education provide a structured framework for assessing the developmental progression of residents in key competencies and subcompetencies. This study aims to investigate the previously underexplored longitudinal reliability of Milestones data, with the goal of identifying patterns in learning trajectories that can inform targeted interventions for residents and programs. METHODS: A retrospective cohort study was conducted with national anesthesiology Milestones data collected from 2014 to 2020. Mixed-effects growth curve models were fit to model residents' growth trajectories. Longitudinal reliability was assessed using the indices of growth rate reliability and growth curve reliability. This study also examined variance components attributable to the factors at both the learner and program levels. Latent class growth analyses were performed to identify latent groups of learners with different learning trajectories. RESULTS: The study included a total of 682,475 ratings for 4,976 learners in 140 programs. Growth curve model results indicated that the mean baseline Milestone rating across the 25 subcompetencies was 2.05 (95% CI, 1.96 to 2.14), with an average increase of 0.49 (95% CI, 0.48 to 0.51) units per reporting period. The growth rate reliability (mean ± SD, 0.58 ± 0.03) suggested a moderate capability of anesthesiology Milestones to detect individual differences in the growth of latent competency. Growth curve reliability estimates (mean ± SD, 0.71 ± 0.02) suggested acceptable overall reliability of Milestones across all the six assessment points. Significant variability was observed at both the program and learner levels (P < 0.001). Latent class growth analyses identified 3 to 4 latent groups of learners with distinct learning trajectories across the 25 subcompetencies. CONCLUSIONS: The study indicated that the anesthesiology Milestones provide moderately reliable information for tracking individual progress over time. The findings underscore the importance of using a multifaceted approach to assessment and providing individualized learning plans to support resident development.

fulltextpubmed· Editor’s Perspective· item 39869633

The Accreditation Council for Graduate Medical Education (ACGME) Anesthesiology Milestones were implemented into anesthesiology residency programs in 2014, but little is known about their efficacy. Based on the ACGME reports between 2014 and 2020, this study indicates that the anesthesiology Milestones version 1.0 provide moderately reliable information for tracking an individual’s progress over time and may provide an opportunity for program directors to implement individualized learning plans to support resident development. Longitudinal Milestones data reported to the Accreditation Council for Graduate Medical Education (ACGME) provide a structured framework for assessing the developmental progression of residents in key competencies and subcompetencies.1 These Milestones track residents’ progress along developmental trajectories, highlighting key points of growth and aiding in curriculum and assessment planning. Further investigation is warranted to strengthen the validity evidence supporting the use and interpretation of Milestones data, particularly in terms of content, predictive, and construct validity. A longitudinal and developmental perspective on assessment data is crucial to understanding how trainee competence evolves over time. This research has the potential to enhance educational interventions, improve learner support, and bolster confidence in high-stakes decisions based on Milestones data.2–4

fulltextpubmed· Editor’s Perspective· item 39869633

ruct validity. A longitudinal and developmental perspective on assessment data is crucial to understanding how trainee competence evolves over time. This research has the potential to enhance educational interventions, improve learner support, and bolster confidence in high-stakes decisions based on Milestones data.2–4 Learning trajectories illustrate the pattern and rate at which learners acquire competencies toward unsupervised practice.5,6 Anesthesiology Milestones, as assessed by both residents and faculty, also have a positive linear relationship with postgraduate year.7 While previous research has examined Milestones data at the national level8,9 and identified variations in individual learning trajectories, the longitudinal reliability of these trajectories in anesthesia remains underexplored. These findings suggest that the Milestones reporting system provides reliable longitudinal data for individualized tracking of progress in all subcompetencies.

fulltextpubmed· Editor’s Perspective· item 39869633

level8,9 and identified variations in individual learning trajectories, the longitudinal reliability of these trajectories in anesthesia remains underexplored. These findings suggest that the Milestones reporting system provides reliable longitudinal data for individualized tracking of progress in all subcompetencies. In this study, we leveraged a national longitudinal cohort of anesthesiology residents to investigate the previously underexplored longitudinal reliability of Milestones 1.0 data. Formally implemented in 2014, Milestones 1.0 provided a structured framework for assessing resident progression across six core competencies, encompassing 25 subcompetencies with detailed behavioral anchors. This version served as the foundation for residency training and assessment for nearly a decade, offering rich, longitudinal data for analysis. Our goal is to uncover broader patterns in resident learning trajectories, going beyond simply identifying individuals who may be struggling. These patterns can inform the timing and nature of targeted interventions for both residents and programs, contribute to data-driven curriculum development, and enable program benchmarking through comparisons with national data.

fulltextpubmed· What This Article Tells Us That Is New· item 39869633

Based on the ACGME reports between 2014 and 2020, this study indicates that the anesthesiology Milestones version 1.0 provide moderately reliable information for tracking an individual’s progress over time and may provide an opportunity for program directors to implement individualized learning plans to support resident development.

fulltextpubmed· Materials and Methods· item 39869633

This is a retrospective cohort design with national anesthesiology Milestones 1.0 data collected from 2014 to 2020. Four cohorts of residents who entered training between 2014 and 2017 (i.e., postgraduate year 2 or clinical anesthesia year 1) and graduated between 2017 and 2020 (i.e., postgraduate year 4 or clinical anesthesia year 3) were analyzed, resulting in 4,976 residents from 140 programs. Participants’ information was deidentified for analysis purposes. This study was deemed exempt by the Institutional Review Board of Stanford University (eProtocol No. 70214). The data were provided by the ACGME.

fulltextpubmed· Materials and Methods· item 39869633

raduate year 4 or clinical anesthesia year 3) were analyzed, resulting in 4,976 residents from 140 programs. Participants’ information was deidentified for analysis purposes. This study was deemed exempt by the Institutional Review Board of Stanford University (eProtocol No. 70214). The data were provided by the ACGME. The anesthesiology Milestones 1.0 comprise 25 subcompetencies across six core dimensions: patient care, medical knowledge, system-based practice, practice-based learning and improvement, professionalism, and interpersonal and communication skills. The number of subcompetencies for each of the six core competencies ranges from 1 subcompetency for medical knowledge to 10 subcompetencies for patient care. Each subcompetency is rated on a 10-point rating scale with 0.5-unit interval, from level 0 (has not yet achieved Milestone level 1) to 5 (achievement greater than expected). While the target goal of training is to attain a level of 4 by the time of program completion, indicating readiness for unsupervised practice, this is not a mandatory requirement for graduation.10 The clinical competency committee within each program conducted semiannual assessments of residents, resulting in six evaluations over their 3-yr training period. Previous research provided validity evidence for the use and interpretation of Milestones scores.11

fulltextpubmed· Materials and Methods· item 39869633

ce, this is not a mandatory requirement for graduation.10 The clinical competency committee within each program conducted semiannual assessments of residents, resulting in six evaluations over their 3-yr training period. Previous research provided validity evidence for the use and interpretation of Milestones scores.11 Descriptive statistics, such as the median and interquartile range of Milestones ratings, are displayed in box plots for the six core competencies and each of the subcompetencies. Mixed-effects growth curve models were fit to model growth trajectories of anesthesiology residents, with their milestone ratings as the outcome variable and the 6-month milestones reporting period as the predictor. Quadratic time was included to account for nonlinear growth. The reporting periods were coded from “0” to “5,” where “0” represents the first reporting period at the midpoint of postgraduate year 2/clinical anesthesia year 1, and “5” represents the final reporting period of training (i.e., the postgraduate year 4/clinical anesthesia year 3 end-of-year reporting period). Estimates of intercepts and slopes were specified as randomly varying among learners and programs to account for hierarchically nested data.
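
The time coding and quadratic term described above can be illustrated with a minimal sketch. It evaluates only the fixed-effects part of a quadratic growth curve at reporting periods coded 0 to 5. The coefficient values and the function name `expected_rating` are hypothetical, chosen for illustration; they are not the article's estimates, and the random intercepts and slopes for learners and programs are omitted.

```python
# Reporting periods coded 0..5, as in the passage:
# 0 = midyear PGY-2 / CA-1, 5 = end-of-year PGY-4 / CA-3.
PERIODS = [0, 1, 2, 3, 4, 5]


def expected_rating(t, intercept, slope, quad):
    """Fixed-effects part of a quadratic growth curve: b0 + b1*t + b2*t^2."""
    return intercept + slope * t + quad * t ** 2


# Hypothetical coefficients for illustration only (not the article's estimates).
B0, B1, B2 = 2.0, 0.6, -0.04

trajectory = [round(expected_rating(t, B0, B1, B2), 2) for t in PERIODS]

# Period-to-period gains shrink when the quadratic coefficient is negative.
gains = [round(b - a, 2) for a, b in zip(trajectory, trajectory[1:])]

print(trajectory)
print(gains)
```

With a negative quadratic coefficient the gains get smaller each period, which is one way a nonlinear (negatively accelerated) trajectory can arise.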

fulltextpubmed· Materials and Methods· item 39869633

ar 2/clinical anesthesia year 1, and “5” represents the final reporting period of training (i.e., the postgraduate year 4/clinical anesthesia year 3 end-of-year reporting period). Estimates of intercepts and slopes were specified as randomly varying among learners and programs to account for hierarchically nested data. Longitudinal reliability was assessed using the indices of growth rate reliability and growth curve reliability.12–15 Growth rate reliability measures the ability to differentiate individual differences in growth process and is quantified by the proportion of variance attributable to the growth rate.15 Growth curve reliability, on the other hand, measures the static reliability of assessment at a particular occasion, accounting for the variance related to growth rate, intercept, and their covariance.16 Growth rate reliability and growth curve reliability are complementary, providing comprehensive information on the consistency for longitudinal assessments.16 In our study, we calculated growth rate reliability and growth curve reliability using variances and covariances estimated in the growth curve models. We also examined variance components attributable to the factors at both the learner and the program levels.
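
The passage does not give the formulas for the two indices. The sketch below uses one commonly used formulation from the growth-modeling literature, which may differ in detail from what the authors computed: growth rate reliability as slope variance over slope variance plus the sampling error of the slope, and growth curve reliability at occasion t as model-implied true-score variance over total variance. All variance components are hypothetical.

```python
def growth_rate_reliability(var_slope, var_resid, times):
    """Slope variance / (slope variance + residual variance / SST).

    SST is the sum of squared deviations of the time codes from their mean,
    so more (or more widely spaced) occasions reduce slope sampling error.
    """
    mean_t = sum(times) / len(times)
    sst = sum((t - mean_t) ** 2 for t in times)
    return var_slope / (var_slope + var_resid / sst)


def growth_curve_reliability(var_int, var_slope, cov_int_slope, var_resid, t):
    """True-score variance at occasion t divided by total variance at t."""
    true_var = var_int + 2 * t * cov_int_slope + t ** 2 * var_slope
    return true_var / (true_var + var_resid)


# Hypothetical variance components (not taken from the article).
TIMES = [0, 1, 2, 3, 4, 5]
VAR_INT, VAR_SLOPE, COV_IS, VAR_RESID = 0.25, 0.01, -0.02, 0.10

grr = growth_rate_reliability(VAR_SLOPE, VAR_RESID, TIMES)
gcr_first = growth_curve_reliability(VAR_INT, VAR_SLOPE, COV_IS, VAR_RESID, t=0)

print(round(grr, 3), round(gcr_first, 3))
```

Under this formulation, growth curve reliability at the first occasion (t = 0) depends only on the intercept and residual variances, while growth rate reliability depends on the slope variance and the spacing of the assessment occasions.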

fulltextpubmed· Materials and Methods· item 39869633

formation on the consistency for longitudinal assessments.16 In our study, we calculated growth rate reliability and growth curve reliability using variances and covariances estimated in the growth curve models. We also examined variance components attributable to the factors at both the learner and the program levels. Latent class growth analyses were conducted to identify latent groups of learners with different learning trajectories. First, as a baseline, single-group models with linear and quadratic terms were fit to the data to find the best single-group representation of change. Next, models with 2 to 10 classes were fit, and the final model was selected based on fit indices, including the Akaike information criterion, Bayesian information criterion, Vuong–Lo–Mendell–Rubin likelihood ratio test, classification error, entropy R2, and R2. The growth pattern of each group was reviewed and compared across subcompetencies. Analyses were conducted for each of the 25 subcompetencies separately. The data were analyzed using R version 4.3.1 (www.r-project.org), and latent class growth analyses were conducted using LatentGOLD version 6.0. Statistical significance was presumed at P < 0.05 (two-tailed test).
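
As a rough illustration of how two of the listed fit indices compare candidate class counts, the sketch below computes the Akaike and Bayesian information criteria from their standard definitions. The log-likelihoods and parameter counts are hypothetical, and the other indices named in the passage (likelihood ratio test, classification error, entropy R2) are not shown.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_lik


# Hypothetical fits: number of classes -> (log-likelihood, free parameters).
candidates = {
    2: (-5200.0, 7),
    3: (-5100.0, 11),
    4: (-5090.0, 15),
    5: (-5088.0, 19),
}
N_LEARNERS = 4976  # sample size reported in the article

best_by_aic = min(candidates, key=lambda k: aic(*candidates[k]))
best_by_bic = min(candidates, key=lambda k: bic(*candidates[k], N_LEARNERS))

print(best_by_aic, best_by_bic)
```

In this made-up example the two criteria disagree, because the Bayesian criterion penalizes extra parameters more heavily at large sample sizes. That is one reason to weigh several indices together, as the passage describes.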

fulltextpubmed· Anesthesiology Milestones· item 39869633

The anesthesiology Milestones 1.0 comprise 25 subcompetencies across six core dimensions: patient care, medical knowledge, system-based practice, practice-based learning and improvement, professionalism, and interpersonal and communication skills. The number of subcompetencies for each of the six core competencies ranges from 1 subcompetency for medical knowledge to 10 subcompetencies for patient care. Each subcompetency is rated on a 10-point rating scale with 0.5-unit interval, from level 0 (has not yet achieved Milestone level 1) to 5 (achievement greater than expected). While the target goal of training is to attain a level of 4 by the time of program completion, indicating readiness for unsupervised practice, this is not a mandatory requirement for graduation.10 The clinical competency committee within each program conducted semiannual assessments of residents, resulting in six evaluations over their 3-yr training period. Previous research provided validity evidence for the use and interpretation of Milestones scores.11

fulltextpubmed· Statistical Analysis· item 39869633

Descriptive statistics, such as the median and interquartile range of Milestones ratings, are displayed in box plots for the six core competencies and each of the subcompetencies. Mixed-effects growth curve models were fit to model growth trajectories of anesthesiology residents, with their milestone ratings as the outcome variable and the 6-month milestones reporting period as the predictor. Quadratic time was included to account for nonlinear growth. The reporting periods were coded from “0” to “5,” where “0” represents the first reporting period at the midpoint of postgraduate year 2/clinical anesthesia year 1, and “5” represents the final reporting period of training (i.e., the postgraduate year 4/clinical anesthesia year 3 end-of-year reporting period). Estimates of intercepts and slopes were specified as randomly varying among learners and programs to account for hierarchically nested data.

fulltextpubmed· Results· item 39869633

Our study analyzed a total of 682,475 ratings collected from midyear 2014 to midyear 2020 for 4,976 learners in 140 programs. The demographic characteristics of the participants are detailed in table 1. The majority of learners (n = 3,189, 64.1%) were male and approximately half were white (n = 2,540, 51.0%). Figure 1 visualizes the longitudinal development of Milestones ratings for core competencies across six reporting periods using box plots. For each core competency, the median [interquartile range] of midyear ratings for postgraduate year 2 residents was 2.0 [1.5, 2.5], while the end-of-year median [interquartile range] for postgraduate year 4 residents was 4.0 [4.0, 4.5]. Box plots of Milestones ratings for the subcompetencies are shown in Supplemental Digital Content figure 1 (https://links.lww.com/ALN/D839). Among all the learners, 1,476 (37%) did not reach the target goal of level 4 for at least one subcompetency during the last reporting period. The rate of straight-lining among final-year residents was 26.0%. Straight-lining refers to a resident receiving identical Milestone ratings across all subcompetencies during one rating period.17 The subcompetencies with the largest proportions of learners not meeting the level 4 criteria upon graduation included medical knowledge 1 (n = 706, 18%), practice-based learning and improvement 1 (n = 479, 12%), interpersonal and communication skills 3 (n = 370, 9%), system-based practice 2 (n = 366, 9%), practice-based learning and improvement 2 (n = 362, 9%), and patient care 7 (n = 360, 9%).
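
The two checks described above (straight-lining, and falling short of level 4 on at least one subcompetency) can be sketched as follows. The resident identifiers, ratings, and function names are hypothetical; the data layout is an assumption, not the ACGME format.

```python
def is_straight_lined(ratings):
    """True when every subcompetency received the same rating in one period."""
    values = list(ratings.values())
    return len(values) > 1 and len(set(values)) == 1


def below_target(ratings, target=4.0):
    """Names of subcompetencies rated below the target level."""
    return sorted(name for name, value in ratings.items() if value < target)


# Hypothetical final-period ratings for two residents (illustrative only).
final_period = {
    "resident_a": {"PC-1": 4.0, "MK-1": 4.0, "PBLI-1": 4.0},
    "resident_b": {"PC-1": 4.5, "MK-1": 3.5, "PBLI-1": 4.0},
}

straight = [rid for rid, r in final_period.items() if is_straight_lined(r)]
missed = {rid: below_target(r) for rid, r in final_period.items()}

print(straight)
print(missed)
```

Note that a straight-lined record can still meet the target on every subcompetency, so the two flags describe different things and are best reported separately.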

fulltextpubmed· Results· item 39869633

evel 4 criteria upon graduation included medical knowledge 1 (n = 706, 18%), practice-based learning and improvement 1 (n = 479, 12%), interpersonal and communication skills 3 (n = 370, 9%), system-based practice 2 (n = 366, 9%), practice-based learning and improvement 2 (n = 362, 9%), and patient care 7 (n = 360, 9%). Table 1: Number and Percentage of 4,976 Learners from 140 Programs by Sex and Ethnicity. Figure 1: Milestone levels by core competency: (A) patient care (PC); (B) medical knowledge (MK); (C) system-based practice (SBP); (D) practice-based learning and improvement (PBLI); (E) professionalism (PR); (F) interpersonal and communication skills (ICS). Box plots by reporting period for 4,976 learners in 140 programs display the median (i.e., the line inside the box), the interquartile range (i.e., the box), and the whiskers, which represent the range within 1.5 times the interquartile range from the first and third quartiles. Any data points outside this range are considered outliers and are plotted as individual points (i.e., dots). This visual representation highlights the central tendency, variability, and potential outliers in Milestone ratings over time for each core competency. Y, year.
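
The box plot convention in the caption (whiskers within 1.5 times the interquartile range of the quartiles, points beyond that drawn as outliers) can be sketched with the standard library. The ratings are hypothetical, and the quartile method is an assumption: plotting tools differ in how they interpolate quartiles, so exact fences may vary.

```python
from statistics import quantiles


def tukey_summary(values):
    """Median, quartiles, whisker ends, and outliers under the 1.5 x IQR rule."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }


# Hypothetical ratings for one competency in one reporting period.
ratings = [1.5, 2.0, 2.0, 2.5, 2.5, 3.0, 5.0]
summary = tukey_summary(ratings)
print(summary)
```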

fulltextpubmed· Results· item 39869633

artile range from the first and third quartiles. Any data points outside this range are considered outliers and are plotted as individual points (i.e., dots). This visual representation highlights the central tendency, variability, and potential outliers in Milestone ratings over time for each core competency. Y, year. Supplemental Digital Content table 1 (https://links.lww.com/ALN/D841) presents the growth curve model estimates of baseline Milestones ratings and growth rate for each subcompetency. The mean baseline Milestone rating (i.e., midyear rating for postgraduate year 2 residents) across the 25 subcompetencies was 2.05 (95% CI, 1.96 to 2.14). Overall, professionalism had the highest initial ratings (mean intercept, 2.15; 95% CI, 2.06 to 2.23), while patient care had the lowest (mean intercept, 1.99; 95% CI, 1.90 to 2.08). Notably, patient care 7 had the lowest initial rating (intercept, 1.71; 95% CI, 1.60 to 1.82) among all the subcompetencies. Milestone ratings for each subcompetency increased significantly by reporting period (all P < 0.001). On average, Milestones increased 0.49 (95% CI, 0.48 to 0.51) units per reporting period across subcompetencies. Subcompetencies in patient care had the highest growth rates over time (mean slope, 0.51; 95% CI, 0.50 to 0.53; and patient care 7 slope, 0.62; 95% CI, 0.60 to 0.64), while medical knowledge (slope, 0.46; 95% CI, 0.44 to 0.48) and professionalism (mean slope, 0.47; 95% CI, 0.45 to 0.48) had the lowest. Coefficient estimates for quadratic time were all statistically significant (all P < 0.001), suggesting nonlinear trajectories.

fulltextpubmed· Results· item 39869633

and patient care 7 slope, 0.62; 95% CI, 0.60 to 0.64), while medical knowledge (slope, 0.46; 95% CI, 0.44 to 0.48) and professionalism (mean slope, 0.47; 95% CI, 0.45 to 0.48) had the lowest. Coefficient estimates for quadratic time were all statistically significant (all P < 0.001), suggesting nonlinear trajectories. Table 2 shows the growth rate reliability estimates and the growth curve reliability estimates for the first assessment occasion by Milestones subcompetency. The longitudinal reliability based on growth rate reliability ranged from 0.52 to 0.62 (mean ± SD, 0.58 ± 0.03), suggesting a moderate capability of anesthesiology Milestones to detect individual heterogeneity in true growth of latent competency.8 The mean ± SD of growth curve reliability estimate was 0.71 ± 0.02, indicating acceptable overall reliability of Milestones assessment across all the six assessment points according to standards provided by previous literature.8 Table 2: Longitudinal Reliability and Variance at Learner and Program Levels for 4,976 Learners in 140 Programs.

fulltextpubmed· Results· item 39869633

Table 2 shows the growth rate reliability estimates and the growth curve reliability estimates for the first assessment occasion by Milestones subcompetency. The longitudinal reliability based on growth rate reliability ranged from 0.52 to 0.62 (mean ± SD, 0.58 ± 0.03), suggesting a moderate capability of anesthesiology Milestones to detect individual heterogeneity in true growth of latent competency.8 The mean ± SD of growth curve reliability estimate was 0.71 ± 0.02, indicating acceptable overall reliability of Milestones assessment across all the six assessment points according to standards provided by previous literature.8 Table 2: Longitudinal Reliability and Variance at Learner and Program Levels for 4,976 Learners in 140 Programs. Growth rate reliability measures the reliability of growth rate (i.e., the ability to differentiate individual differences in growth process) and is quantified by the proportion of variance attributable to the growth rate. Growth curve reliability measures the static reliability of assessment at a particular occasion, accounting for the variance related to growth rate, intercept, and their covariance. Growth rate reliability and growth curve reliability were calculated using variance and covariance estimates from the growth curve models. Growth rate reliability estimates suggested a moderate capability of anesthesiology Milestones to detect individual heterogeneity in growth of latent competency, and growth curve reliability estimates indicated acceptable overall reliability of Milestones assessment across all the six assessment points according to standards provided by previous literature.8 These estimates can be found in Supplemental Digital Content table 1 (https://links.lww.com/ALN/D841).

fulltextpubmed· Results· item 39869633

latent competency, and growth curve reliability estimates indicated acceptable overall reliability of Milestones assessment across all the six assessment points according to standards provided by previous literature.8 These estimates can be found in Supplemental Digital Content table 1 (https://links.lww.com/ALN/D841). Random-effects variance displays proportions of variance associated with the baseline Milestones ratings (i.e., intercept) and growth rate (i.e., slope) among programs and learners, respectively. The proportion of variance was calculated by dividing the variance associated with each component by the total variance. ACGME, Accreditation Council for Graduate Medical Education; ICS, interpersonal and communication skills; MK, medical knowledge; PBLI, practice-based learning and improvement; PC, patient care; PR, professionalism; SBP, system-based practice.

fulltextpubmed· Results· item 39869633

Random-effects variance displays proportions of variance associated with the baseline Milestones ratings (i.e., intercept) and growth rate (i.e., slope) among programs and learners, respectively. The proportion of variance was calculated by dividing the variance associated with each component by the total variance. ACGME, Accreditation Council for Graduate Medical Education; ICS, interpersonal and communication skills; MK, medical knowledge; PBLI, practice-based learning and improvement; PC, patient care; PR, professionalism; SBP, system-based practice. The random effect estimates at the program level were all statistically significant, indicating substantial variability among programs in both baseline Milestone ratings and growth rates (all P < 0.001). Baseline Milestones ratings varied by a mean ± SD of 0.52 ± 0.04 units, and the rate of development varied by a mean ± SD of 0.08 ± 0.01 units. The amount of variance was similar across subcompetencies except for patient care 7 and 10, which showed higher variability in both the baseline (0.67 for patient care 7; 0.61 for patient care 10) and the rate of increase (0.11 for patient care 7; 0.10 for patient care 10). We also noted that program-level variances tended to decrease over time, as indicated by negative random-effect correlations (mean r = −0.67; 95% CI, −0.76 to −0.58). This pattern was consistent across subcompetencies, with patient care 7 showing the largest reduction in variance, while medical knowledge 1 showed the smallest. Program level accounted for the largest proportion of total variance, with 69% for the baseline rating and only 2% for the rate of growth.
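
To see why a negative intercept-slope correlation implies shrinking variance over time, the sketch below evaluates the model-implied variance of a random intercept plus random linear slope at each reporting period. It treats the reported 0.52 and 0.08 as standard deviations, which is an assumption because the passage does not say whether they are variances or standard deviations, and it ignores the quadratic term. The component values passed to `proportion_of_variance` are hypothetical.

```python
def variance_at(t, sd_int, sd_slope, corr):
    """Var(u0 + u1*t) = var_int + 2*t*cov + t^2 * var_slope."""
    cov = corr * sd_int * sd_slope
    return sd_int ** 2 + 2 * t * cov + t ** 2 * sd_slope ** 2


def proportion_of_variance(components):
    """Each component's variance divided by the total variance."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


# Program-level values from the passage, read here as SDs (an assumption).
SD_INT, SD_SLOPE, CORR = 0.52, 0.08, -0.67
variances = [variance_at(t, SD_INT, SD_SLOPE, CORR) for t in range(6)]

# Hypothetical variance components, for illustration only.
shares = proportion_of_variance(
    {"program_intercept": 0.27, "learner_intercept": 0.08, "residual": 0.05}
)

print([round(v, 3) for v in variances])
print({name: round(share, 3) for name, share in shares.items()})
```

Under these assumptions the between-program variance at the last period is clearly lower than at baseline, consistent with the passage's reading of the negative correlation.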

fulltextpubmed· Results· item 39869633

I, −0.76 to −0.58). This pattern was consistent across subcompetencies, with patient care 7 showing the largest reduction in variance, while medical knowledge 1 showed the smallest. Program level accounted for the largest proportion of total variance, with 69% for the baseline rating and only 2% for the rate of growth. Significant variation was also observed among individual learners (all P < 0.001), although the amount of variance was smaller compared to program-level variance (20% for baseline rating and 1% for rate of increase). Like program-level estimates, the variations among learners decreased significantly over time (mean r = −0.68; 95% CI, −0.72 to −0.65). Notably, patient care 7 and 10 showed the greatest decrease, while the variance of medical knowledge 1 ratings among learners remained relatively stable over time.

fulltextpubmed· Results· item 39869633

r rate of increase). Like program-level estimates, the variations among learners decreased significantly over time (mean r = −0.68; 95% CI, −0.72 to −0.65). Notably, patient care 7 and 10 showed the greatest decrease, while the variance of medical knowledge 1 ratings among learners remained relatively stable over time. Latent class growth analyses identified multiple groups of learners with different learning trajectories for each subcompetency. The number and percentage of learners in each group are detailed in table 3. Specifically, three latent groups were identified for five subcompetencies (patient care 1, 4, and 9; medical knowledge 1; and system-based practice 1), while four latent groups were identified for the remaining 20 subcompetencies. Notably, 18 subcompetencies included groups of learners who did not achieve level 4 upon graduation; the subcompetencies with the largest percentages of these learners were medical knowledge 1 (15%), practice-based learning and improvement 1 (15%), practice-based learning and improvement 2 (12%), professionalism 3 (12%), and professionalism 5 (12%). Table 3: Number and Percentage of Learners Assigned to Different Learning Trajectory Groups by Subcompetency. ACGME, Accreditation Council for Graduate Medical Education; ICS, interpersonal and communication skills; MK, medical knowledge; PBLI, practice-based learning and improvement; PC, patient care; PR, professionalism; SBP, system-based practice. Number and percentage of learners in different learning trajectory groups, ordered in descending baseline Milestones ratings.

fulltextpubmed· Results· item 39869633

ACGME, Accreditation Council for Graduate Medical Education; ICS, interpersonal and communication skills; MK, medical knowledge; PBLI, practice-based learning and improvement; PC, patient care; PR, professionalism; SBP, system-based practice. Number and percentage of learners in different learning trajectory groups, ordered in descending baseline Milestones ratings. Latent trajectory subgroup that did not meet graduation target of level 4.

fulltextpubmed· Results· item 39869633

ACGME, Accreditation Council for Graduate Medical Education; ICS, interpersonal and communication skills; MK, medical knowledge; PBLI, practice-based learning and improvement; PC, patient care; PR, professionalism; SBP, system-based practice. Number and percentage of learners in different learning trajectory groups, ordered in descending baseline Milestones ratings. Latent trajectory subgroup that did not meet graduation target of level 4. The identified three or four latent groups displayed similar developmental trajectories across subcompetencies, particularly for the first two groups. The first group of learners had the highest Milestone ratings at baseline and followed a negatively accelerated curve, showing rapid initial improvement that slowed with time and plateaued by the end of year 3. The second group displayed linear progression with time. The last group started with the lowest Milestone ratings and showed a growth pattern where the rate of progress increased slightly over time, but they still could not reach the target goal of training (e.g., medical knowledge 1). For subcompetencies with four groups, the last two groups initially converged but gradually diverged, resulting in one group not achieving the target goal of level 4 by the end of training (e.g., practice-based learning and improvement 2 to 4, professionalism 1 to 5, interpersonal and communication skills 3, and patient care 2). Figure 2 shows the latent groups of distinct growth curves for patient care 2, medical knowledge 1, practice-based learning and improvement 2, and professionalism 3. For the subcompetencies of patient care 2 and medical knowledge 1, the learning trajectory of the last group (i.e., group 4 for patient care 2 and group 3 for medical knowledge 1, which include learners who were estimated to not meet the target goal of level 4 upon graduation) diverged from other groups starting from the middle of postgraduate year 2, whereas for professionalism 3 and practice-based learning and improvement 2, the learning trajectory of group 4 began to diverge from the end of postgraduate year 2. The latent groups of learners with different developmental trajectories for all the subcompetencies are presented in Supplemental Digital Content figure 2 (https://links.lww.com/ALN/D840). For the other subcompetencies, the critical periods for the identification of at-risk trainees appear to be either midyear postgraduate year 2 or end-year postgraduate year 2.

fulltextpubmed· Results· item 39869633

developmental trajectories for all the subcompetencies are presented in Supplemental Digital Content figure 2 (https://links.lww.com/ALN/D840). For the other subcompetencies, the critical periods for the identification of at-risk trainees appear to be either midyear postgraduate year 2 or end-year postgraduate year 2. Figure 2: Growth trajectories of latent groups for the subcompetencies of PC-2 (A), MK-1 (B), PBLI-2 (C), and PR-3 (D). Learners in group 3 (MK-1) or group 4 (PC-2, PBLI-2, and PR-3) did not achieve the target goal of training upon graduation. These groups began to diverge either in the middle of postgraduate year 2 (PC-2 and MK-1) or at the end of postgraduate year 2 (PBLI-2 and PR-3), suggesting that these points are critical for remediation for these learners. PC-2, Patient Care 2 (Anesthetic Plan and Conduct); MK-1, Medical Knowledge 1 (knowledge of biomedical, clinical, epidemiologic, and social behavioral sciences as outlined in the American Board of Anesthesiology content outline); PBLI-2, Practice-Based Learning and Improvement 2 (analysis of practice to identify areas in need of improvement); PR-3, Professionalism 3 (commitment to institution, department, and colleagues).

fulltextpubmed· Milestones Baseline and Growth Rate· item 39869633

Supplemental Digital Content table 1 (https://links.lww.com/ALN/D841) presents the growth curve model estimates of baseline Milestones ratings and growth rate for each subcompetency. The mean baseline Milestone rating (i.e., midyear rating for postgraduate year 2 residents) across the 25 subcompetencies was 2.05 (95% CI, 1.96 to 2.14). Overall, professionalism had the highest initial ratings (mean intercept, 2.15; 95% CI, 2.06 to 2.23), while patient care had the lowest (mean intercept, 1.99; 95% CI, 1.90 to 2.08). Notably, patient care 7 had the lowest initial rating (intercept, 1.71; 95% CI, 1.60 to 1.82) among all the subcompetencies. Milestone ratings for each subcompetency increased significantly by reporting period (all P < 0.001). On average, Milestones increased 0.49 (95% CI, 0.48 to 0.51) units per reporting period across subcompetencies. Subcompetencies in patient care had the highest growth rates over time (mean slope, 0.51; 95% CI, 0.50 to 0.53; and patient care 7 slope, 0.62; 95% CI, 0.60 to 0.64), while medical knowledge (slope, 0.46; 95% CI, 0.44 to 0.48) and professionalism (mean slope, 0.47; 95% CI, 0.45 to 0.48) had the lowest. Coefficient estimates for quadratic time were all statistically significant (all P < 0.001), suggesting nonlinear trajectories.

fulltextpubmed· Longitudinal Reliability and Variability· item 39869633

Table 2 shows the growth rate reliability estimates and the growth curve reliability estimates for the first assessment occasion by Milestones subcompetency. The longitudinal reliability based on growth rate reliability ranged from 0.52 to 0.62 (mean ± SD, 0.58 ± 0.03), suggesting a moderate capability of anesthesiology Milestones to detect individual heterogeneity in true growth of latent competency.8 The mean ± SD of growth curve reliability estimate was 0.71 ± 0.02, indicating acceptable overall reliability of Milestones assessment across all the six assessment points according to standards provided by previous literature.8

Longitudinal Reliability and Variance at Learner and Program Levels for 4,976 Learners in 140 Programs

fulltextpubmed· Program-level Variance· item 39869633

The random effect estimates at the program level were all statistically significant, indicating substantial variability among programs in both baseline Milestone ratings and growth rates (all P < 0.001). Baseline Milestones ratings varied by a mean ± SD of 0.52 ± 0.04 units, and the rate of development varied by a mean ± SD of 0.08 ± 0.01 units. The amount of variance was similar across subcompetencies except for patient care 7 and 10, which showed higher variability in both the baseline (0.67 for patient care 7; 0.61 for patient care 10) and the rate of increase (0.11 for patient care 7; 0.10 for patient care 10). We also noted that program-level variances tended to decrease over time, as indicated by negative random-effect correlations (mean r = −0.67; 95% CI, −0.76 to −0.58). This pattern was consistent across subcompetencies, with patient care 7 showing the largest reduction in variance, while medical knowledge 1 showed the smallest. Program level accounted for the largest proportion of total variance, with 69% for the baseline rating and only 2% for the rate of growth.
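The link between a negative intercept-slope correlation and shrinking between-program variance can be sketched numerically. The Python example below uses the standard identity Var(u0 + u1·t) = σ0² + 2·t·ρ·σ0·σ1 + t²·σ1². Several things are assumptions rather than statements from the passage: that 0.52 and 0.08 are standard deviations of the program-level random intercept and slope, that periods are coded 0 to 5, and that only a linear random slope is involved. The function name is hypothetical.

```python
def random_effect_variance(period, sd_intercept, sd_slope, corr):
    """Variance of (random intercept + random slope * period).

    Assumes a linear random slope and that the inputs are standard deviations.
    """
    cov = corr * sd_intercept * sd_slope  # intercept-slope covariance
    return sd_intercept ** 2 + 2 * period * cov + (period ** 2) * sd_slope ** 2


# Mean program-level values quoted in the passage, treated here as SDs.
program_var = [random_effect_variance(t, 0.52, 0.08, -0.67) for t in range(6)]

for t, v in enumerate(program_var):
    print(t, round(v, 4))
```

Under these assumptions the implied between-program variance at the last period is lower than at baseline, which is consistent with the narrowing described in the passage. The exact values depend on how the published estimates are scaled.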

fulltextpubmed· Learner-level Variance· item 39869633

Significant variation was also observed among individual learners (all P < 0.001), although the amount of variance was smaller compared to program-level variance (20% for baseline rating and 1% for rate of increase). Like program-level estimates, the variations among learners decreased significantly over time (mean r = −0.68; 95% CI, −0.72 to −0.65). Notably, patient care 7 and 10 showed the greatest decrease, while the variance of medical knowledge 1 ratings among learners remained relatively stable over time.

fulltextpubmed· Latent Groups of Different Learning Trajectories· item 39869633

Latent class growth analyses identified multiple groups of learners with different learning trajectories for each subcompetency. The number and percentage of learners in each group are detailed in table 3. Specifically, three latent groups were identified for five subcompetencies (patient care 1, 4, and 9; medical knowledge 1; and system-based practice 1), while four latent groups were identified for the remaining 20 subcompetencies. Notably, 18 subcompetencies included groups of learners who did not achieve the level of 4 upon graduation, with the subcompetencies having the largest percentage of these learners including medical knowledge 1 (15%), practice-based learning and improvement 1 (15%), practice-based learning and improvement 2 (12%), professionalism 3 (12%), and professionalism 5 (12%).

Number and Percentage of Learners Assigned to Different Learning Trajectory Groups by Subcompetency

ACGME, Accreditation Council for Graduate Medical Education; ICS, interpersonal and communication skills; MK, medical knowledge; PBLI, practice-based learning and improvement; PC, patient care; PR, professionalism; SBP, system-based practice. Number and percentage of learners in different learning trajectory groups, ordered in descending baseline Milestones ratings. Latent trajectory subgroup that did not meet graduation target of level 4.

fulltextpubmed· Discussion· item 39869633

Residency is a crucial developmental stage in a physician’s journey toward becoming a specialist. Competency frameworks in education inherently incorporate a longitudinal component, prompting educators to analyze how learners progress and achieve competency benchmarks. This study investigated the developmental trajectories of ACGME Milestones ratings for anesthesiology residents using longitudinal data analysis. We also examined the longitudinal reliability of Milestones assessments and identified latent groups of different growth trajectories for each competency.

fulltextpubmed· Discussion· item 39869633

The longitudinal reliability of Milestones data, measured through growth rate reliability and growth curve reliability, was found to be moderate to high across the six core competencies and their subcompetencies. The growth rate reliability estimates ranged from 0.52 to 0.62, indicating a moderate capability to detect individual differences in the growth of latent competencies. The average growth curve reliability estimate was 0.71, demonstrating acceptable reliability for longitudinal assessments.8 The higher growth curve reliability for earlier assessments highlights the importance of early identification and intervention for residents who may need additional support.18,19 Instead of using Milestones data to solely focus on remediation, an alternative approach would be to identify early intervention opportunities, or inflection points, in residents’ growth trajectories. This would allow programs to provide new learning opportunities, practice, and coaching in line with the principles of mastery learning. By reframing inflection points as opportunities for program improvement, faculty can collaborate with residents to address curricular gaps and enhance educational strategies.20 This approach not only supports at-risk learners but also helps identify when high-performing residents are ready for new challenges. These findings suggest that the Milestones reporting system provides reliable data for tracking individual progress over time.21

fulltextpubmed· Discussion· item 39869633

Our study found that the rate of straight-lining among final year residents was 26.0%. Straight-lining highlights potential issues in evaluation tools, rater biases, and the challenge of capturing nuanced skill development in a single score. Nationally, the ACGME is investigating whether this effect represents a true lack of variation in competence or whether clinical competency committees are providing similar ratings across subcompetencies due to other factors. These factors could include the belief that the Milestones do not align with their local context or difficulties in providing valid, defensible ratings due to resource or assessment challenges within the program. If a program experiences high rates of straight-lining, it is crucial for the program director and clinical competency committee to reflect on whether these scores accurately represent the residents’ and fellows’ performance. While most specialties have transitioned to Milestones 2.0, this shift has not significantly reduced the overall occurrence of straight-lining. In response to feedback and research, the Milestones language has been revised to be more accessible for program directors and has been harmonized across specialties, particularly for competencies like professionalism, interpersonal and communication skills, system-based practice, and practice-based learning and improvement. Despite these revisions, recent data suggest that straight-lining remains an issue. For example, in anesthesiology, straight-lining is still observed at a rate of 27.4%.22
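A program reviewing its own ratings could estimate a straight-lining rate with a short check like the Python sketch below. The passage does not give the study's operational definition, so the rule used here (every subcompetency rating for a resident in one reporting period is identical) is an assumption, and the function name, resident keys, and example ratings are hypothetical illustrations rather than study data.

```python
def straight_lining_rate(ratings_by_resident):
    """Share of residents whose subcompetency ratings are all identical.

    ratings_by_resident maps a resident id to the list of ratings
    assigned in a single reporting period (assumed definition).
    """
    if not ratings_by_resident:
        return 0.0
    flagged = sum(
        1
        for ratings in ratings_by_resident.values()
        if len(ratings) > 1 and len(set(ratings)) == 1
    )
    return flagged / len(ratings_by_resident)


# Made-up example: two of four residents receive one value everywhere.
example = {
    "resident_a": [4.0] * 25,
    "resident_b": [4.0] * 24 + [3.5],
    "resident_c": [3.5, 4.0, 4.5, 4.0],
    "resident_d": [5.0] * 25,
}
rate = straight_lining_rate(example)
print(f"straight-lining rate: {rate:.1%}")
```

A high value from such a check does not by itself show that ratings are inaccurate; as the passage notes, it is a prompt for the clinical competency committee to review whether the scores reflect actual performance.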

fulltextpubmed· Discussion· item 39869633

Significant variability was observed at both the program and learner levels. The variance in baseline ratings was substantial compared to the rate of growth. This indicates that while programs vary significantly in their initial ratings, the rate of competency development is relatively consistent across programs.8 Program-level variance was notably greater than learner-level variance, with 69% of the baseline Milestones ratings variance attributable to programs compared to just 20% to learners. This finding is consistent with previous literature in other specialties, in which significant program-level variance has also been documented. For instance, in emergency medicine, programs accounted for 70% of the total variance in first-year Milestones ratings, while learner-level variance was reported at 23%.23 Similarly, in pediatrics, program-level variances of 54% and 68% were found, while learner-level variance was reported at 22% and 14% in two different conditions.24 This finding suggested that it is essential to account for program-level effects in the analysis of Milestones data. The variability observed can be attributed to several factors, including differences in the average true latent competency across programs or potential disparities in how programs conceptualize and evaluate competence. Some programs may lack a shared mental model for understanding and applying the Milestones framework, leading to inconsistencies in ratings. 
Qualitative studies have shown considerable variation in how clinical competency committees function and arrive at Milestones evaluations.25,26 Other contributing factors are that some programs may select residents who start with higher competence levels, or programs may assign residents to more challenging rotations.24

fulltextpubmed· Discussion· item 39869633

Learner-level variance, although smaller than program-level variance, was also significant, highlighting individual differences in baseline ratings and growth trajectories. We also noted that variation at the program and learner levels tended to decrease over time. This decrease is likely due to learner plateaus or ceiling effects of residents as they advance through their training. It may also reflect increased standardization and greater consistency over time in the assessment of residents’ competence. The latent class growth analyses revealed multiple latent groups of learners with distinct learning trajectories for each subcompetency. Most subcompetencies exhibited three or four distinct groups, with some learners not achieving the target level 4 by graduation. These groups displayed varying developmental patterns: some showed rapid initial improvement followed by a plateau, others demonstrated linear progression, and a few exhibited slow but steady improvement.

fulltextpubmed· Discussion· item 39869633

We identified 3 to 4 latent groups of learners with distinct learning trajectories across the 25 subcompetencies, with 18 subcompetencies (notably medical knowledge 1, practice-based learning and improvement 1, and professionalism 3) having a group that does not achieve level 4 upon graduation. For those subcompetencies, the learning trajectory of at-risk learners diverged from other groups starting from either the middle of postgraduate year 2 or the end of postgraduate year 2, suggesting a critical period for identifying residents who may benefit from additional support or tailored educational resources is at the transition from intern to residency training (i.e., postgraduate year 2/clinical anesthesia year 1). Additionally, high-performing residents tended to plateau by the end of year 3, suggesting that this point may indicate when they are prepared for new challenges. Notably, the latent groups identified in our anesthesiology data were similar across subcompetencies. In contrast, latent groups in family medicine showed considerable differences across subcompetencies.8 These findings align with previous research, further validating the use of the Milestones assessment system for understanding and enhancing resident learning.27 These data can be leveraged to develop learner analytics systems that provide valuable insights into resident progress. 
Trainees, armed with this knowledge, can proactively seek out new and diverse learning opportunities within their programs to address any identified gaps in their development.20 The observed trajectories likely reflect multifaceted influences, including clinical exposure, educational methods, feedback mechanisms, and personal factors. For instance, residents in struggling groups may experience a lack of clinical exposure, effective mentoring, enhanced feedback mechanisms, or structured remediations. These speculations emphasized the role of contextual factors in shaping competency development.

fulltextpubmed· Discussion· item 39869633

We recognize that the trajectories identified in this study may be influenced by programmatic interventions aimed at improving residents’ performance. It is important to note that residents with low Milestones ratings may already be receiving support from their programs, which could alter their Milestone trajectories. This interaction between programmatic input and individual growth patterns complicates the interpretation of the data and has not been addressed in the article.

fulltextpubmed· Discussion· item 39869633

This study has limitations. First, the retrospective design relies on existing data, which may not capture the full complexity of resident development. The fact that some residents did not get to level 4 at graduation needs further clarification. Second, the study focused on anesthesiology residents, and the findings may not be generalizable to other specialties. Third, the use of self-reported Milestones data may be subject to bias in assessment ratings based on sex, race, and ethnicity. We also acknowledge that demographic shifts in the resident population may limit the generalizability of our findings to future cohorts. However, by presenting demographic characteristics and analyzing trends within this specific cohort, we provide a valuable reference point. Future research can build on our work by examining how changes in demographic composition affect learning patterns and Milestones progression under the new framework. Future research should address these limitations by using a prospective design, including multiple specialties, and exploring alternative assessment methods. Aligning these findings with the emerging data on Milestones 2.0 remains an area for future work.28 While our study identifies latent groups displaying different learning trajectories among residents, it is important to note that the underlying causes for these variations remain speculative at this stage. 
Further research is needed to explore the complex interplay of clinical experience, educational environment, personal factors, feedback practices, programmatic interventions, and other variables that may contribute to the observed variability in resident learning trajectories.

fulltextpubmed· Discussion· item 39869633

This study demonstrates the longitudinal reliability of ACGME Milestones data and highlights the heterogeneity in learning trajectories among anesthesiology residents. The findings underscore the importance of using a programmatic assessment going beyond simply identifying individuals who may be struggling. Programs can use these data to identify learners with slower or faster growth in specific subcompetencies and implement tailored interventions, such as mentorship, workshops, or enhanced clinical exposure (e.g., a subset of learners consistently exhibits slower progression in a specific subcompetency, such as professionalism). Additionally, programs excelling in certain areas can share the best practices to improve training nationally, demonstrating the value of Milestones in guiding individualized education and program improvement. These patterns can also inform the timing and nature of targeted interventions for both residents and programs. Programmatic assessment is a comprehensive approach that optimizes learning, decision-making, and curriculum quality.29 It uses carefully chosen assessments as data points, with feedback to learners being key to maximizing their value. Further research is needed to explore the factors that influence learning trajectories and to develop interventions that can effectively address the diverse needs of residents.

fulltextpubmed· Discussion· item 39869633

Support was provided solely from institutional and/or departmental sources. The authors declare no competing interests.

fulltextpubmed· Supplemental Digital Content· item 39869633

Supplemental Figure 1. Box plots of Milestone levels by reporting period, https://links.lww.com/ALN/D839 Supplemental Figure 2. Growth trajectories of latent groups, https://links.lww.com/ALN/D840 Supplemental Table 1. Quadratic growth curve analysis, https://links.lww.com/ALN/D841