Abstract
Background: Orthopaedic surgery is a male-dominated
field. As of 1998, women accounted for 42% of medical school graduates,
yet only 6.9% of the total number of orthopaedic residents were
female. The purpose of our study was to determine whether the Electronic
Residency Application Service charts of female candidates for orthopaedic
residencies are ranked lower by faculty reviewers than are those
of male candidates with similar qualifications.
Methods: After we obtained permission from the applicants, the
Electronic Residency Application Service applications submitted
by ninety male and ten female candidates for admission to a university orthopaedic
residency program for the 1998 National Residency Matching Program
were randomly divided into ten groups, consisting of the charts
of nine male candidates and one female candidate. Each chart from
a female candidate was altered into a "male" version, in which all
names and personal pronouns were changed but which was otherwise
identical to the original female version. Therefore, each group
of ten charts existed as a paired set: one containing the true female
chart and one, the altered "male" chart. The paired sets acted as
their own control. One hundred and twenty-one faculty reviewers
from fourteen orthopaedic residency programs around the United States
each reviewed either the "male" or the female version of one set,
without knowledge of the goals of the study, and ranked the ten
charts in the order in which they would like to have the candidates
as residents in their own programs. Each version of the sets was
reviewed by at least five separate reviewers. Reviewers at a given
institution were randomized to review different sets, so that there
was no overlap among them. The rankings of the female-"male" pairs
were compared with use of a standard paired t test.
Results: No significant difference was detected
in the rankings of the female and "male" charts (p = 0.5). The mean
difference in rankings was -0.33, with a 95% confidence interval
ranging from -1.41 (favoring females) to 0.74 (favoring "males").
Conclusions: The low percentage of female residents
is not due to bias against female applicants in the initial chart-review
phase of the orthopaedic residency selection process. It is possible
that bias is introduced in other stages of the selection process,
such as the interview.
Orthopaedics is a male-dominated field. As of 1998, women
accounted for 42% of medical school graduates, but only 6.9% of
the total number of orthopaedic residents were female [1]. This percentage was lower than that
in any other medical subspecialty except cardiothoracic surgery (5.5%) [1]. To put this in further perspective,
it is useful to consider the percentages of female residents in other
surgical subspecialties (Table I). More than 60% of the residents
in obstetrics and gynecology and 20% of the residents in general
surgery are women (Table I). Interestingly, even urology
and neurosurgery, often considered to be fields relatively inhospitable to
women, have higher percentages of female residents than does orthopaedics [1].
Several theories may account for this trend. Only about 10% of the
applicants to orthopaedic residencies are women [2].
This figure indicates that women do not develop an interest in the
field either prior to or during medical school. It may reflect a
simple lack of exposure to the field, as the curriculum in many
medical schools does not include a formal didactic block on musculoskeletal
medicine and clinical rotations on the orthopaedic service are generally
brief and elective [3,4].
This lack of exposure to orthopaedics can lead to a misunderstanding
of what the field encompasses and what it entails for its practitioners [4]. Medical students may erroneously
believe that all of orthopaedics is sports medicine or that skill
or interest in athletics, mechanics, or carpentry is a prerequisite
for entering the field. There also may be a generalized misconception
that orthopaedics requires great physical size, strength, and stamina [5].
Finally, female medical students interested in orthopaedics perceive
a lack of mentors and peers in the field [4].
If they are to pursue orthopaedic careers, they must feel comfortable
with the potential of being the only woman, or one of few women,
in a department.
The present study focused on women who made the choice to apply
to an orthopaedic residency program. They represent a visible minority
in the applicant pool. In the 2000 National Residency Matching Program,
there were 1116 applicants for 554 postgraduate year-one orthopaedic
positions. Of those, 1004 were male, 108 were female, and four did
not report their gender. In addition, 878 of them were graduates
of medical schools in the United States. Since 93% of the available
positions were filled by graduates of schools in the United States,
the effective ratio of applicants to positions [6] was
1.58:1. Interestingly, neither the Electronic Residency Application
Service nor the National Residency Matching Program keeps statistics
on the breakdown of successful applicants by gender [6], so we were unable to obtain data
on the percentages of male and female applicants who ultimately matched.
Also, data on applicant gender were not available for the National
Residency Matching Program for 1998, the year of the current study [6].
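As a rough check of the applicant-to-position arithmetic quoted above, the short sketch below recomputes the ratio from the figures given in the text. The reported 1.58:1 is consistent with dividing the 878 United States graduates by all 554 positions. The text does not state which denominator was used, so the second ratio, which counts only the 93% of positions filled by United States graduates, is shown purely as an assumption for comparison. All variable names are illustrative.

```python
# Figures quoted in the text for the 2000 National Residency Matching Program.
us_graduate_applicants = 878
total_positions = 554
share_filled_by_us_graduates = 0.93

# Ratio of United States graduates to all available positions.
ratio_all_positions = us_graduate_applicants / total_positions

# Alternative (an assumption, not stated in the text): count only the
# positions that were filled by United States graduates.
ratio_us_filled_positions = us_graduate_applicants / (
    total_positions * share_filled_by_us_graduates
)

print(f"{ratio_all_positions:.2f}:1")        # 1.58:1
print(f"{ratio_us_filled_positions:.2f}:1")  # 1.70:1
```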
The purpose of our study was to determine whether there is a
bias against women in the initial review of applications. In other
words, we sought to ascertain whether female candidates for orthopaedic
residency programs are ranked equally with male candidates who have
similar qualifications as stated on the standardized application
currently in widespread use. Our hypothesis was that female candidates
for orthopaedic residency programs are ranked lower than their male
counterparts.
The study was designed with assistance from the University of
Chicago Department of Health Studies and was approved by the University
of Chicago Institutional Review Board. The study was performed after
the 1998 National Residency Matching Program had concluded. With
use of a permission form sent by e-mail, we asked all 397 applicants
to the University of Chicago orthopaedic residency program in the
1998 National Residency Matching Program for permission to use their
Electronic Residency Application Service charts. Thirty-three applicants
(8.3%) were women. The charts of ninety male and ten female applicants were
randomly chosen from the total of 105 (ninety-three men and twelve
women) who had given permission. These 100 charts then were randomly
divided into ten groups, each consisting of the charts of nine male
applicants and one female applicant.
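The selection and grouping step described above can be sketched in a few lines. The text does not say how the random selection was carried out, so this is only one possible way to do it, assuming simple random sampling without replacement. The function name, chart identifiers, and seed are hypothetical.

```python
import random


def build_groups(male_ids, female_ids, n_groups=10, males_per_group=9, seed=None):
    """Randomly choose charts and split them into groups of nine male charts
    and one female chart, as described in the text."""
    rng = random.Random(seed)
    # Sample without replacement; the sampled order is itself random.
    males = rng.sample(male_ids, n_groups * males_per_group)
    females = rng.sample(female_ids, n_groups)
    groups = []
    for i in range(n_groups):
        start = i * males_per_group
        group = males[start:start + males_per_group] + [females[i]]
        rng.shuffle(group)  # avoid always placing the female chart last
        groups.append(group)
    return groups


# Hypothetical identifiers for the 93 men and 12 women who gave permission.
male_ids = [f"M{i:02d}" for i in range(1, 94)]
female_ids = [f"F{i:02d}" for i in range(1, 13)]

groups = build_groups(male_ids, female_ids, seed=1998)
for number, group in enumerate(groups, start=1):
    print(number, group)
```

Each group then contains ten charts, exactly one of which belongs to a female applicant, and no chart appears in more than one group.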
All charts consisted of the Electronic Residency Application
Service cover page, a medical school transcript, a personal statement,
the United States Medical Licensing Examination Step-I score, and two
letters of recommendation. The applicants' names remained in the
chart, but other identifying or personal information, such as social
security number, address, and telephone number, was removed. The
signatures of the individuals who wrote letters of recommendation
also were removed, so that the content of the letters, and not their
source, was the only factor considered in the evaluation.
The chart of each female applicant was digitized and computer-manipulated
into a "male" version, in which all names and personal pronouns
were changed but which was otherwise identical to the original female
version. Thus, each group of ten charts existed as a paired set:
one containing the true female chart and one, the altered "male"
chart. The paired sets acted as their own control.
The chairmen of nineteen large orthopaedic residency programs
around the United States were then solicited by the senior author
for the participation of their faculty as chart reviewers in the
study. The University of Chicago was not included. For the purposes
of randomization and statistical analysis, it was necessary for
each institution to commit ten faculty reviewers. Thus, the major
factor in the decision to approach a particular chairman was the number
of faculty members in his department. It was also necessary for
faculty reviewers to remain unaware of the nature, design, and goals
of the study. Therefore, although the chairmen received a protocol
and written description of the study design, they were asked not
to share that information with their faculty. The reviewers were
told simply that the study concerned the factors that contribute
to the formation of an institutional match-rank list, and the chairmen
were instructed not to provide them with any additional information.
Several potential coauthors of the study thought that the study
design was unethical in that it necessitated deception of the reviewers.
However, as noted above, the study was approved by the Institutional
Review Board of the University of Chicago, and no chairman declined
to volunteer his faculty members because of ethical concerns.
One chairman declined to participate because "gender selection
has not been an issue" at his institution, and two never answered
the original inquiry. Two institutions were unable to supply ten
reviewers, but they were held in reserve in case an incomplete response
was received from other institutions. Fourteen institutions originally
agreed to participate, but one later dropped out because the necessary
time commitment proved too onerous for the faculty. One reserve
institution was then recruited in its place; thus, fourteen institutions
participated in the study. This represents a compliance rate of 74%.
Prior to recruitment of the reviewers, a power analysis was performed
to determine the minimum number of reviewers required. For 90% power
to detect a difference of 2 in the rankings, which were based on
a scale of 1 to 10, four reviewers were needed for each male or
female version of each set; thus, a total of eighty reviewers (10
× 4 × 2) were required. If rankings were assumed to be uniformly distributed
over a 5-point range, then five reviewers for each male or female
version of each set (a total of 100 reviewers) would provide 90%
power to detect a difference of 1. Since fourteen institutions indicated
a willingness to participate, there were potentially seven reviewers
available for each gender version of each set, thereby providing
additional statistical power.
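The reviewer counts in the preceding paragraph follow from simple multiplication, which the sketch below reproduces. It does not attempt the power calculation itself, and it assumes that packets are spread evenly across the two versions of each set. The function and constant names are illustrative.

```python
N_SETS = 10      # groups of ten charts
N_VERSIONS = 2   # true female chart and altered "male" chart


def reviewers_needed(per_version):
    """Total reviewers when each version of each set needs `per_version` readers."""
    return N_SETS * N_VERSIONS * per_version


def reviewers_per_version(n_institutions, reviewers_per_institution=10):
    """Readers available per version of each set, assuming an even spread."""
    return n_institutions * reviewers_per_institution / (N_SETS * N_VERSIONS)


print(reviewers_needed(4))        # 80
print(reviewers_needed(5))        # 100
print(reviewers_per_version(14))  # 7.0
```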
Each institution was randomized, with use of a standard randomization
table, to receive either the "male" or the female version of each
set of charts. Faculty members at a given institution each reviewed
different sets so that there was no overlap among them and no intra-institutional
comparisons of the charts could be made.
The chairmen each received a box containing ten packets, and
they were asked to distribute one packet to each participating faculty
member. The packets each contained ten charts, a ranking sheet backed
with a demographic questionnaire, and written instructions. The
reviewers were asked simply to read the charts and, on the basis
of the information in them, to rank the candidates on a scale of 1
to 10, with 1 being the best, in the order in which they would like
to have the candidates as residents in their own program. The reviewers
were asked to complete the task in one sitting, at their convenience,
within a two-month period. Each reviewer also was asked to complete
a brief multiple-choice questionnaire so that we could compile demographic
data about the reviewers as a group. Completed ranking sheets and
questionnaires were given back to the chairmen, who returned them
in the envelope provided.
The reviewers and chairmen were assured that each individual's
participation in the study would be anonymous. At no time did the
investigators ask for the reviewers' names. In addition, the randomization
key was held by an outside party so that the investigators remained
unaware of which packets were sent to which institutions. Furthermore,
the ranking sheets were returned to an outside party so that the
investigators did not know from which institution the individual
responses had come.
The rankings of the female-"male" pairs were compared with use
of a standard paired t test. A statistical analysis also was performed
to determine whether the demographic characteristics of the reviewers
played a role in the ranking of the candidates.
One hundred and twenty-one (86%) of the 140 reviewers completed
the chart rankings, and 111 (79%) completed the demographic questionnaire. The
results of the demographic analysis showed that the reviewers were
predominantly young white men: 91% were white, 97% were male, and
71% were fifty years old or younger (Table II). In addition, 72% of the
reviewers ranked at or below the associate professor level.
The results of the rankings for the ten paired sets of charts
are given in Table III. Each chart was reviewed
by five, six, or seven reviewers. The mean ranking (and standard
deviation) was 5.10 ± 2.01 for the ten female
charts as a group and 5.43 ± 2.39 for the
ten "male" charts, yielding a mean female-male difference of -0.33 ± 1.51, with a 95% confidence interval ranging from -1.41
(favoring females) to 0.74 (favoring males). These data were analyzed
with use of a paired t test over the ten sets. The difference between
the rankings of the female and "male" charts was not found to be
significant (p = 0.5).
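The confidence interval reported above can be approximately reproduced from the published summary values alone. The sketch below is not the authors' analysis. It assumes the usual paired t formulas, and it hard-codes the two-sided 5% critical value for nine degrees of freedom (2.262) from a standard t table so that no statistics library is needed.

```python
import math

n_pairs = 10        # ten female-"male" chart pairs
mean_diff = -0.33   # mean female-minus-"male" difference in rank
sd_diff = 1.51      # standard deviation of the paired differences

# Two-sided 5% critical value of Student's t with n_pairs - 1 = 9 degrees of
# freedom, taken from a standard t table (an assumption of this sketch).
T_CRIT_DF9 = 2.262

standard_error = sd_diff / math.sqrt(n_pairs)
t_statistic = mean_diff / standard_error
ci_lower = mean_diff - T_CRIT_DF9 * standard_error
ci_upper = mean_diff + T_CRIT_DF9 * standard_error

print(f"t = {t_statistic:.2f}")
print(f"95% CI: {ci_lower:.2f} to {ci_upper:.2f}")
```

With these inputs the interval runs from about -1.41 to about 0.75, which agrees with the reported interval to within rounding of the summary values, and the absolute t statistic (about 0.69) falls well below the critical value, consistent with a nonsignificant difference.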
Even in the cases of the two individual sets of charts with the
greatest spread between the rankings of the female and "male" versions
(candidates B and H, Table III), the differences did
not achieve significance (p = 0.096 for candidate B, and p = 0.067
for candidate H). Interestingly, in both cases, the female chart ranked
better than the "male" chart.
When the data were explored further, an ordered logistic model
revealed an association between the rank given to the female versions
of the charts and the age-group of the reviewer (likelihood ratio:
chi-square test, 12.05, p = 0.007). Compared with the reviewers
in the forty-one to fifty-year age-group, all other age-groups had
higher odds of giving the female candidates higher numerical ranks (that
is, worse rankings). Female applicants were more likely to be ranked
well by reviewers in the forty-one to fifty-year age-group than
by reviewers in other age-groups.
An analysis of variance of the model including both the age-group
of the reviewer and an indicator for each female student yielded
a p value of 0.12. Although this was not significant, it appeared
that there may be some ranking differences due to the age of the
reviewer after adjustment for the candidate differences. From the
coefficients of the analysis of variance model, we concluded that,
even after adjusting for the candidate differences, the reviewers
in the forty-one to fifty-year age-group were somewhat more likely
to give the female candidates a better ranking than were the reviewers
in other age-groups. A similar analysis indicated no significant
association between the academic rank of the reviewer and the rank
given to the female applicant (p = 0.45) or between the orthopaedic
subspecialty of the reviewer and the rank given to the female applicant
(p = 0.93).
Women are underrepresented in orthopaedic residency programs.
Since almost 50% of current medical school graduates are women,
the small number of women among orthopaedic residency recruits is tantamount
to allowing half of the available applicant pool to be ignored [7].
Numerous studies have been done on factors involved in residency
recruitment [8-20]. The introduction
of the Electronic Residency Application Service system has streamlined
the process, and it allows for simpler comparisons among candidates,
as their credentials are presented in a uniform fashion [16,21]. In general, it has been found
that the more competitive a specialty is, the more heavily reviewers rely
on objective criteria, such as United States Medical Licensing Examination
scores, class rank, and Alpha Omega Alpha standing, in an initial chart
review [19]. In addition, surgical
specialties weigh objective criteria more heavily than do nonsurgical
specialties [16]. The interview process
is important, but it may be evaluated in one of two ways: either
as independent from the chart review (in other words, the playing field
is leveled among the candidates invited to interview) or as one
component of a rating system that incorporates both the chart review
and the interview [8-11,16,17,19,20].
Furthermore, a favorable chart review is a prerequisite to obtaining
an invitation to interview [16,22-26].
No applicant becomes an orthopaedic resident without interviewing;
therefore, the initial chart review is critical.
Orthopaedics, a highly competitive surgical subspecialty (99.3%
of available openings are filled in the match), tends to put a high
value on objective criteria [19].
Assuming that to be the case, we hypothesized that differences in
ranking among candidates with the same objective credentials would
reflect bias toward or against the candidates' subjective attributes.
Our hypothesis was that female candidates would be ranked less well
than their male counterparts, reflecting subtle or perhaps unconscious
doubts about their suitability for the field and the reviewers'
desire to maintain the status quo.
However, our hypothesis was not borne out. The rankings of female
and "male" applicants were essentially the same, with a nonsignificant
trend in favor of the women. Taken as a group, the ten candidates
received a mean rank of about 5 of 10. The mean rankings of the
individual candidates ranged from 1.2 to 8. The fact that there
was a range of rankings, and that the ranks of the female and "male"
charts of the pairs tended to agree closely, indicates that reviewers
did not sense anything "suspect" in the altered charts.
In designing the study, we chose to focus on one variable, gender,
in order to generate sufficient power while utilizing a manageable
number of charts and reviewers. Although simultaneously testing
another variable (for example, race or ethnicity) would have been
interesting, we did not consider it feasible because of the increased
logistical and statistical complexity that this would have caused.
Similarly, as an additional control, we considered simultaneously
altering the charts of ten male applicants into the charts of "female"
applicants and testing them as well, but we would have needed to
double our reviewer pool in order to ensure that there was no intra-institutional
overlap of charts among reviewers, while maintaining a ratio of
one chart in ten from a female candidate. This ratio represents
the makeup of the actual candidate pool, and we believed that exceeding
it might have raised reviewers' suspicions about the purpose of
the study.
The demographic characteristics of the reviewers were essentially
as predicted, with a predominance of white men. The slight association
between reviewer age and ranking, with reviewers between forty-one
and fifty years old more likely to rank female candidates well,
was surprising. One explanation for this finding is that reviewers
in this age-group are likely to have daughters of an age to be participating
in higher education and job competition and have therefore had their
"consciousness raised."
There are several possible problems with our study. We had no
way to ascertain that the reviewers remained blinded to the purpose,
nature, and goals of the study. Although precautions were taken,
such as repeatedly reminding program chairmen not to divulge
information, maintaining a realistic percentage of charts of female
candidates in the pool, and not divulging to reviewers that the
principal investigator for the study was female, it is certainly possible
that some or all of the reviewers either were told or surmised the
nature of the study.
It is also possible that some or all of the reviewers recognized
or knew candidates whose charts they were asked to review. There
were two ranking sheets on which reviewers had indicated that they did,
in fact, personally know candidates whom they were asked to rank.
In both cases, the reviewers had declined to rank those particular
candidates. Also, in both cases, the known candidates were not test subjects
but were simply among the nine male applicants with standard charts.
Also, there may have been some selection bias inherent in the
participation of orthopaedic departments in the study. The nineteen
program chairmen initially contacted were chosen primarily on the basis
of the size of their programs but also, in two cases, on the basis
of the senior author's personal acquaintance with them and his perception
of their interest in the issue of diversity in orthopaedics. However,
the views of a chairman do not necessarily reflect those of individual
faculty members. Moreover, the attitudes of faculty at smaller,
non-university, or military residency programs may differ from those
of our sample.
Furthermore, our study addressed only the initial step of the
resident recruitment process, that is, chart review. Clearly, the
personal interview is also important, and once a candidate is invited
for the interview, female gender may become either a liability or
an asset. We were unable to devise a simple, practical way to test
this, as a traditional face-to-face interview cannot easily be blinded.
However, in a competitive residency such as orthopaedics, simply
obtaining the interview is crucial because the majority of the candidates
graduating from United States medical schools who are interviewed
are eventually matched with a residency program. In fact, several
studies [16,22-26] have demonstrated
that the invitation to interview (and not the interview itself)
constitutes the most critical aspect of the selection process, and
these invitations are issued primarily on the basis of review of
the Electronic Residency Application Service chart.
Finally, the question of female residents in orthopaedic programs
may not ultimately come down to whether to accept women in general,
or a specific woman in particular, but, rather, to what constitutes a
"critical mass" of female residents that program faculty feel comfortable
with. In other words, although members of a department may feel
comfortable admitting one female resident per year, would they feel
comfortable if half of the residents were women? What about three-quarters? [27] Our study cannot gauge the effect,
if any, that such a bias would have.
Our study showed that, during the initial evaluation of Electronic
Residency Application Service charts, there was no reviewer bias
contributing to the lack of female orthopaedic residents. Other
factors account for the lack of female orthopaedic surgery residents,
and additional efforts should be made to familiarize female students
with the profession early so that they can position themselves as
viable residency candidates.
Note: The authors gratefully acknowledge the contributions of
Patrick Getty, MD, for assistance in conceptualizing the project;
Theodore Karrison, PhD, Maria-Antonia Robertson, PhD, and Xiling Liu,
MS, for help with the study design and statistics; Terri Smith for
logistical and administrative support; the participants in the Electronic
Residency Application Service Matching Program for allowing use
of their charts; and the chairmen and faculty of the following orthopaedic
residency programs for generously donating their time and effort: Duke
University, Harvard Combined Orthopaedic Residency Program, Loyola
University, Mayo Clinic, New York University/Hospital for Joint
Diseases, Northwestern University, University of Iowa, University
of Minnesota, University of Pennsylvania, University of Texas/San
Antonio, University of Texas/Southwestern Medical School, University
of Washington, Vanderbilt University, and Washington University.
1. Jolly P, Hudley DM, editors. AAMC data book: statistical information related to medical education. Washington, D.C.: Association of American Medical Colleges; 1999. Table F9.
2. Jolly P, Hudley DM, editors. AAMC data book: statistical information related to medical education. Washington, D.C.: Association of American Medical Colleges; 1998. Table B7.
3. Freedman KB, Bernstein J. The adequacy of medical school education in musculoskeletal medicine. J Bone Joint Surg Am. 1998;80:1421-7.
4. Rogers C. Leading the way. AAOS Bull. 1999;April:43-4.
5. Mankin HJ. Diversity in orthopaedics. Clin Orthop. 1999;362:85-7.
6. Husbands N (Acting Director of the Electronic Residency Application Service). Personal communication, 2000.
7. Simon MA. Racial, ethnic, and gender diversity and the resident operative experience. How can the American Orthopaedic Society shape the future of orthopaedic surgery? Clin Orthop. 1999;360:253-9.
8. Aghababian R, Tandberg D, Iserson K, Martin M, Sklar D. Selection of emergency medicine residents. Ann Emerg Med. 1993;22:1753-61.
9. Curtis DJ, Riordan DD, Cruess DF, Brower AC. Selecting radiology resident candidates. Invest Radiol. 1989;24:324-30.
10. Galazka SS, Kikano GE, Zyzanski S. Methods of recruiting and selecting residents for U.S. family practice residencies. Acad Med. 1994;69:304-6.
11. Marciani RD, Smith TA, Kohn MW. Survey of resident-selection procedures for oral surgery graduate programs. J Oral Surg. 1976;34:784-8.
12. McNevin S, Leichner P. Factors effecting the recruitment of residents: the residents' and residency directors' view. Can J Psychiatry. 1983;28:449-52.
13. Metheny WP, Ling FW, Holzman GB, Mitchum MJ. Answers to applicant selection from a directory of residency programs in obstetrics and gynecology. Obstet Gynecol. 1996;88:133-6.
14. Ross CA, Leichner P. Criteria for selecting residents: a reassessment. Can J Psychiatry. 1984;29:681-6.
15. Sklar DP, Tandberg D. The value of self-estimated scholastic standing in residency selection. J Emerg Med. 1995;13:683-5.
16. Taylor CA, Weinstein L, Mayhew HE. The process of resident selection: a view from the residency director's desk. Obstet Gynecol. 1995;85:299-303.
17. Wagoner NE, Gray GT. Report on a survey of program directors regarding selection factors in graduate medical education. J Med Educ. 1979;54:445-52.
18. Wagoner NE, Suriano JR. Recommendations for changing the residency selection process based on a survey of program directors. Acad Med. 1992;67:459-65.
19. Wagoner NE, Suriano JR. Program directors' responses to a survey on variables used to select residents in a time of change. Acad Med. 1999;74:51-8.
20. Wagoner NE, Suriano JR, Stoner JA. Factors used by program directors to select residents. J Med Educ. 1986;61:10-21.
21. Taylor CA, Mayhew HE, Weinstein L. Residency directors' responses to the concept of a proposed electronic residency application service. Acad Med. 1994;69:138-42.
22. Baker JD 3rd, Bailey MK, Brahen NH, Conroy JM, Dorman BH, Haynes GR. Selection of anesthesiology residents. Acad Med. 1993;68:161-3.
23. Edwards JC, Currie ML, Wade TP, Kaminski DL. Surgery resident selection and evaluation. A critical incident study. Eval Health Prof. 1993;16:73-86.
24. Fiedler IG, Klingbeil G. A recruitment model for selecting residents. Acad Med. 1991;66:476-8.
25. George JM, Young D, Metz EN. Evaluating selected internship candidates and their subsequent performance. Acad Med. 1989;64:480-2.
26. Villanueva AM, Kaye D, Abdelhak SS, Morahan PS. Comparing selection criteria of residency directors and physicians' employers. Acad Med. 1995;70:261-71.
27. Wagoner NE. Personal communication, 1999.