Modeling Scientific Workforce Diversity

National Institute of General Medical Sciences
Natcher Conference Center, Room B
October 3, 2007


As part of its efforts to promote diversity in the biomedical workforce, NIGMS convened a group of scientists to evaluate the feasibility of creating a computer model of the scientific workforce as a guide for policy makers. A review of research findings on the participation of racial and ethnic minorities in science highlighted both lingering patterns of underrepresentation and change over time, and the group discussed how policy interventions and other social factors can have an impact on the degree of disparity. They concluded that current limitations in the state of knowledge about forces influencing science career decisions constrain efforts to build a comprehensive model of the factors shaping the scientific workforce. In the near term, it would be most useful to construct focused models to produce insight into career dynamics, help refine research questions, and encourage the collection of relevant data.

1. Background

The National Institute of General Medical Sciences (NIGMS) is committed to developing a diverse biomedical workforce and is employing data-driven, scientifically rigorous tools to achieve this goal. To inform program development and stimulate progress toward a more inclusive biomedical workforce, NIGMS has recently expanded its efforts to encourage research into the efficacy of interventions designed to increase diversity.

On October 3, 2007, NIGMS convened a working group to assess the feasibility of computer modeling to guide policy makers in their efforts to increase workforce diversity. The working group consisted of experts in scientific workforce development; training, recruiting, and retaining minority students and faculty; sources of data; and computational modeling.

In his charge to the group at the start of the meeting, NIGMS Director Jeremy Berg identified the need to look beyond individual programs in isolation and pursue a systems-based analysis. Advances in computational methods and system sciences suggest that it may be possible to build useful models of complex social dynamics such as scientific workforce diversity. Recent endeavors, such as the Models of Infectious Disease Agent Study or MIDAS, have demonstrated the value of modeling complex social systems, and modeling tools are also being used by other components of the NIH to study other workforce issues such as the aging of the NIH grantee pool.

2. Data and Research Findings on Diversity in the Scientific Workforce

As a starting point for the discussion, and as a way of introducing the modeling experts to the basic parameters of the problem, social and biological scientists summarized recent research on the representation of racial and ethnic minority group members in science. Research findings on rates of participation at various levels of education and career development were presented.

In his overview of what is known about successful efforts to reduce educational disparities, Daryl Chubin described the multiplicity of factors that influence outcomes, ranging from individual attitudes and tastes to market factors such as job opportunities and pay scales. Some disciplines, like business and the social sciences, have made greater progress toward diversity than have other fields. Rigorous mathematics and science education at the middle school level, role models, mentors, and institutional commitments are critical to increasing the participation of underrepresented groups in science. Efforts to correct the problem can focus on the students, the educational institutions or the pathways between career stages, as there are losses at all key career transition points.

The data on education and career outcomes demonstrate a pressing need for action, but Anthony DePass pointed out that there are striking deficiencies in our knowledge of the underlying dynamics that produce the observed disparities. We know that some minority groups are still underrepresented at all stages of science education, but efforts to increase the number of minorities in science have been more successful at the baccalaureate and master's levels than at the doctoral level. Clearly, some efforts are succeeding, and we need to know more about what is responsible for both the encouraging and the disappointing outcomes. As chair of the NIGMS-sponsored National Academies workshop, "Understanding Interventions That Encourage Minorities to Pursue Research Careers," DePass noted that a significant amount of effort is expended in the evaluation of individual programs. Unfortunately, many evaluation projects have not contributed to systematic data collection, hypothesis-driven research, or the accumulation of a solid body of research findings that can inform policy actions.

NIGMS Council Member Paula Stephan directed attention to some of the large-scale social and economic forces that affect the size and composition of the scientific workforce. She demonstrated that the recent growth in the number of Ph.D.s awarded to members of underrepresented minority groups is closely related to the increasing number of science bachelor's degrees earned by students from these populations. Graduate school enrollments are influenced by the strength of the economy and the availability of jobs, the size of fellowship stipends, and the length of training. Levels of student debt and the perception of career opportunities are also likely to influence recruitment and retention. Pointing to data on the declining probability of obtaining tenure-track positions and NIH funding, she cautioned that recruitment and retention efforts will be constrained by perceptions of limited career opportunities.

Several presentations demonstrated that scientists from underrepresented minority groups are less likely to be employed at the most highly ranked research institutions and are more likely to be employed at schools that emphasize undergraduate education. Joan Robinson identified institutional factors that constrain research careers at these schools, including heavy teaching loads and limited research infrastructure, and gave examples of some of the efforts that Morgan State University and other schools use to contend with them. She described how many faculty members at teaching institutions, confronted with the significant challenge of competing with scientists from more affluent settings, make important contributions to research through collaborations with researchers at other institutions.

Lisa Frehill identified some of the major data sources used to analyze workforce issues and diversity in science. Her overview of major trends found persisting disparities in career attainment along with evidence of their reduction in some scientific fields and underrepresented populations. In recent years, more members of underrepresented minority groups have been obtaining postsecondary faculty positions, but this change has been taking place largely outside of the research universities that are most competitive for research grants.

At several points in the discussion, panel members cautioned that modeling should not be confused with forecasting. Exogenous factors (e.g., social and economic events like the Vietnam War era draft, the emergence of the biotechnology industry, and the “dot com” boom) cannot be predicted but will have a large effect on career decisions.

In its specific concern for underrepresented minorities, NIGMS was advised to consider the general characteristics of the labor market. The biomedical research enterprise is largely staffed by graduate students and postdoctoral fellows and currently produces more trained scientists than can be employed in academic research. The length of training, in combination with the fierce competition for faculty positions and research funding, is seen as a major deterrent to the pursuit of academic research careers. In view of this situation, several working group members questioned whether NIH’s criteria for the success of its training programs should be expanded to include other options in addition to NIH-supported research. NIH Deputy Director Raynard Kington responded with his view that – while there are many ways to contribute to research – the goal of NIH-funded training is increased participation in major NIH research grant programs, and this objective should not be abandoned for underrepresented minorities.

3. Modeling

NIH is currently exploring modeling approaches in connection with other policy concerns. Walter Schaefer described a time series model created by NIH’s Budget Office to illustrate the aging of the scientific workforce. Using data from grant records, the model estimates the number of individuals in the grant system at various points in time and their likelihood of making the transition to the next stage.
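The idea of estimating the likelihood of moving from one stage to the next can be illustrated with a minimal sketch. This is not the NIH Budget Office model, whose details were not presented here; the stage names and records below are hypothetical and invented purely for illustration. The sketch simply counts observed year-to-year moves and converts them into frequencies.

```python
from collections import Counter

# Hypothetical records: (stage in year t, stage in year t+1), one per person.
# Stage names and data are invented for illustration only.
RECORDS = [
    ("postdoc", "first_grant"),
    ("postdoc", "postdoc"),
    ("postdoc", "left"),
    ("postdoc", "postdoc"),
    ("first_grant", "renewal"),
    ("first_grant", "left"),
    ("first_grant", "first_grant"),
    ("first_grant", "renewal"),
]


def transition_rates(records):
    """Estimate P(next stage | current stage) as observed frequencies."""
    pair_counts = Counter(records)
    stage_counts = Counter(current for current, _ in records)
    return {
        (current, nxt): n / stage_counts[current]
        for (current, nxt), n in pair_counts.items()
    }


rates = transition_rates(RECORDS)
for (current, nxt), p in sorted(rates.items()):
    print(f"{current} -> {nxt}: {p:.2f}")
```

With real grant records, the same counting would be repeated for each year so that changes in the rates over time become visible.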

Another modeling approach, agent-based modeling (ABM), was suggested by Ross Hammond. Agent-based models begin with a population of individuals (agents) who behave according to a set of known, plausible rules. Agents in an economic model, for example, might have rules based on income and employment, while agents in a disease model may make health decisions dependent on their age and gender. Studies of a wide range of complex social phenomena (including epidemics, civil violence, cigarette smoking, ethnocentrism, and retirement decision-making) have taken advantage of the flexibility of agent-based approaches.
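A minimal sketch of the agent-based idea, assuming a toy decision rule, might look like the following. Nothing here comes from the presentations: the `Student` class, the `debt` and `has_mentor` attributes, and every coefficient in the rule are hypothetical placeholders chosen only to show the structure of a population of agents that each apply a rule at every time step.

```python
import random


class Student:
    """An agent with two attributes that feed a simple decision rule."""

    def __init__(self, debt, has_mentor):
        self.debt = debt              # hypothetical debt level in [0, 1]
        self.has_mentor = has_mentor  # hypothetical mentoring flag
        self.in_science = True

    def step(self, rng):
        """Apply the rule once: debt raises exit risk, mentoring lowers it."""
        if not self.in_science:
            return
        # Illustrative coefficients only; not estimated from any data.
        p_leave = 0.05 + 0.20 * self.debt - (0.05 if self.has_mentor else 0.0)
        if rng.random() < max(p_leave, 0.0):
            self.in_science = False


def run(n_agents=1000, n_steps=5, mentor_share=0.3, seed=0):
    """Simulate a population and return how many agents remain in science."""
    rng = random.Random(seed)
    agents = [
        Student(debt=rng.random(), has_mentor=rng.random() < mentor_share)
        for _ in range(n_agents)
    ]
    for _ in range(n_steps):
        for agent in agents:
            agent.step(rng)
    return sum(agent.in_science for agent in agents)


print("remaining, 30% mentored:", run(mentor_share=0.3))
print("remaining, 90% mentored:", run(mentor_share=0.9))
```

The value of such a sketch lies less in its output than in the questions it forces: each coefficient in the rule is an assumption that would have to be replaced by an empirical estimate.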

Stephen Eubank contrasted agent-based models with statistical or time series models, pointing out that one would need to know a great deal more about decision-making in order to build a complete dynamic model of the scientific workforce. He noted, however, that there are compelling reasons to consider a series of focused models to look at specific questions (debt, decision-making, cultural context, or perceptions) for which the data requirements are less restrictive. The process of constructing such models will force the clarification of underlying assumptions, guide the collection of data, and encourage empirical testing of hypotheses. These focused models could then become the basis for the development of more comprehensive models over time.

The complexity of the decision-making process is likely to be a constraint in developing models of workforce diversity, and Jack Muckstadt cautioned that the accurate modeling of any system requires a deep understanding of its parts, processes, and contexts. Modeling diversity of the scientific workforce is no different; the first step is to thoroughly describe the system at every relevant stage. This requires better understanding of the many levels of decision-making, from individual to institutional, and of how decisions change across time and context. The model-building process, however, is very valuable because it will stimulate thinking about relationships, critical questions, and information needs.

4. Conclusions

There is a substantial body of data demonstrating the degree to which certain demographic groups are underrepresented in the scientific workforce. In recent years, some disparities have been reduced, but the overall rate of progress has been disappointing and improvements have not been uniform across career stages, demographic groups and scientific fields. And despite a considerable body of data on outcomes, policy makers still lack a solid body of research on the dynamics underlying these patterns. Specifically, modelers need more systematic and generalizable data to determine the impacts of intervention programs, better information about what influences student and institutional decisions, and longitudinal studies to describe individual career trajectories. These data limitations and gaps in the research base, along with the complexity of the process, constrain efforts to develop a comprehensive model of the process in the near term. The working group urged NIGMS to work with the National Science Foundation and other agencies to overcome these barriers.

These concerns notwithstanding, a concerted modeling effort would be valuable for program design and policy-making. Such an effort would direct attention to the underlying dynamics that produce successful scientists, help identify questions in need of additional research, and stimulate the collection of the data necessary to answer them.

The complexity of modeling the dynamics of the scientific workforce could be reduced by breaking the overall problem into a series of subsets and developing a suite of models. Different types of models could be used to provide insight into specific areas and, in the long run, additional efforts could be mounted to build upon the initial work. Drawing upon existing data resources, a continuum of models could be developed, each targeting a particular education stage or transition point (e.g., high school to undergraduate, or undergraduate to graduation).

While the academic job market is clearly an important focus of NIH training policy, this sector should not be viewed in isolation. The broader demand for biomedical scientists will have important implications for individuals making decisions to pursue scientific careers, both in academia and in other settings. Decisions about NIH training will also have societal and economic effects that will extend far beyond the pool of potential NIH grantees. Policy-makers need to understand the full range of consequences of their actions. As a valuable adjunct to its efforts to model the participation of underrepresented minorities, NIH should also conduct more comprehensive investigations of the career outcomes of its trainees. In addition, it should examine the broader social and economic returns to society from its investment in biomedical research and training, perhaps by constructing an economic model of the contributions of this research.

Coarse-grained models can be designed to look at the whole pipeline, focusing on long-term dynamics. A time series model, for example, might provide insight into workforce changes over time, demonstrating how many years it would take to reach diversity goals under various assumptions. It could also be used to contrast the effect of changing transition rates from one stage to another with the effect of changing the size of the pool of potential entrants into the stage.
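As a rough illustration of the "years to reach a goal" question, the sketch below tracks a single workforce stock year by year. It assumes a deliberately simple one-stage structure in which a fixed fraction of the workforce exits each year and new entrants equal a pool size multiplied by a transition rate. The function name, parameters, and all numbers are hypothetical and are not drawn from any of the data sources discussed at the meeting.

```python
def years_to_target(pool, rate, other_entrants, stock, other_stock,
                    target_share, exit_rate=0.05, max_years=200):
    """Count years until the group's share of the workforce reaches target_share.

    Each year a fraction exit_rate of both stocks leaves, pool * rate members
    of the group enter, and other_entrants members of all other groups enter.
    Returns None if the target is not reached within max_years.
    """
    for year in range(max_years + 1):
        if stock / (stock + other_stock) >= target_share:
            return year
        stock = stock * (1 - exit_rate) + pool * rate
        other_stock = other_stock * (1 - exit_rate) + other_entrants
    return None


# Hypothetical starting point: 500 of 10,000 scientists (5%), target of 8%.
base = dict(other_entrants=475, stock=500, other_stock=9500, target_share=0.08)

baseline = years_to_target(pool=1000, rate=0.05, **base)
higher_rate = years_to_target(pool=1000, rate=0.08, **base)   # better transitions
larger_pool = years_to_target(pool=1300, rate=0.05, **base)   # more entrants

print("baseline:", baseline)
print("higher transition rate:", higher_rate)
print("larger pool:", larger_pool)
```

In a one-stage sketch like this, only the product of pool size and transition rate matters, so equal proportional changes to either have the same effect. Distinguishing the two kinds of intervention by timing would require a multi-stage version in which changes at early stages take years to propagate.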

Fine-grained models could be designed to focus on career choices and risk. As part of this effort, a dialog involving researchers, model builders, and policy makers could identify critical research questions and data collection needs, and the model building would complement the NIGMS effort to encourage research into factors related to underrepresentation of minority groups in science (e.g., Research on Interventions that Promote Research Careers (R01) RFA-GM-08-005).

The working group was enthusiastic about using modeling methods to study scientific workforce diversity, but urged caution in regard to short-term expectations. They recommended that NIGMS: 1) work with NSF and other agencies to expand and improve the quality of data on scientific careers, 2) consider a process for identifying policy-relevant questions that are amenable to modeling given existing data, and 3) support more collection and analysis of data on current NIH training activities.


Attachment A: Committee Roster
Attachment B: Meeting Agenda
Attachment C: Data Sources Mentioned in Presentations
Attachment D: Links to Presentations and Data Resources

Attachment A: Committee Roster

Howard Garrison, chair
Federation of American Societies for Experimental Biology
9650 Rockville Pike
Bethesda, MD 20814
(301) 634-4650

Daryl Chubin
American Association for the Advancement of Science
1200 New York Avenue, NW
Washington, DC 20005
(202) 326-6785

Anthony DePass
Long Island University-Brooklyn
1 University Plaza
Brooklyn, NY 11201
(718) 488-1487

Stephen Eubank
Virginia Bioinformatics Institute
1880 Pratt Dr.
Virginia Tech University
Blacksburg, VA 24061-0477
(540) 231-2504

Lisa Frehill
Commission on Professionals in Science and Technology
1200 New York Ave, NW
Washington, DC 20005
(202) 326-7080

Ross Hammond
Brookings Institution
1775 Massachusetts Avenue NW
Washington, DC 20036-2103
(202) 797-6000

Catherine Millett
Educational Testing Service
Rosedale Road
Princeton, NJ 08541
(609) 734-5866

John Muckstadt
School of Operations Research
and Industrial Engineering
Cornell University
Ithaca, NY 14853

Joan Robinson
Morgan State University
Room 217
Calloway Hall
Baltimore, MD 21251
(443) 885-3350

Paula Stephan
Georgia State University
Department of Economics
Andrew Young School of Policy Studies
PO Box 3992
Atlanta, GA 30302-3992
(404) 651-3988

Attachment B: Meeting Agenda

8:30 am   Welcome                                                Irene Eckstrand
8:40 am   Charge to the group                                    Jeremy Berg
9:00 am   Goals for this meeting                                 Howard Garrison

What have we learned about the educational, social, cultural, and programmatic contexts of training URM students? What don’t we know?
9:15 am   Educational and social context of URM training         Daryl Chubin
9:35 am   Lessons learned from programs for URM scientists       Tony DePass

What is the problem? What do we need to know to develop and implement better programs? What insights might modeling provide?
10:00 am  Social and economic perspective                        Paula Stephan
10:20 am  BREAK
10:40 am  Institutional perspective                              Joan Robinson

What data are available? What are critical gaps in the data? What do we need and could it be collected?
11:00 am  Overview                                               Howard Garrison
11:20 am  Commission on Professionals in Science and Technology  Lisa Frehill
11:40 am  Educational Testing Service                            Catherine Millett

What modeling approaches might be of use?
What could modeling contribute to investigating scientific workforce diversity?
1:00 pm   Perspective from a network modeler                     Stephen Eubank
1:20 pm   Perspective from an agent-based modeler                Ross Hammond
1:40 pm   Perspective from a logistic modeler                    John Muckstadt
2:00 pm   Discussion, Conclusions, and Recommendations
4:00 pm   Adjourn

Attachment C: Selected Sources of Data

NSF Survey of Doctorate Recipients
NSF Survey of Earned Doctorates
NSF Survey of Graduate Students and Postdocs in Science and Engineering
NSF National Survey of College Graduates
Occupational Employment Statistics Survey
National Center for Education Statistics: Integrated Postsecondary Education Data System (IPEDS)
National Center for Education Statistics: National Study of Postsecondary Faculty
Nelson Diversity Surveys
Disciplinary Society and Organization Surveys

  • Computing Research Association (Taulbee Survey)
  • American Society for Engineering Education
  • American Mathematical Society
  • American Institute of Physics
  • American Chemical Society
  • Association of American Medical Colleges Faculty Roster

Attachment D: Links to Presentations (no longer available)