Evaluation Resources Frequently Asked Questions
The information below is meant to provide guidance for those thinking about evaluation and is based on questions received from the training community. It is not intended to apply to every type of evaluation need, and it is not a comprehensive list of all questions related to evaluation. Rather, it can be a starting point for those considering different elements of evaluation. Additional information will be added periodically. Relevant resources are listed after some questions; these are free resources unless otherwise noted. Please contact us if you have any other questions. Visit our Evaluation Resources page for a listing of more resources.
NIGMS did not create the content on the external sites linked below, is not responsible for their availability or content, and does not endorse, warrant, or guarantee the products, services, or information described or offered at these sites.
Survey Development
What are different types of evaluation?
Evaluation determines the extent to which a program has achieved its goals or outcomes. It may use an assessment as a tool to measure aspects of the evaluation or include research questions that the team wants to answer. There are several types of evaluation, and choosing the best method for your program requires understanding the differences between them.
Is there a "sweet spot" for the number of survey questions to get optimal response rates?
Survey length varies depending on the needs of the program or evaluation; however, surveys should not be longer than necessary, to reduce the burden on participants and to increase the likelihood of completion.
In addition to considering the number of questions to include in a survey, consider how long the survey will take to complete. For example, a survey with many straightforward, binary-type questions may take less time to complete than one with fewer multiple-choice or short-answer questions. An overly lengthy survey may lead to lower response rates. The following resources may be of use in constructing survey questions.
Relevant Resources:
- (Paid/subscription-based resource) Designing Quality Survey Questions, by Sheila Robinson. (2024). Los Angeles: SAGE.
- Organizational Research: Determining Appropriate Sample Size in Survey Research [PDF]. Bartlett JE II, Kotrlik JW, Higgins CC. (2001). Information Technology, Learning, and Performance Journal, Vol. 19, No. 1.
- Six Rules of Thumb for Determining Sample Size and Statistical Power [PDF]. The Abdul Latif Jameel Poverty Action Lab (J-PAL). (2018).
Should incentives be provided for survey completion?
Programs must consider institutional review board (IRB) protocols when designing incentives, as some IRB protocols include guidelines on survey incentives.
Response rates may be improved with incentives; however, offering large incentives could be considered coercive. Generally, it is important to acknowledge participants for the time they spend completing the survey while not making them feel obligated to participate because of monetary rewards they would not otherwise have received.
Relevant Resource:
- Incentives for Survey Participation: When Are They "Coercive"? Singer E, Bossarte RM. (2006). American Journal of Preventive Medicine, 31(5), 411–418. https://doi.org/10.1016/j.amepre.2006.07.013
How should evaluation teams prepare for an open response survey?
Open-response, or open-ended, questions on a survey allow respondents to explain their answers, provide examples, and expand on their thinking. Such questions should be clear in wording and design. Survey developers should avoid multi-part questions, which can complicate analysis, and instead keep each question to a single topic.
Survey developers should also consider the additional time that will be needed to analyze open-response surveys. Responses can be qualitatively coded either manually or using licensed software such as NVivo or ATLAS.ti. It is helpful to develop a codebook in which codes are linked to the various metrics and constructs of a program. Open-response questions can unveil prominent themes and topics.
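As a minimal sketch, the example below tallies manually assigned codes against a simple codebook to surface prominent themes. The codes, constructs, and responses are hypothetical; in practice, coding is typically done by trained analysts, with or without software such as NVivo or ATLAS.ti.

```python
# Illustrative only: tallying manually assigned qualitative codes against a
# codebook. All codes, constructs, and responses below are hypothetical.
from collections import Counter

# Codebook linking each code to the program construct it measures
codebook = {
    "MENTOR_ACCESS": "Mentoring",
    "MENTOR_FIT": "Mentoring",
    "SKILL_WRITING": "Professional development",
    "SKILL_PRESENTING": "Professional development",
}

# Each open response with the codes an analyst assigned to it
coded_responses = [
    {"response": "My mentor met with me weekly.", "codes": ["MENTOR_ACCESS"]},
    {"response": "The writing workshop helped my specific aims.", "codes": ["SKILL_WRITING"]},
    {"response": "I'd like a mentor closer to my research area.", "codes": ["MENTOR_FIT"]},
]

# Tally how often each construct appears to surface prominent themes
construct_counts = Counter(
    codebook[code]
    for item in coded_responses
    for code in item["codes"]
)
for construct, count in construct_counts.most_common():
    print(f"{construct}: {count}")
```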
Sample Sizes
How do you work with small sample sizes?
Small sample sizes may create difficulties and limitations in program evaluation efforts. Accounting for a small sample size during the planning phase can inform statistical analyses, reporting, and overall approaches to addressing imbalance. Before implementing a survey, consider the minimum sample size needed to answer the questions of interest, and determine whether the survey population will yield the necessary number of responses.
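As an illustration only, the sketch below shows one common way to estimate the minimum number of responses needed when estimating a proportion: Cochran's formula with a finite population correction, an approach discussed in sample size resources such as Bartlett et al. (2001). The population size, confidence level, and margin of error shown are hypothetical and should be replaced with values appropriate for your program.

```python
# Illustrative only: estimating a minimum survey sample size with Cochran's
# formula and a finite population correction. All numbers are hypothetical.
import math

def minimum_sample_size(population: int, margin_of_error: float = 0.05,
                        z_score: float = 1.96, proportion: float = 0.5) -> int:
    """Return the minimum sample size for estimating a proportion.

    population      -- total number of people eligible to take the survey
    margin_of_error -- acceptable error (e.g., 0.05 for +/-5%)
    z_score         -- z value for the desired confidence level (1.96 ~ 95%)
    proportion      -- expected proportion; 0.5 is the most conservative choice
    """
    # Cochran's formula for an effectively infinite population
    n0 = (z_score ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    # Finite population correction for small trainee cohorts
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

# Example: a hypothetical program with 120 eligible trainees and a +/-5% margin of error
print(minimum_sample_size(population=120))  # roughly 92 responses needed
```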
How do you work with large sample sizes?
Although a larger sample size may more accurately reflect the intended population, larger samples may also require more data cleaning and stratification to identify the criteria of interest.
Personnel
Can an external evaluator be hired?
Hiring an external evaluator may streamline the evaluation effort. Consider the costs of hiring an evaluator, and what may be allowable through your grant funds versus from university or institutional resources. Evaluation is considered an "allowable cost" for many grants, and funds within budget may be used to defray the cost of the evaluation. Institutional support is expected to contribute toward the cost of evaluation. Please consult the relevant notice of funding opportunity for guidelines and contact your grants management specialist and/or program director with questions.
Programs can be very complex. How can an external evaluator understand the nuances?
To conduct an evaluation, it is important that the evaluator has knowledge of the program goals, allowing for an effective mapping of the evaluation to the goals and topics of importance. The program goals should be clearly defined and measurable to facilitate their use in an evaluation. Program teams are expected to meet with the evaluator(s) to discuss goals and to ensure the evaluator has a thorough understanding of the program. It is imperative that the evaluator/evaluation team protect sensitive information.
Must an external evaluator be from outside the program's institution?
An evaluator external to the program may bring greater expertise and less bias to an evaluation; however, it is not always necessary to hire an external evaluator. Program teams may consider working with evaluators internal to their institution who are not involved in implementing the program; such evaluators are "external" to the program while remaining local to the institutional environment.
Programs should also consider costs of working with external evaluators and talk to their grants management specialists and/or program directors with questions.
Can evaluation be done entirely on-site?
Formative evaluations can be done on-site. Conducting a formative evaluation is a way to gather data that can be used, for example, when applying for grants, conducting an institutional self-assessment, or refining aspects of the program. However, working with external evaluators* to assess program outcomes helps lend independence to the findings and avoid potential bias (and the appearance of bias) in evaluations.
*This can be a person who is external to the program being evaluated while still being on-site. For example, staff from a different college at the same institution.
Rubrics and Metrics
Is there a repository of tools (e.g., research designs and rubrics) that program implementers can use as a resource and customize for specific goals, rather than having to develop these tools themselves?
NIGMS does not provide rubrics for evaluation because every program is unique in its goals and implementation techniques. However, a wide variety of resources and validated instruments exist that programs can use, if appropriate. The measures and rubrics needed to aid the evaluation will vary depending on the goals of the program.
General evaluation tools can be found on the Better Evaluation website.
What are factors to consider when developing a scale?
There are several factors to consider when designing a survey, and selecting appropriate survey scales is an important part of the process. Different measures will require different scales. A common scale for use in surveys is the Likert scale. When possible, consider using validated measures and their associated scales. Consistent scales throughout a survey may lead to more straightforward analyses. When constructing your own measures, discuss the scales and different options (e.g., including an N/A or neutral option, scales with odd numbers of responses) with your evaluator/evaluation team. The resources below outline considerations for various point scales, and a brief illustrative sketch follows them.
- Question and Questionnaire Design [PDF].
- Likert-Type Scale Items [PDF]
- The Impact of "No Opinion" Response Options on Data Quality: Non-Attitude Reduction or an Invitation to Satisfice?
https://academic.oup.com/poq/article/66/3/371/1836194
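The sketch below is illustrative only. It assumes a hypothetical 5-point Likert item with an N/A option and shows one way to keep the scale consistent across items and to exclude N/A responses when summarizing; the item text and data are invented.

```python
# Illustrative only: a consistent 5-point Likert scale with an N/A option.
# The item wording and responses below are hypothetical.
from statistics import mean

# One consistent 5-point scale used for every item in the survey
SCALE = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neither agree nor disagree": 3,
    "Agree": 4,
    "Strongly agree": 5,
    "Not applicable": None,  # N/A is recorded but excluded from scoring
}

# Hypothetical responses to the item "The mentoring I received met my needs."
responses = ["Agree", "Strongly agree", "Not applicable",
             "Neither agree nor disagree", "Agree"]

# Convert labels to scores, dropping N/A responses before summarizing
scores = [SCALE[r] for r in responses if SCALE[r] is not None]
print(f"n = {len(scores)}, mean = {mean(scores):.2f}")
```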
Program Related/NIH
Is the goal of program evaluation to compare one's own program with others, or is the goal to compare outcomes within one's program or organization?
The goals and associated evaluations of each program vary. Each program is unique and creates training and mentoring activities for different populations; thus, direct comparison between programs may be problematic.
Evaluations help program directors and institutions measure progress toward their goals, as well as find potential areas for improvement. In addition, including evaluation data is useful for reporting and when applying for future grant awards.
Data from evaluations may also be used in manuscripts if program staff are interested in writing about their outcomes, recommended practices, or novel program ideas.
Note: programs must receive the proper clearance through their institutional review board (IRB) for evaluations, particularly for those that may result in the public release of data.
Is it critical to use common measures?
It is important to use common measures when comparing across cohorts and years.
Common measures can prove useful if an evaluator is interested in looking at longitudinal or cross-site analyses. Using the same measures over time can help to measure progress in a more standard way. However, evaluators should not feel the need to maintain common measures if the measures are outdated or no longer relevant (e.g., a question about a seminar that is no longer offered).
Developing common measures takes time and discussion between the evaluation team and the implementation team and should be undertaken before evaluation begins.
Is there a way to ensure that data can be used, shared, and compared across institutions?
There are many factors to consider when planning to share data across institutions. The team should first become familiar with the institutional review board (IRB) policies at their institution and at any partner/collaborating institutions with whom they want to share or compare data. Depending on local regulations and standards, the teams may be able to apply for a blanket IRB agreement among the participating institutions.
An NIH-funded study being conducted at more than one U.S. site involving non-exempt human subjects research may be subject to the NIH Single IRB policy and/or the revised Common Rule (rCR) cooperative research provision (§46.114). For more information, visit: https://grants.nih.gov/policy/humansubjects/single-irb-policy-multi-site-research.htm
If multiple institutions plan to share data, the teams should develop guidelines for proper use, storage, and access to the shared data through a Data Sharing Agreement (or similar document).
Relevant Resources:
- Single IRB for Multi-Site or Cooperative Research: https://grants.nih.gov/policy/humansubjects/single-irb-policy-multi-site-research.htm
Can NIH provide guidance about what to measure, and what agency expectations for evaluation exist?
NIGMS does not provide guidelines or rubrics for use in evaluations, both because all programs are unique and to encourage creativity and independence in implementation. Programs should develop their evaluations based on their own needs and interests. Program goals can be used to guide evaluation questions. Some examples of evaluation standards, effective practices, and measures can be found in the resources on this page; however, these resources were developed by outside sources and are not endorsed by NIGMS.
Does NIH consider training program evaluation a form of Human Subjects Research?
No. Training grants prepare individuals for careers in the biomedical research workforce by developing and implementing evidence-informed educational practices, including didactic, research, mentoring, and career development elements. While funded programs are expected to conduct ongoing program evaluations and assessments to monitor the effectiveness of the training and mentoring activities, training grant funds are not intended to support Human Subjects Research (see additional information on Human Subjects Research from NIH and HHS).
If an investigator wishes to conduct Human Subjects Research involving the trainees supported by the training program as research study participants, they must follow appropriate institutional policies (e.g., obtaining IRB approvals, consenting study participants).
Applicants are encouraged to reach out to the Scientific/Research Contact listed in the funding announcement if they have any questions.
Are there consistent taxonomy examples for alumni career outcomes?
Taxonomies of trainee pathways may vary depending on the population and reporting needs. Evaluators can reference literature in their field to learn taxonomy standards. It is suggested that evaluation teams define the terms they will use at the beginning of the evaluation and that they apply a clear and consistent taxonomy throughout survey administration and data analysis; an illustrative sketch follows the resource below.
Relevant Resources:
- Evolution of a Functional Taxonomy of Career Pathways for Biomedical Trainees. Mathur A, Brandt P, Chalkley R, Daniel L, Labosky P, Stayart C, Meyers F. Journal of Clinical and Translational Science. 2018 Apr;2(2):63-65. https://doi.org/10.1017/cts.2018.22
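The sketch below is a minimal illustration of applying a consistent career-outcome taxonomy. The sectors, categories, and job titles are hypothetical; programs would define their own terms (or adapt a published taxonomy such as Mathur et al., 2018) before analysis.

```python
# Illustrative only: mapping free-text job titles reported by alumni onto a
# consistent (hypothetical) career-outcome taxonomy, then tallying by sector.
from collections import Counter

# Mapping from normalized job titles to (sector, career type) categories
TAXONOMY = {
    "assistant professor": ("Academia", "Faculty"),
    "postdoctoral fellow": ("Academia", "Postdoctoral training"),
    "research scientist": ("For-profit", "Research"),
    "regulatory affairs specialist": ("Government", "Science-related, non-research"),
}

# Hypothetical alumni-reported titles
alumni_titles = [
    "Postdoctoral Fellow",
    "Assistant Professor",
    "Research Scientist",
    "Postdoctoral Fellow",
]

# Normalize titles and tally outcomes by sector so terms stay consistent
outcomes = Counter(TAXONOMY[title.lower()][0] for title in alumni_titles)
print(outcomes)  # Counter({'Academia': 3, 'For-profit': 1})
```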
Are there ways to address a lack of alignment between the proposal and what comes out in the assessment?
Sometimes, external forces necessitate changes to proposed plans, and program teams can address these changes thoughtfully. Changes to implementation and unexpected training outcomes can be described in progress reports and proposal renewals. A well-planned evaluation and subsequent analysis can help to determine why outcomes differ from expectations. If program goals are not being met, a well-designed evaluation should help to determine where to make refinements.
Are there universal assessment guidelines or frameworks that can be customized?
NIGMS does not provide guidelines or rubrics for use in evaluations because each program is unique in terms of goals, context, student populations, etc. Teams interested in developing an evaluation can reference existing evaluation tools, such as those on this site, as examples. The examples included on this site are not endorsed by NIGMS; rather, they are provided as references and resources that teams can use when developing their evaluations.