When to Use
An evaluation terms of reference (ToR) is the right document when you are commissioning an evaluation — whether internal or external — and need to establish clear expectations for what will be evaluated, why, how, and what outputs are expected. Use it when:
- Commissioning an external evaluation — to solicit proposals from qualified evaluators and serve as the basis for the evaluation contract. The ToR tells potential evaluators what you need and allows them to propose their approach, methodology, and fees.
- Aligning stakeholders before evaluation planning — to ensure all internal stakeholders agree on the evaluation's purpose, scope, and key questions before investing in the evaluation process. A well-developed ToR prevents scope creep and conflicting expectations later.
- Specifying evaluator requirements — to define the competencies, experience, and qualifications required of the evaluation team. This is critical when selecting evaluators through competitive bidding.
- Establishing evaluation timeline and deliverables — to set clear expectations for when the evaluation will occur, what reports and other outputs are expected, and when they are due.
An evaluation ToR is less useful when you are still determining whether an evaluation is needed (conduct an evaluability assessment first) or when you need to develop the evaluation questions and criteria (these should be informed by the ToR development process but may be refined during evaluation planning).
| Scenario | Use Evaluation ToR? | Better Alternative |
|-----|---|---|
| Commissioning external evaluation | Yes | — |
| Still assessing if evaluation is needed | No | Evaluability Assessment |
| Developing evaluation questions | Alongside | Evaluation Questions |
| Selecting internal evaluator | Yes, simplified version | — |
| Post-evaluation learning review | No | Lessons Learned |
How It Works
Developing an evaluation terms of reference follows a structured process. The sequence matters — each section builds on the previous one and informs the next.
1. Define the evaluation purpose and objectives. Start by articulating why this evaluation is being conducted. Is it a compliance requirement? A learning opportunity? An accountability exercise? The purpose drives everything else. Be specific: "To assess the effectiveness of the maternal health programme and identify lessons for scale-up" is actionable; "To evaluate the project" is not. This section should also specify the type of evaluation (formative, summative, mid-term, final, impact, etc.).
2. Specify what will be evaluated. Define the evaluand — the programme, project, or intervention being evaluated. Include the timeframe, geographic scope, target populations, and any specific components or activities that are in or out of scope. This prevents evaluators from making assumptions about what they should examine.
3. Articulate the evaluation questions. These should flow directly from the purpose and align with the OECD/DAC evaluation criteria (relevance, coherence, effectiveness, efficiency, impact, sustainability). The questions are the heart of the ToR — they determine what data will be collected and how it will be analysed. Ensure they are specific enough to guide methodology but flexible enough to allow evaluator innovation.
4. Define the methodology requirements. Specify which methodological approaches are required or preferred. Will the evaluation need a comparative component? Qualitative depth? Quantitative rigor? Participatory approaches? Any constraints on data collection methods? This section should balance specificity with allowing evaluators to propose their optimal approach.
5. Specify evaluator qualifications and team composition. Define the competencies required of the evaluation team. What sector experience is needed? What methodological expertise? Language requirements? Local knowledge? Specify whether an external evaluator is required, whether local partners should be included, and any diversity requirements for the team.
6. Establish the timeline and deliverables. Specify when the evaluation should occur, key milestones, and all expected deliverables (inception report, data collection tools, draft report, final report, presentation, etc.). Include the evaluation deadline and any donor reporting requirements that drive the timeline.
7. Define reporting lines and stakeholder engagement. Specify who the evaluation will report to, who should be consulted during the evaluation, and how stakeholder feedback will be incorporated. This is where you set stakeholder engagement requirements and ensure the evaluation is utilization-focused.
Key Components
A well-constructed evaluation terms of reference includes these essential elements:
- Background and context — a concise description of the programme or project being evaluated, including its objectives, implementation history, and current status. This provides evaluators with the context they need to understand what they are assessing.
- Purpose and objectives — a clear statement of why the evaluation is being conducted and what it aims to achieve. This drives all other sections and ensures the evaluation serves its intended function.
- Scope of the evaluation — explicit definition of what is included and excluded from the evaluation, including geographic scope, time period, programme components, and population groups. Clear scope prevents scope creep and unrealistic expectations.
- Evaluation questions — the specific questions the evaluation will answer, typically organized by evaluation criterion (relevance, coherence, effectiveness, efficiency, impact, sustainability). These are the core of the ToR.
- Methodology requirements — specifications for the evaluation approach, including required methods, sampling requirements, data collection tools, and analysis approaches. This section should specify requirements while allowing evaluator innovation.
- Evaluator qualifications — detailed description of the competencies, experience, and qualifications required of the evaluation team. This forms the basis for evaluator selection.
- Deliverables and timeline — a complete list of expected outputs with deadlines, including inception report, data collection instruments, draft and final reports, presentations, and any other required outputs.
- Reporting and stakeholder engagement — specification of who the evaluation will report to, how stakeholders will be engaged, and how evaluation findings will be disseminated and used.
- Budget and logistics — any budget constraints, logistical support provided, and practical considerations that evaluators need to know when proposing their approach.
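As a quick sanity check before a ToR goes out, the nine components above can be turned into a completeness check on a draft. The sketch below assumes the draft is held as a simple mapping of section name to text; the section names mirror the list above, while the helper function and sample draft are purely illustrative.

```python
# Sketch: completeness check for a draft evaluation ToR.
# Section names follow the nine key components; the draft format is assumed.

REQUIRED_SECTIONS = [
    "background and context",
    "purpose and objectives",
    "scope",
    "evaluation questions",
    "methodology requirements",
    "evaluator qualifications",
    "deliverables and timeline",
    "reporting and stakeholder engagement",
    "budget and logistics",
]

def missing_sections(draft: dict[str, str]) -> list[str]:
    """Return required sections that are absent or still empty in the draft."""
    present = {name.lower().strip() for name, text in draft.items() if text.strip()}
    return [s for s in REQUIRED_SECTIONS if s not in present]

# Hypothetical draft: two sections written, one started but left empty.
draft = {
    "Purpose and objectives": "Assess effectiveness of the maternal health programme.",
    "Scope": "Three districts, 2021-2024, excluding the pilot phase.",
    "Evaluation questions": "",  # heading exists but no content yet
}
gaps = missing_sections(draft)  # lists the seven sections still to be drafted
```

A check like this is most useful as a review-meeting agenda: each gap becomes a discussion item before the ToR is circulated to evaluators.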
Best Practices
Start evaluation planning early. Evaluation planning must specify what will be evaluated, why, by whom, when, and how results will be used. Connect evaluation questions to monitoring data where possible, and begin this process at program start-up, not at the end. (MEAL Rule: EX71_R013)
Develop the ToR before evaluator selection. Evaluation planning should start at the beginning of the program, and the ToR should be developed before selecting evaluators. This ensures you have clarity on your needs before engaging external parties. (MEAL Rule: EX092_R009)
Ensure evaluation questions are well-formed. Evaluation questions should be developed and agreed on at the beginning of evaluation planning. They should be derived from the purpose of the evaluation and aligned with the OECD/DAC criteria. Well-formed questions guide methodology and ensure the evaluation answers what matters. (MEAL Rule: EX092_R006)
Be specific about evaluator competencies. When selecting external evaluators, weigh familiarity with project type, background in evaluation type being carried out, knowledge of local environment, and methodological expertise. Include outside evaluators to ensure healthy distance and objectivity. (MEAL Rule: EX57_P014)
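The competency criteria above can be applied during selection through a weighted scoring matrix. The sketch below is a minimal illustration: the criterion names, equal weights, and 0-5 rating scale are assumptions to adapt to your own ToR, not a prescribed scheme.

```python
# Sketch: weighted scoring of evaluator proposals against ToR criteria.
# Criteria and weights are illustrative; adjust both to the actual ToR.

CRITERIA = {  # criterion -> weight (weights sum to 1.0)
    "project-type familiarity": 0.25,
    "evaluation-type background": 0.25,
    "local knowledge": 0.25,
    "methodological expertise": 0.25,
}

def weighted_score(ratings: dict[str, float]) -> float:
    """Combine per-criterion ratings (0-5 scale) into one weighted score."""
    return sum(CRITERIA[c] * ratings.get(c, 0.0) for c in CRITERIA)

# Hypothetical proposals rated by the selection committee.
proposals = {
    "Consortium A": {"project-type familiarity": 5, "evaluation-type background": 4,
                     "local knowledge": 5, "methodological expertise": 3},
    "Firm B": {"project-type familiarity": 3, "evaluation-type background": 5,
               "local knowledge": 2, "methodological expertise": 5},
}
ranked = sorted(proposals, key=lambda p: weighted_score(proposals[p]), reverse=True)
```

Publishing the criteria and weights in the ToR itself keeps the competitive process transparent and signals to bidders what the commissioning organization values most.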
Include participatory approaches. Evaluation plans should propose participatory evaluation approaches that engage stakeholders throughout the process. This increases ownership and utilization of findings. (MEAL Rule: EX136_R044)
Specify appropriate staffing. The evaluation plan must propose an appropriate staffing plan for its conduct, whether internal or external. Consider the evaluation's complexity, required expertise, and organizational capacity when determining staffing needs. (MEAL Rule: EX136_R047)
Balance specificity with flexibility. The ToR should specify requirements clearly but allow evaluators to propose their optimal methodology. A ToR that is too prescriptive limits evaluator innovation; one that is too vague leads to proposals that don't meet your needs. (MEAL Rule: EX53_P074)
Ensure adequate resources for the scope. An evaluation with wide scope but very limited resources can be worse than no evaluation at all if it leads to superficial conclusions. Ensure the evaluation budget and timeline are sufficient to answer the key questions. Three weeks is typically insufficient for a meaningful evaluation. (MEAL Rule: EX57_W010)
Common Mistakes
Developing the ToR after evaluator selection. The most common failure is selecting evaluators first and then developing the ToR to match their proposals, rather than defining your needs first and then selecting evaluators who can meet them. This reverses the proper sequence and can lead to evaluations that don't answer your questions.
Writing evaluation questions that are too broad or too narrow. Evaluation questions that are too broad ("Was the project successful?") cannot be answered rigorously. Questions that are too narrow miss the bigger picture. Questions should be specific enough to guide data collection but broad enough to capture meaningful insights.
Setting unrealistic timelines. Final project evaluations should be planned at least six months prior to project completion to allow adequate time for data collection, analysis, and reporting. Compressing an evaluation into the final weeks of a project leads to rushed work and superficial findings. (MEAL Rule: EX09_R022)
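The six-month lead time can be made concrete by back-planning from the project close date. The sketch below works backwards from close to the key commissioning milestones; the specific durations (six weeks for evaluator selection, a further six for ToR finalization) are illustrative assumptions, not fixed rules.

```python
# Sketch: back-planning an evaluation timeline from project close, assuming
# the six-month lead recommended above. Milestone durations are illustrative.
from datetime import date, timedelta

def back_plan(project_close: date, lead_months: int = 6) -> dict[str, date]:
    """Work backwards from project close to key commissioning milestones."""
    start = project_close - timedelta(days=lead_months * 30)  # approx. months
    return {
        "evaluation start": start,
        "evaluator selection complete": start - timedelta(weeks=6),
        "ToR finalized": start - timedelta(weeks=12),
    }

milestones = back_plan(date(2026, 12, 31))
```

Note the implication: to start the evaluation six months before close, the ToR needs to be finalized roughly nine months before close, which is why late ToR development so often forces a compressed, superficial evaluation.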
Failing to specify evaluator qualifications. A ToR that doesn't clearly specify required competencies leads to proposals from evaluators who may be methodologically sound but lack sector expertise or local knowledge. This can result in evaluations that miss important contextual factors.
Using the evaluation to resolve internal disputes. Evaluations should not be used to solve internal disputes or mediate between conflicting views about the value or future direction of a project. This undermines the evaluation's objectivity and can lead to politicized findings. (MEAL Rule: EX57_W006)
Scoping beyond available resources. An evaluation scope of work should focus on only a few key questions and include the resources (time, staff, budget) to answer them adequately. Three weeks is typically insufficient for a meaningful evaluation. (MEAL Rule: EX57_R025)
Ignoring utilization from the start. A ToR that doesn't specify how findings will be used, who the primary users are, and how stakeholders will be engaged risks producing an evaluation that sits on a shelf. Ensure the ToR specifies utilization planning as part of the evaluation process.
Examples
Health — Sub-Saharan Africa (Final Evaluation)
A USAID-funded maternal health programme in three countries required a final evaluation six months before project close. The ToR specified: (1) five evaluation questions aligned with OECD/DAC criteria, (2) a mixed-methods design requiring quantitative outcome data and qualitative case studies, (3) evaluator qualifications including maternal health expertise and experience with USAID evaluations, (4) a detailed timeline with inception report (2 weeks), data collection (6 weeks), and final report (2 weeks), and (5) explicit stakeholder engagement requirements including validation workshops in each country. The ToR was developed three months before evaluator selection, allowing for a competitive bidding process that resulted in a consortium of international and local evaluators. The evaluation identified key lessons for scale-up that informed the follow-on programme design.
Governance — Southeast Asia (Mid-term Evaluation)
A governance programme working with civil society organizations commissioned a mid-term evaluation to assess progress and inform adaptive management decisions. The ToR emphasized participatory approaches, requiring stakeholder interviews with government partners, CSO beneficiaries, and donor representatives. It specified that the evaluation should use outcome harvesting as a primary method to capture both planned and unplanned outcomes. The evaluator qualifications required experience with participatory evaluation and governance programming. The timeline included a mid-term validation workshop to ensure findings informed programme adjustments before project close. The evaluation revealed significant unplanned outcomes through informal influence pathways, leading to programme adaptation.
Education — South Asia (Impact Evaluation)
A foundation-funded education programme required a quasi-experimental impact evaluation to assess learning outcomes. The ToR specified rigorous methodology requirements including comparison group design, power analysis for sample size, and longitudinal data collection. It required evaluators to have experience with impact evaluation in education and statistical analysis capabilities. The ToR also specified that the evaluation should include cost-effectiveness analysis. Evaluator selection was based on technical proposals demonstrating methodological expertise, with a separate budget proposal. The resulting evaluation provided robust evidence on programme effectiveness that informed scaling decisions.
Compared To
An evaluation terms of reference is one of several documents used in evaluation planning and management. The key differences:
| Feature | Evaluation ToR | Evaluation Matrix | Inception Report | MEL Plan |
|-----|---|---|---|---|
| Primary purpose | Commission evaluation and select evaluators | Map evaluation questions to data sources | Detail evaluation approach post-selection | Guide overall M&E system |
| When developed | Before evaluator selection | During evaluation planning | After evaluator selection, before data collection | At programme design |
| Audience | Potential evaluators, selection committee | Evaluation team, internal stakeholders | Evaluation team, stakeholders | Programme team, donors |
| Level of detail | Requirements and expectations | Question-to-evidence mapping | Methodological specifics | Comprehensive M&E guidance |
| Ownership | Commissioning organization | Evaluation team (with stakeholder input) | External evaluator | Programme management |
Relevant Indicators
Twelve indicators across five major donor frameworks (USAID, DFID, UNDP, IFRC, Global Fund) relate to evaluation terms of reference development and use. Representative examples include:
- ToR development timing — "Proportion of evaluations with terms of reference developed prior to evaluator selection" (USAID)
- Participatory approaches — "Percentage of evaluation ToRs that include participatory approaches" (UNDP)
- Evaluator competency specification — "Proportion of evaluation ToRs that specify required evaluator competencies" (DFID)
- Timeline adequacy — "Average time between ToR development and evaluation completion" (IFRC)
Related Tools
- Evaluation ToR Template — Guided template for developing comprehensive evaluation terms of reference with built-in checks for key components
- Evaluator Scoring Matrix — Tool for evaluating and comparing evaluator proposals against ToR requirements
Related Topics
- Evaluation Matrix — The document that maps evaluation questions to data sources and methods
- Evaluation Criteria (DAC) — The six OECD/DAC criteria that should structure evaluation questions
- Evaluator Selection — Process for selecting qualified evaluators based on ToR requirements
- Evaluation Planning — The broader process of planning evaluations, of which ToR development is a key component
- Utilization-Focused Evaluation — Approach that emphasizes evaluation use, which should be specified in the ToR
- Evaluation Questions — The specific questions the evaluation will answer, a core component of the ToR
- Inception Report — The evaluator's detailed work plan developed after ToR review and before data collection
Further Reading
- Evaluations — USAID's policy and guidance on evaluation requirements, including ToR development expectations.
- Evaluation Terms of Reference Guide — UNDP's comprehensive guide to developing evaluation ToRs with examples and templates.
- OECD-DAC Evaluation Terms of Reference Guidelines — OECD's guidance on evaluation ToRs aligned with DAC standards.
- BetterEvaluation: Terms of Reference — Collection of ToR templates and guidance from the evaluation community.
Last updated: 2026-02-27