Purpose
Ensure your Terms of Reference will produce an evaluation that changes decisions, not one that gathers dust. A poorly designed ToR guarantees a useless evaluation, regardless of who conducts it. This checklist helps you catch design failures before you spend money.
How to Use
Before commissioning any evaluation, work through this checklist with your ToR draft. If you cannot answer "yes" to the core questions in Section 1, stop. Redesign the ToR before releasing it. An evaluation without decision linkage is an expensive report no one will use.
Section 1: Decision Linkage (Core Gate)
- [ ] Can you name 2-3 specific decisions this evaluation will inform?
- [ ] Are those decisions written into the ToR, not assumed?
- [ ] For each decision, is there a named owner (not "senior management")?
- [ ] Are decision owners aware the evaluation is coming and expecting to use it?
- [ ] Is there a documented date by which each decision must be made?
- [ ] Will evaluation findings arrive at least 4 weeks before decision deadlines?
- [ ] If the evaluation finds X, can you state what will change? What if it finds the opposite?
- [ ] If this evaluation were canceled, would any decision genuinely be harder to make well?
Section 2: Timing Alignment
- [ ] Is the evaluation scheduled to inform decisions at a natural decision point?
- [ ] Are you avoiding the "end-of-project evaluation" that informs nothing because the project is over?
- [ ] For mid-term evaluations, is there genuine willingness to change course based on findings?
- [ ] Have you mapped the reporting chain from evaluation completion to decision meeting?
- [ ] Is the evaluation timeline realistic given procurement, fieldwork, analysis, and review? (See the back-planning sketch after this list.)
- [ ] Have you built in time for decision-makers to read and discuss findings before acting?
- [ ] Is the decision timeline fixed, or will it slip and make the evaluation irrelevant?
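One way to pressure-test this section is to plan backwards from the decision date. The sketch below is a minimal illustration in Python; every date and phase duration in it is an assumption to replace with your own estimates. It subtracts the four-week reading buffer and each phase from the decision deadline to find the latest date the ToR could realistically be released.

```python
from datetime import date, timedelta

# Back-planning sketch: all dates and phase durations are illustrative
# assumptions, not standards. Replace them with your own estimates.
decision_date = date(2026, 3, 2)        # date the decision must be made
reading_buffer = timedelta(weeks=4)     # decision-makers read and discuss findings

phase_weeks = {                         # estimated duration of each phase, in weeks
    "procurement": 6,
    "inception and design": 3,
    "fieldwork": 5,
    "analysis and drafting": 6,
    "review and finalization": 3,
}

findings_deadline = decision_date - reading_buffer
latest_tor_release = findings_deadline - timedelta(weeks=sum(phase_weeks.values()))

print(f"Findings must be final by: {findings_deadline}")
print(f"Latest ToR release date:   {latest_tor_release}")
if latest_tor_release <= date.today():
    print("Timeline fails: redesign the schedule or move the decision point.")
```

If the computed release date has already passed, the honest options are to shrink the scope, move the decision point, or not commission the evaluation at all.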
Section 3: Stakeholder Buy-In
- [ ] Do decision-makers actually want this evaluation, or is it a compliance requirement?
- [ ] If compliance-driven, have you negotiated with the donor to make it useful anyway?
- [ ] Have you consulted implementing partners about evaluation questions?
- [ ] Will field teams see this as helpful learning or as external judgment?
- [ ] Have you assessed whether communities have evaluation fatigue from prior studies?
- [ ] Is there genuine organizational appetite to hear difficult findings?
- [ ] Have you identified who might resist findings and planned how to address resistance?
Section 4: Realistic Scope
- [ ] Does the ToR ask answerable questions, or aspirational ones no methodology can address?
- [ ] Have you limited evaluation questions to 3-5 primary questions (not 12)?
- [ ] For each question, have you verified it can be answered with available data and budget?
- [ ] Is the proposed timeline sufficient for quality fieldwork and analysis?
- [ ] Is the budget sufficient for the scope requested, including adequate field time?
- [ ] Have you avoided asking "impact" questions when the timeframe makes attribution impossible?
- [ ] Have you specified what quality looks like (minimum sample sizes, triangulation requirements)?
Section 5: Question Quality
- [ ] Are evaluation questions phrased to inform decisions, not just describe activities?
- [ ] Have you removed "what happened" questions in favor of "what should change" questions?
- [ ] Do questions focus on genuinely uncertain areas, not things you already know?
- [ ] Are questions specific enough that two evaluators would interpret them the same way?
- [ ] Have you avoided jargon that evaluators might interpret differently than intended?
- [ ] For "effectiveness" questions, have you specified: effective for whom, by what measure?
- [ ] Have you tested questions with potential evaluators to ensure shared understanding?
Section 6: Implementation Pathway
- [ ] Have you specified who will receive the evaluation report?
- [ ] Is there a scheduled meeting where decision-makers will discuss findings?
- [ ] Have you assigned someone to develop a management response?
- [ ] Is there a timeline for implementing recommendations?
- [ ] Will you track whether recommendations are actually implemented?
- [ ] Have you planned how findings will be shared with communities and partners?
- [ ] Is there budget for dissemination and follow-up, not just the evaluation itself?
Section 7: Avoiding Evaluation Theater
- [ ] Are you commissioning this evaluation because you need to learn, or because funders expect it?
- [ ] If funder-driven, have you negotiated to make it decision-relevant anyway?
- [ ] Are you calling this a "learning evaluation" while secretly hoping for validation?
- [ ] Is this evaluation designed to confirm what you already believe?
- [ ] Would you proceed with this evaluation if no one external ever saw the results?
- [ ] Have you been honest about whether the organization is willing to change based on findings?
- [ ] Answering honestly: will this evaluation change anything, or just produce a report?
Red Flags
This checklist helps identify:
- Ghost decisions: "This will inform program improvement" without naming what will improve
- Post-project evaluations: evaluating after all decisions have been made and the project has ended
- Aspirational scope: asking 15 questions on a budget that supports 4
- Compliance theater: commissioning evaluations because donors require them, not because you need them
- Implementation gaps: no plan for how findings become action
- Accountability disguised as learning: calling it "formative" while really wanting a passing grade