Who This Page Is For
You need to commission an evaluation. Maybe it is a final evaluation required by your donor agreement. Maybe it is a mid-term review your team decided to conduct. Either way, you need to write an evaluation TOR that attracts qualified evaluators and gives them enough direction to deliver useful findings.
A weak TOR produces a weak evaluation. If the scope is vague, the evaluator will define it for you, and you may not like their choices. If the evaluation questions are unfocused, you will get a 60-page report that answers nothing you can act on. This guide walks you through each section of a TOR and tells you exactly what to include.
Before You Write: Answer the Use Question
Before drafting a single section, answer this question: Who will act on the findings, and how?
If you cannot answer that, you are not ready to write a TOR. You are commissioning a report that will sit on a shelf. Run the Evaluation Readiness assessment first.
Write down the specific decisions the evaluation will inform. "The country director will use findings to decide whether to scale the livelihood component to three additional districts" is a use statement. "We will learn about program effectiveness" is not.
This use statement shapes everything that follows: the evaluation questions, the methodology, the timeline, the audience for the report. Write it first. Put it in the TOR. Reference it when you review the evaluator's inception report.
The 8 Sections of an Evaluation TOR
1. Background and Context
Provide enough context for an external evaluator to understand the program without reading 200 pages of project documents.
Include:
- Program name, duration, geographic scope, and implementing partners
- Target population and approximate reach
- Total budget and funding source (without naming specific donors if confidentiality applies)
- A 2-3 sentence summary of the program's theory of change
- Key contextual factors the evaluator needs to understand (conflict, COVID disruption, policy changes)
Avoid: Pasting the entire project description from the proposal. The evaluator does not need five pages of background. They need one page that orients them, plus a list of key documents you will share during inception.
Write this: "The program operates in 12 districts across two regions, targeting 15,000 smallholder farming households. It aims to increase household food security through improved agricultural practices, market access, and nutrition behavior change."
Not this: "The project, which commenced in January 2023, is a multi-sectoral integrated development initiative that leverages synergies across agriculture, nutrition, and market systems to address the root causes of food insecurity among vulnerable populations in rural areas."
2. Evaluation Purpose and Use
State why this evaluation is happening and what decisions it will inform. This is where your pre-writing use statement goes.
Include:
- The primary purpose: accountability (proving results), learning (improving the program), or both
- The specific decisions the evaluation will inform
- The primary audience: program team, senior leadership, donor, government partners, beneficiaries
- How findings will be disseminated and used
The test: If you remove this section and the evaluation still makes sense, the section is too generic. It should be impossible to swap this purpose statement into a different program's TOR without rewriting it.
3. Scope and Evaluation Questions
This is the hardest section and the most important one. Get the evaluation questions wrong and nothing else matters.
Scope defines boundaries. Specify:
- Time period covered (full program, last two years, specific phase)
- Geographic coverage (all sites or a sample)
- Thematic coverage (all components or specific ones)
- Evaluation criteria to be addressed (relevance, coherence, effectiveness, efficiency, impact, sustainability)
Evaluation questions define what the evaluator must answer. Rules:
- 3-8 questions maximum. Each question costs money to answer well. Eight questions with a $40,000 budget means $5,000 per question. That is barely enough for one method per question. Prioritize ruthlessly.
- Each question must be answerable. If you ask "What was the program's impact on household food security?" but your budget cannot support a comparison group or counterfactual analysis, the evaluator cannot answer that question credibly. Match questions to methodology and budget.
- Each question must be actionable. Ask yourself: if the evaluator answers this question, will someone do something differently? If not, cut it.
- Organize questions under evaluation criteria. This helps evaluators structure their matrix and helps you confirm coverage.
Write this: "To what extent did the agricultural training component lead to adoption of improved farming practices among targeted households, and what factors enabled or hindered adoption?"
Not this: "What was the impact of the program?"
Use the Evaluation Designer to test whether your questions, methodology, and budget are aligned before finalizing the TOR.
4. Methodology Expectations
Set expectations without designing the evaluation yourself. The evaluator's job is to propose a detailed methodology in the inception report. Your job is to define the boundaries.
Include:
- General approach expected: mixed methods, theory-based, quasi-experimental. See How to Choose Evaluation Methodology.
- Non-negotiable requirements: comparison group, minimum sample size, specific disaggregation (gender, age, disability status), participatory methods
- Data sources available: existing monitoring data, baseline reports, program databases
- Expected data collection methods (surveys, key informant interviews, focus groups, document review)
- Ethical requirements: informed consent, data protection, IRB approval if needed
Avoid: Specifying the exact sample size, sampling strategy, and analytical framework. That is the evaluator's expertise. If you prescribe every detail, you are not hiring an evaluator. You are hiring a data collection firm.
The balance: "We expect a mixed-methods approach combining a household survey (minimum 400 households, disaggregated by gender of household head) with qualitative interviews and focus groups across at least 6 of the 12 program districts." That gives enough direction without designing the study.
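Where does a number like 400 come from? One common reference point is Cochran's sample-size formula for estimating a proportion. A minimal worked version, assuming a 95% confidence level (z = 1.96), maximum variability (p = 0.5), and a ±5% margin of error:

```latex
n = \frac{z^2 \, p(1-p)}{e^2} = \frac{1.96^2 \times 0.5 \times 0.5}{0.05^2} \approx 384
```

That yields roughly 384 households, often rounded up to 400. Treat this as a sanity check, not a prescription: if you require disaggregation (for example, by gender of household head), each subgroup needs enough respondents on its own, and your evaluator should justify the final sample in the inception report.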
5. Deliverables and Timeline
Be specific. Vague timelines produce delayed evaluations.
Standard deliverables:
| Deliverable | Typical Timeline |
|---|---|
| Inception report (detailed methodology, workplan, tools) | 2-3 weeks after contract signing |
| Draft data collection tools | With inception report |
| Fieldwork completion | 3-6 weeks after inception approval |
| Draft evaluation report | 2-3 weeks after fieldwork |
| Presentation of preliminary findings | With or before draft report |
| Final evaluation report | 2 weeks after receiving comments |
| Clean datasets and codebooks | With final report |
| Management response template | With final report |
Include: Total calendar duration from contract to final report. The typical timelines above already sum to roughly 9-14 weeks before any review time. Build in time for your team to review and comment on the inception report and draft report. Those review periods often take longer than the evaluator's work.
Avoid: Setting a timeline that makes quality impossible. A final evaluation with primary data collection in under 8 weeks total is unrealistic. If the donor deadline forces a compressed timeline, say so explicitly and adjust scope accordingly.
6. Evaluator Qualifications
Specify what you need. Be concrete.
Required qualifications (typical):
- Advanced degree in a relevant field
- Minimum years of evaluation experience (7-10 for team lead)
- Experience with the specific methodology you expect
- Sector expertise (food security, health, education, governance)
- Regional or country experience
- Language requirements (for the team, not necessarily the lead)
Selection criteria and weighting:
| Criterion | Suggested Weight |
|---|---|
| Technical approach and methodology | 30-40% |
| Team qualifications and experience | 25-30% |
| Understanding of context and evaluation questions | 15-20% |
| Budget and value for money | 15-20% |
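To see how the weighting works in practice, here is a minimal scoring sketch in Python. The weights are illustrative midpoints from the table above, and the proposal scores are entirely hypothetical; substitute your own criteria and numbers.

```python
# Minimal proposal-scoring sketch. Weights are illustrative midpoints
# from the table above; per-criterion scores (0-100) are hypothetical.
weights = {
    "technical_approach": 0.35,
    "team_experience": 0.275,
    "context_understanding": 0.175,
    "value_for_money": 0.20,
}
assert abs(sum(weights.values()) - 1.0) < 1e-9  # weights must total 100%

proposals = {
    "Firm A": {"technical_approach": 80, "team_experience": 90,
               "context_understanding": 70, "value_for_money": 60},
    "Firm B": {"technical_approach": 70, "team_experience": 75,
               "context_understanding": 85, "value_for_money": 90},
}

for name, scores in proposals.items():
    total = sum(weights[c] * scores[c] for c in weights)
    print(f"{name}: {total:.1f} / 100")
```

Publishing the weights in the TOR tells bidders where to put their effort and makes your selection decision defensible if it is challenged.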
Avoid: Requiring 15 years of experience in the exact sub-sector in the exact country. You will eliminate strong candidates and end up with one applicant. Require core skills and weight relevant experience in the scoring.
7. Budget
You have two options: state a fixed budget ceiling or ask evaluators to propose their own. Each has tradeoffs.
Fixed ceiling: Evaluators design within your constraints. You get comparable proposals. Risk: evaluators cut corners to fit the budget.
Open budget: You see what a quality evaluation actually costs. Risk: proposals range from $20,000 to $200,000 and you cannot compare them.
Recommended approach: State a budget range. "The available budget for this evaluation is $40,000-55,000, inclusive of all costs." This anchors proposals and still allows variation.
Include what the budget must cover: evaluator fees, travel, data collection (enumerators, translations, logistics), report production, and any specific costs like IRB fees.
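As a gut check before you publish the range, rough out whether a credible evaluation actually fits inside it. The sketch below is illustrative only; every figure is hypothetical, not a benchmark.

```python
# Hypothetical cost roughing for a $40,000-55,000 all-inclusive budget.
# All figures are invented for illustration; substitute local rates.
line_items = {
    "team_lead_fees": 18_000,            # e.g., 30 days at a $600 day rate
    "national_consultant_fees": 8_000,
    "enumerators_and_field_logistics": 12_000,
    "travel": 5_000,
    "translation_and_transcription": 2_500,
    "report_production": 1_500,
    "irb_or_ethics_fees": 1_000,
}
total = sum(line_items.values())
print(f"Estimated total: ${total:,}")   # $48,000, inside the stated range
```

If your own roughing lands above the ceiling, cut scope (fewer districts, fewer questions) before you publish the TOR, not after proposals arrive.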
8. Management and Ethics
Management arrangements:
- Who manages the evaluation (M&E Manager, program director, steering committee)?
- Who is the primary point of contact?
- What documents and data will be shared with the evaluator?
- How will comments on deliverables be consolidated (one document, not 10 separate emails)?
- Who approves the inception report, draft report, and final report?
Ethical requirements:
- Informed consent protocols
- Data confidentiality and storage
- Do-no-harm considerations, especially for vulnerable populations
- Safeguarding requirements
- IRB or ethics committee approval if required by your organization or donor
Do not skip ethics. If your organization has a research ethics policy, reference it. If it does not, state the minimum standards you expect.
Common Mistakes
Mistake 1: Writing evaluation questions after the methodology is decided. The questions come first. The methodology serves the questions. If you start with "we want a quasi-experimental evaluation" and then write questions to justify it, you will end up with questions shaped by the method rather than by what you need to know.
Mistake 2: Asking too many evaluation questions. Every question you add dilutes the depth of every other question. With a $50,000 budget and 10 questions, you are paying $5,000 per question. That buys a shallow answer. Five focused questions at $10,000 each buy you useful evidence.
Mistake 3: No use plan. If the TOR does not explain who will use the findings and how, the evaluation becomes an expensive compliance exercise. Write the use plan before you write the evaluation questions. The questions should generate the evidence the decision-makers need.
Mistake 4: Timelines that make quality impossible. Asking for a final evaluation with primary data collection, analysis, and reporting in 4 weeks is asking for a bad evaluation. Build in adequate time for inception, fieldwork, analysis, drafting, review, and revision. If the timeline is non-negotiable, reduce the scope to match.
Mistake 5: Specifying every methodological detail. If you prescribe the sample size, sampling frame, analytical framework, and interview guides, you are not commissioning an evaluation. You are hiring data collectors. Set expectations and constraints. Let the evaluator propose the design. Evaluate their proposal on its technical merit.
Mistake 6: Forgetting the inception report. The inception report is where the evaluator translates your TOR into a concrete plan. If your TOR does not require one, the evaluator will go straight to fieldwork using whatever interpretation of your questions they prefer. Always require an inception report with a review and approval step before fieldwork begins.
TOR Checklist
Use this before you circulate the TOR.
- Background is concise (1 page max) and includes program theory of change summary
- Evaluation purpose states specific decisions the findings will inform
- Primary audience is named
- Scope boundaries are explicit (time period, geography, thematic areas, evaluation criteria)
- Evaluation questions are 3-8 in number, each answerable and actionable
- Methodology section sets expectations without prescribing the full design
- Non-negotiable requirements are clearly flagged (comparison group, disaggregation, sample minimums)
- Deliverables list includes inception report with approval gate before fieldwork
- Timeline is realistic (minimum 8-12 weeks for evaluations with primary data collection)
- Evaluator qualifications are specific but not exclusionary
- Selection criteria and weighting are stated
- Budget range is provided or ceiling is stated
- Ethics and safeguarding requirements are included
- Key documents list is attached (what the evaluator will receive)
- Review and approval process for each deliverable is described