Evaluation ToR Quality

AI Prompt Templates

Copy a prompt into Claude, ChatGPT, or Gemini. Paste your document at the bottom and run.

Paste a document and get a scored quality assessment with evidence and revision priorities.


You are an expert monitoring and evaluation (M&E) specialist. Score the evaluation Terms of Reference (ToR) I will provide using the rubric below.

SCORING RUBRIC - Evaluation ToR Quality
Score each dimension 1-5 using these criteria:

DIMENSION 1: Background and Scope Clarity
- Score 5: All four elements present and well-developed. Program described specifically (what, where, when, target population, scale), evaluation purpose explicit (formative, summative, or mixed; at what decision point), intended users identified (named roles or organizations), decision context articulated (what decisions will be informed by findings).
- Score 4: At least three of four elements present. May lack named decision context or fully named users.
- Score 3: At least two of four elements present. Program description and purpose stated, but users or decision context generic.
- Score 2: One element present. Background reads as boilerplate without specific scope or users.
- Score 1: No clear background or scope. ToR jumps straight to tasks without context.

DIMENSION 2: Evaluation Questions and Criteria
- Score 5: All four elements present. Evaluation questions are specific and answerable with the proposed methods, each question is mapped to one or more evaluation criteria (relevance, coherence, effectiveness, efficiency, impact, sustainability, or alternatives), sub-questions identified for each main question, and the user or decision-maker for each question is named.
- Score 4: At least three of four elements present. Questions are answerable and mapped to criteria; sub-questions or named users partially present.
- Score 3: At least two of four elements present. Questions present but criteria mapping is implicit. No sub-questions or named users.
- Score 2: Questions are vague ("How successful was the program?") or unanswerable with the proposed methods. No criteria mapping. No sub-questions.
- Score 1: No evaluation questions, OR questions are tasks rather than questions ("Conduct an evaluation of...").

DIMENSION 3: Methodology Specification
- Score 5: All four elements present. Design type named (e.g., quasi-experimental, mixed-methods case study, contribution analysis) with rationale tied to evaluation questions, sampling expectations specified (probability or purposive, approximate size or range), primary and secondary data sources listed, analysis approach described (e.g., mixed-methods integration plan, qualitative coding, statistical methods).
- Score 4: At least three of four elements present. Design and approach named; sampling or analysis briefly described.
- Score 3: At least two of four elements present. Design type named but rationale generic. Sampling or analysis vague.
- Score 2: Methodology described as "mixed methods" or similar without specification. No design rationale.
- Score 1: No methodology specified, OR ToR specifies methods that cannot answer the evaluation questions.

DIMENSION 4: Deliverables, Timeline, and Budget Realism
- Score 5: All five elements present. Deliverables listed with content specifications (e.g., inception report, draft report, final report, presentation), milestones tied to specific dates or weeks, total days and cost or budget envelope stated, payment schedule linked to deliverables, communication plan with frequency and format.
- Score 4: At least four of five elements present. Deliverables and milestones clear; budget or payment schedule may be partial.
- Score 3: At least three of five elements present. Deliverables and timeline present but budget or payment schedule absent. No communication plan.
- Score 2: One or two of five elements present. Deliverables listed but with no milestones, budget, or schedule.
- Score 1: No specified deliverables, timeline, or budget. Bidders cannot reasonably prepare a proposal against the ToR.

DIMENSION 5: Ethics and Stakeholder Engagement Plan
- Score 5: All four elements present. Consent and safeguarding requirements specified for all data subjects, vulnerable population protections defined (children, conflict-affected, marginalized groups), stakeholder validation plan included (when findings will be reviewed by program staff, beneficiaries, or partners), dissemination plan addresses post-evaluation communication.
- Score 4: At least three of four elements present. Consent and validation present; vulnerable population protections or dissemination plan partial.
- Score 3: At least two of four elements present. Consent referenced and one other element present. No detailed safeguarding or validation.
- Score 2: Ethics referenced as a checkbox ("evaluators will follow ethical standards") without specification. No validation or dissemination plan.
- Score 1: No ethics or stakeholder engagement provisions.

OUTPUT FORMAT:
Return your assessment as a table followed by a summary:

| Dimension | Score (1-5) | Evidence from ToR | Priority Revision |
|-----------|-------------|-------------------|-------------------|
| Background and Scope Clarity | | | |
| Evaluation Questions and Criteria | | | |
| Methodology Specification | | | |
| Deliverables, Timeline, and Budget Realism | | | |
| Ethics and Stakeholder Engagement Plan | | | |

**Total: X/25**
**Band:** Strong (22-25) / Adequate (17-21) / Needs Revision (11-16) / Substantial Revision (5-10)
**Single Most Important Revision:** [One specific sentence]
**Procurement Risk:** [None / Minor clarifications needed / Recommend revision before issuing / Do not issue without major revision]

For any dimension scored 1 or 2, add a brief explanation and a concrete revision example.

EVALUATION ToR TO SCORE:
[Paste your evaluation Terms of Reference here]

Scoring Criteria

Background and Scope Clarity

5 - Excellent

All four elements present and well-developed. Program described specifically (what, where, when, target population, scale), evaluation purpose explicit (formative, summative, or mixed; at what decision point), intended users identified (named roles or organizations), decision context articulated (what decisions will be informed by findings).

4 - Good

At least three of four elements present. May lack named decision context or fully named users.

3 - Adequate

At least two of four elements present. Program description and purpose stated, but users or decision context generic.

2 - Needs Improvement

One element present. Background reads as boilerplate without specific scope or users.

1 - Inadequate

No clear background or scope. ToR jumps straight to tasks without context.

Evaluation Questions and Criteria

5 - Excellent

All four elements present. Evaluation questions are specific and answerable with the proposed methods, each question is mapped to one or more evaluation criteria (relevance, coherence, effectiveness, efficiency, impact, sustainability, or alternatives), sub-questions identified for each main question, and the user or decision-maker for each question is named.

4 - Good

At least three of four elements present. Questions are answerable and mapped to criteria; sub-questions or named users partially present.

3 - Adequate

At least two of four elements present. Questions present but criteria mapping is implicit. No sub-questions or named users.

2 - Needs Improvement

Questions are vague ("How successful was the program?") or unanswerable with the proposed methods. No criteria mapping. No sub-questions.

1 - Inadequate

No evaluation questions, OR questions are tasks rather than questions ("Conduct an evaluation of...").

Methodology Specification

5 - Excellent

All four elements present. Design type named (e.g., quasi-experimental, mixed-methods case study, contribution analysis) with rationale tied to evaluation questions, sampling expectations specified (probability or purposive, approximate size or range), primary and secondary data sources listed, analysis approach described (e.g., mixed-methods integration plan, qualitative coding, statistical methods).

4 - Good

At least three of four elements present. Design and approach named; sampling or analysis briefly described.

3 - Adequate

At least two of four elements present. Design type named but rationale generic. Sampling or analysis vague.

2 - Needs Improvement

Methodology described as "mixed methods" or similar without specification. No design rationale.

1 - Inadequate

No methodology specified, OR ToR specifies methods that cannot answer the evaluation questions.

Deliverables, Timeline, and Budget Realism

5 - Excellent

All five elements present. Deliverables listed with content specifications (e.g., inception report, draft report, final report, presentation), milestones tied to specific dates or weeks, total days and cost or budget envelope stated, payment schedule linked to deliverables, communication plan with frequency and format.

4 - Good

At least four of five elements present. Deliverables and milestones clear; budget or payment schedule may be partial.

3 - Adequate

At least three of five elements present. Deliverables and timeline present but budget or payment schedule absent. No communication plan.

2 - Needs Improvement

One or two of five elements present. Deliverables listed but with no milestones, budget, or schedule.

1 - Inadequate

No specified deliverables, timeline, or budget. Bidders cannot reasonably prepare a proposal against the ToR.

Ethics and Stakeholder Engagement Plan

5 - Excellent

All four elements present. Consent and safeguarding requirements specified for all data subjects, vulnerable population protections defined (children, conflict-affected, marginalized groups), stakeholder validation plan included (when findings will be reviewed by program staff, beneficiaries, or partners), dissemination plan addresses post-evaluation communication.

4 - Good

At least three of four elements present. Consent and validation present; vulnerable population protections or dissemination plan partial.

3 - Adequate

At least two of four elements present. Consent referenced and one other element present. No detailed safeguarding or validation.

2 - Needs Improvement

Ethics referenced as a checkbox ("evaluators will follow ethical standards") without specification. No validation or dissemination plan.

1 - Inadequate

No ethics or stakeholder engagement provisions.

Score Interpretation

| Total (out of 25) | Band | Next Step |
|-------------------|------|-----------|
| 22-25 | Strong | ToR is procurement-ready. Minor refinements only. |
| 17-21 | Adequate | Address flagged dimensions before issuing for bids. |
| 11-16 | Needs Revision | Substantial revision required before procurement. Use the Revise prompt, with the AI output as the revision brief. |
| 5-10 | Substantial Revision | Do not issue. Return to evaluation team for full redraft. |
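
If the totals are tallied in a script rather than by hand, the band lookup in the table above could be sketched as follows. This is a minimal illustration, not part of the prompt: the names `BANDS` and `band_for_total` are hypothetical, and the thresholds are copied from the table.

```python
# Lower bound of each band, highest first, copied from the Score Interpretation table.
BANDS = [
    (22, "Strong"),
    (17, "Adequate"),
    (11, "Needs Revision"),
    (5, "Substantial Revision"),
]


def band_for_total(total: int) -> str:
    """Return the band name for a total score between 5 and 25 inclusive."""
    if not 5 <= total <= 25:
        raise ValueError(f"total must be between 5 and 25, got {total}")
    # The first lower bound that the total reaches determines the band.
    for lower_bound, band in BANDS:
        if total >= lower_bound:
            return band
    raise AssertionError("unreachable: every valid total falls in a band")


print(band_for_total(19))
```

Totals below 5 or above 25 raise an error rather than being silently clamped, since five dimensions scored 1 to 5 cannot produce them.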

Scoring Dimensions

1. Background and Scope Clarity
   Whether the program is described specifically, the evaluation purpose is explicit, intended users are named, and the decision context is articulated.
2. Evaluation Questions and Criteria
   Whether evaluation questions are specific and answerable, mapped to evaluation criteria, broken into sub-questions, and tied to the users who need each answer.
3. Methodology Specification
   Whether the design type is named with rationale, sampling expectations are specified, data sources are listed, and the analysis approach is described.
4. Deliverables, Timeline, and Budget Realism
   Whether deliverables, milestones, total cost, payment schedule, and communication plan are specified at a level of detail that allows credible bidding.
5. Ethics and Stakeholder Engagement Plan
   Whether consent and safeguarding requirements are specified, vulnerable population protections are defined, validation with stakeholders is planned, and dissemination is addressed.

Each dimension scored 1–5. Maximum score: 25.
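
Because an AI model may miscount the total it reports, it can be worth recomputing it from the table rows. The sketch below assumes the reply contains the Markdown table requested in the prompt's output format, with the dimension names spelled exactly as in the rubric. The function `parse_scores` and the sample reply are hypothetical; the sample scores are invented for illustration only.

```python
import re

# Dimension names exactly as they appear in the rubric.
DIMENSIONS = [
    "Background and Scope Clarity",
    "Evaluation Questions and Criteria",
    "Methodology Specification",
    "Deliverables, Timeline, and Budget Realism",
    "Ethics and Stakeholder Engagement Plan",
]


def parse_scores(reply: str) -> dict[str, int]:
    """Extract one 1-5 score per dimension from a Markdown table in `reply`."""
    scores = {}
    for line in reply.splitlines():
        # Split a table row into cells, ignoring the outer pipes.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) < 2 or cells[0] not in DIMENSIONS:
            continue  # header, separator, or non-table line
        # Accept "4" or "4/5"; anything else is treated as unreadable.
        match = re.fullmatch(r"([1-5])(?:\s*/\s*5)?", cells[1])
        if match is None:
            raise ValueError(f"unreadable score for {cells[0]!r}: {cells[1]!r}")
        scores[cells[0]] = int(match.group(1))
    missing = [name for name in DIMENSIONS if name not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    return scores


# Invented sample reply; evidence and revision cells are elided.
sample_reply = """\
| Dimension | Score (1-5) | Evidence from ToR | Priority Revision |
|-----------|-------------|-------------------|-------------------|
| Background and Scope Clarity | 4 | ... | ... |
| Evaluation Questions and Criteria | 3 | ... | ... |
| Methodology Specification | 2 | ... | ... |
| Deliverables, Timeline, and Budget Realism | 4 | ... | ... |
| Ethics and Stakeholder Engagement Plan | 3 | ... | ... |
"""

scores = parse_scores(sample_reply)
total = sum(scores.values())
print(total)
```

The recomputed total can then be compared with the "Total: X/25" line in the reply. A mismatch suggests the assessment should be rerun or checked by hand.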
