Theory-Based Evaluation

An evaluation approach that tests whether a program's theory of change holds in practice, using process tracing and evidence-at-each-step reasoning rather than relying solely on counterfactual comparison. It is a strong alternative when RCTs or quasi-experimental designs are infeasible.

What Theory-Based Evaluation Does

A theory-based evaluation walks the results chain from activity to outcome and tests each link with evidence. For every "if X then Y" claim in the theory of change, the evaluation asks three questions:

Did X actually happen, at the intended quality and intensity?
Was Y observed among the people the program reached?
Is there evidence the X-to-Y link operated, rather than the two events simply co-occurring?

Alternative explanations are then surfaced and systematically considered. Did another intervention also target the same outcome? Were participants already on an improving trajectory? Could the result reflect a measurement artifact rather than real change? A credible theory-based evaluation rules these out explicitly, not by assertion.

The output is a chain of reasoning, not a single effect estimate. Each link in the theory of change is rated on the strength of evidence supporting it, and the overall verdict is built up from those ratings.
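
The link-by-link rating described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed scoring method: the four-point `Evidence` scale, the `Link` fields, the rule that unaddressed rival explanations cap a link at weak, and the weakest-link aggregation in `overall_verdict` are all hypothetical choices made for the example. A real evaluation would justify its own scale and aggregation logic.

```python
from dataclasses import dataclass
from enum import IntEnum


class Evidence(IntEnum):
    # Hypothetical four-point scale; the approach itself does not prescribe one.
    NONE = 0
    WEAK = 1
    MODERATE = 2
    STRONG = 3


@dataclass
class Link:
    claim: str               # the "if X then Y" statement under test
    x_happened: Evidence     # did X occur at the intended quality and intensity?
    y_observed: Evidence     # was Y observed among the people reached?
    link_operated: Evidence  # evidence that the X-to-Y link actually operated
    rivals_ruled_out: bool   # were alternative explanations examined and rejected?

    def rating(self) -> Evidence:
        # Assumed rule: a link is only as strong as its weakest question.
        base = min(self.x_happened, self.y_observed, self.link_operated)
        # Assumed rule: unaddressed rival explanations cap the rating at WEAK.
        if not self.rivals_ruled_out:
            base = min(base, Evidence.WEAK)
        return Evidence(base)


def overall_verdict(chain: list[Link]) -> Evidence:
    # Assumed weakest-link aggregation across the whole results chain.
    return min(link.rating() for link in chain)


# Made-up example chain for a hypothetical training program.
chain = [
    Link("If officials are trained, they apply the new procedure",
         Evidence.STRONG, Evidence.MODERATE, Evidence.MODERATE, True),
    Link("If the procedure is applied, processing times fall",
         Evidence.MODERATE, Evidence.STRONG, Evidence.STRONG, False),
]

for link in chain:
    print(f"{link.rating().name:9s} {link.claim}")
print("Overall:", overall_verdict(chain).name)
```

In this made-up chain the second link has good evidence on all three questions, but its rating drops because rival explanations were never examined. That mirrors the point that alternative explanations must be ruled out explicitly, not by assertion.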

Common Variants

Four variants dominate in practice:

Contribution analysis. The most common in development evaluation. Assesses whether and how the program contributed to outcomes that typically have multiple causes. Built for messy, multi-actor contexts.
Realist evaluation. Pawson and Tilley's context-mechanism-outcome framework. Asks "what works, for whom, in what circumstances, and why?" Useful when a program is expected to behave differently across settings.
Process tracing. Small-n, in-depth case analysis that tests specific causal claims with evidence tests (hoop tests, smoking-gun tests, doubly-decisive tests). Strong for single-case or few-case attribution questions.
Outcome harvesting. Starts from observed outcomes and traces back to establish program contribution. Useful when outcomes were not fully specified in advance.
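
The evidence tests named under process tracing are often explained as Bayesian updating: each test is characterized by how likely the evidence is if the causal claim is true versus if it is false. The sketch below illustrates that reading. The probabilities in `TESTS`, the 0.5 prior, and the function name `update` are illustrative assumptions, not values from any evaluation standard.

```python
def update(prior: float, p_e_given_h: float, p_e_given_not_h: float, found: bool) -> float:
    """Apply Bayes' rule for one piece of evidence that was found or not found."""
    if not found:
        # Absence of the evidence: use the complementary likelihoods.
        p_e_given_h = 1.0 - p_e_given_h
        p_e_given_not_h = 1.0 - p_e_given_not_h
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1.0 - prior) * p_e_given_not_h)


# (P(evidence | claim true), P(evidence | claim false)); illustrative values only.
TESTS = {
    # Hoop: evidence almost certain if the claim is true, but also common otherwise.
    "hoop": (0.95, 0.60),
    # Smoking gun: evidence not guaranteed if true, but very rare if false.
    "smoking_gun": (0.30, 0.02),
    # Doubly decisive: almost certain if true and very rare if false.
    "doubly_decisive": (0.95, 0.02),
}

prior = 0.5  # assumed starting confidence in the causal claim
for name, (p_h, p_not_h) in TESTS.items():
    passed = update(prior, p_h, p_not_h, found=True)
    failed = update(prior, p_h, p_not_h, found=False)
    print(f"{name:16s} pass -> {passed:.2f}   fail -> {failed:.2f}")
```

Under these assumed numbers, failing a hoop test sharply lowers confidence while passing it raises confidence only modestly. A smoking-gun test behaves the other way around, and a doubly-decisive test is informative in both directions.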

When Theory-Based Works Best

Theory-based evaluation is the right fit when:

The program is complex and outcomes have multiple plausible causes.
A counterfactual is not feasible. No comparison group exists, randomization is ethically or politically impossible, or the program operates at a single site.
Sample sizes are small and statistical power is unachievable.
The theory of change is strong, but quantitative outcome measurement is limited.
The evaluation question is about mechanism ("how did this work?"), not just effect size.

Governance, advocacy, policy-change, and systems-strengthening programs are the canonical cases. So is any pilot at a single site.

Proposal Context

Propose theory-based evaluation when an RCT or matched quasi-experimental design is infeasible: governance programs, advocacy, single-site pilots, policy change efforts, and most systems work. Donor reception varies. Evaluation-literate donors (FCDO, IDRC, and several major foundations) accept theory-based approaches readily. Others default to expecting an impact evaluation with a counterfactual and need to be walked through why a counterfactual design is not feasible or not the right tool for the question.

Two proposal moves strengthen the case. First, name the specific method (contribution analysis, realist evaluation, process tracing) rather than writing "theory-based" as a generic label. Second, explain why a counterfactual is infeasible or inappropriate before introducing the alternative. Budgets typically fall in the same range as a quasi-experimental design, roughly $50,000 to $200,000 depending on scope; theory-based evaluation is not dramatically cheaper. Proposals that pitch it as a cost-saver read as weak.

Common Mistakes

Weak theory of change going in. Theory-based evaluation can only be as good as the theory it tests. If the logic model is a vague diagram with unstated assumptions, the evaluation will be vague too. Invest in the theory of change first.
Treating theory-based evaluation as a cheaper substitute for impact evaluation. It is not. It answers a different question (did the causal logic hold?) from the one an impact evaluation answers (what is the net effect?). Choose based on the decision you need to support, not the budget line.
