When to Use
Realist evaluation is the right approach when the question is not simply "did the programme work?" but "for whom did it work, under what conditions, and through what mechanisms?" Developed by Ray Pawson and Nick Tilley in the 1990s, realist evaluation is built on the insight that programmes do not cause outcomes directly; rather, they introduce resources and opportunities that trigger responses in specific people in specific contexts.
Use it when:
- Outcomes vary across sites or populations: the programme shows strong results in some places and weak results in others, and you need to understand why
- Context is central: the programme works through relationships, norms, or institutional conditions that differ meaningfully across settings
- Theory refinement is the goal: you want to understand why a programme works in order to improve it, not just whether it works on average
- Scale-up decisions require specificity: before expanding a programme, funders and managers need to know which contexts are necessary for the mechanisms to fire
- Existing evidence is mixed: realist synthesis (the literature-based version) can reconcile conflicting findings from multiple evaluations of similar interventions
Realist evaluation is resource-intensive and produces probabilistic, context-specific findings rather than average treatment effects. It is not suitable when funders need a single yes/no effectiveness verdict, when resources are limited, or when the programme theory is very simple and context is relatively uniform.
| Scenario | Use Realist Evaluation? | Better Alternative |
|---|---|---|
| Why does it work for some and not others? | Yes | — |
| Average effect across all contexts | No | Impact Evaluation |
| Simple, uniform intervention | No | RCT or QED |
| Building causal argument without mechanism | No | Contribution Analysis |
| Scale-up context specification | Yes | — |
| Literature synthesis of mixed evidence | Yes (realist synthesis) | — |
How It Works
Realist evaluation is built around one central analytical unit: the Context-Mechanism-Outcome (CMO) configuration. A CMO configuration states: in this context (C), this mechanism (M) is triggered, producing this outcome (O).
- Context: the conditions (social, institutional, cultural, geographic, historical) within which a programme operates. Context is not just background; it activates or suppresses mechanisms
- Mechanism: the causal process that connects a programme resource or activity to an outcome. Mechanisms are typically hidden; they involve how people reason about and respond to programme inputs
- Outcome: the observable change that results when a mechanism fires in a given context
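The CMO configuration can be treated as a simple structured record so that hypotheses are easy to track, compare, and update across evaluation cycles. The following is an illustrative sketch only; the class, field names, and example values are assumptions for demonstration, not part of any standard realist-evaluation toolkit.

```python
from dataclasses import dataclass

@dataclass
class CMOConfiguration:
    """One Context-Mechanism-Outcome hypothesis (illustrative structure)."""
    context: str    # C: conditions under which the mechanism may fire
    mechanism: str  # M: the reasoning or response the programme triggers
    outcome: str    # O: the observable change when M fires in C
    status: str = "hypothesised"  # later updated to confirmed / partial / disconfirmed

# Example drawn from the bednet illustration later in this section
cmo = CMOConfiguration(
    context="CHWs are respected, locally elected figures",
    mechanism="social norm activation around child protection",
    outcome="improved consistent bednet use",
)
print(cmo.status)  # hypothesised
```

Keeping configurations as explicit records like this makes the later steps (testing, classification, theory refinement) auditable rather than implicit.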
Step 1: Develop initial programme theory (IPT)
Start with an explicit theory of how the programme is supposed to work. This is not just a logic model; it must articulate the mechanisms through which resources are expected to change behaviour.
Step 2: Generate CMO hypotheses
Translate the programme theory into a set of testable CMO configurations. For example: "When community health workers are respected figures in their community (C), free bednet provision triggers social norm activation around child protection (M), producing improved consistent bednet use (O)."
Step 3: Collect data to test the CMOs
Mixed methods are typically required. Quantitative data can test whether outcomes varied by context. Qualitative data (interviews, observations) can probe the mechanisms.
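The quantitative half of Step 3 can be as simple as grouping outcome measures by context and comparing them. A minimal sketch, using invented site-level data (the context labels and the `bednet_use` measure are assumptions for illustration):

```python
from statistics import mean

# Hypothetical site-level records: outcome (consistent bednet use rate)
# tagged with the contextual condition believed to matter (how CHWs
# were selected). All values are invented for demonstration.
sites = [
    {"context": "elected_chw",  "bednet_use": 0.81},
    {"context": "elected_chw",  "bednet_use": 0.77},
    {"context": "assigned_chw", "bednet_use": 0.52},
    {"context": "assigned_chw", "bednet_use": 0.48},
]

# Group outcomes by context rather than pooling them
by_context: dict[str, list[float]] = {}
for s in sites:
    by_context.setdefault(s["context"], []).append(s["bednet_use"])

for ctx, values in sorted(by_context.items()):
    print(f"{ctx}: mean use = {mean(values):.2f}")
# assigned_chw: mean use = 0.50
# elected_chw: mean use = 0.79
```

A gap like this between contexts does not itself establish the mechanism; it flags where the qualitative work should probe why the mechanism fired in one setting and not the other.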
Step 4: Analyse CMO configurations
Examine which CMO configurations were confirmed, partially confirmed, or disconfirmed by the data. Where mechanisms did not fire as expected, identify what contextual factor suppressed them.
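One way to make Step 4 systematic is a simple tally of supporting and contradicting evidence per configuration. This is a hedged sketch: the three-way verdict labels follow the text above, but the tallying rule and the example counts are assumptions for demonstration only.

```python
def classify(evidence_for: int, evidence_against: int) -> str:
    """Crude illustrative rule for sorting CMO configurations by verdict."""
    if evidence_for and not evidence_against:
        return "confirmed"
    if evidence_for and evidence_against:
        return "partially confirmed"
    return "disconfirmed"

# Invented evidence counts, echoing the Kenya CHW example later on
results = {
    "social trust -> help-seeking (rural, elected CHWs)": classify(5, 0),
    "social trust -> help-seeking (peri-urban, assigned CHWs)": classify(0, 4),
    "knowledge transfer -> maternal health": classify(3, 1),
}
for cmo, verdict in results.items():
    print(f"{cmo}: {verdict}")
```

In practice the judgement is qualitative and comparative, not arithmetic; the point of the sketch is only that every configuration should receive an explicit, recorded verdict.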
Step 5: Refine the programme theory
Revise the initial programme theory based on empirical findings. Realist evaluation is iterative; the theory improves with each cycle of hypothesis testing.
Step 6: Produce middle-range theory
Synthesise findings into transferable, middle-range theories that specify the conditions under which this type of intervention produces these types of outcomes. These are more useful for decision-making than context-specific findings alone.
Key Components
- Initial programme theory: explicit causal logic articulating mechanisms, not just input-output chains
- CMO configurations: testable hypotheses linking context, mechanism, and outcome
- Context mapping: systematic documentation of the contextual factors relevant to mechanism activation
- Mixed methods data collection: quantitative to test outcome variation by context; qualitative to probe mechanisms
- Iterative theory refinement: repeated cycles of hypothesis testing and theory revision
- Middle-range theory: transferable propositions about what works for whom under what conditions
- Realist-trained evaluators: this approach requires specialist knowledge to implement credibly
Best Practices
Articulate mechanisms explicitly. The most common failure in realist evaluation is treating mechanisms as black boxes. A mechanism statement must name the response that is triggered: "When neighbours they already know are members (C), women join savings groups because social trust and reciprocal obligation are activated (M), producing improved financial resilience (O)."
Monitor context throughout implementation. Context changes during implementation: political shifts, market fluctuations, leadership changes. Build context monitoring into the evaluation design.
Use theory to guide data collection, not data to generate theory. Realist evaluation starts deductively with CMO hypotheses and tests them; it is not grounded theory. Starting with data and inductively generating CMOs produces poorly specified findings.
Strengthen plausibility with existing evidence. Before testing CMO configurations empirically, review the literature for evidence that the proposed mechanisms operate in similar contexts.
Report negative cases. CMO configurations that were disconfirmed are as analytically important as those that were confirmed. Report both.
Common Mistakes
Treating "context" as confounders to control away. In realist evaluation, context is not noise; it is explanatory. Controlling for context in a regression model destroys the analytical value of contextual variation.
Listing characteristics instead of specifying mechanisms. Saying "the programme worked in urban contexts" is a contextual observation, not a realist finding. A realist finding explains why: what mechanism the urban context activates or enables.
Using realist vocabulary without realist reasoning. Programmes sometimes describe their evaluation as "realist" because they collected qualitative data alongside a survey. Realist evaluation requires explicit CMO hypothesis development, iterative theory refinement, and systematic cross-case comparison.
Designing without sufficient qualitative depth. Mechanisms are not directly observable in outcome data. You need interviews, observations, or documents that reveal how people responded to programme inputs and why. Superficial qualitative data produces superficial mechanism specification.
Claiming generalisability prematurely. Middle-range theories from a single realist evaluation are hypotheses, not laws. Replication across multiple contexts is needed before transferability can be claimed.
Examples
Community health, East Africa. A realist evaluation of a community health worker (CHW) programme in Kenya identified three CMO configurations from the initial programme theory. The primary configuration, that CHWs embedded in community structures (C) would trigger help-seeking behaviour through social trust (M), was confirmed in rural areas where CHWs were elected by their communities but disconfirmed in peri-urban areas where CHWs were centrally assigned. A secondary configuration about maternal health knowledge was confirmed across all contexts. These findings informed a redesign of the CHW selection process for the programme's second phase.
Cash transfers, West Africa. A realist evaluation of a conditional cash transfer programme in Niger found that the same transfer amount produced very different nutritional outcomes across regions. The mechanism analysis revealed that in markets with functioning grain supply chains (C), the cash trigger activated commercial food purchasing (M) and produced dietary diversity improvements (O). In remote areas with thin markets, the mechanism did not fire because cash could not be exchanged for diverse foods. The finding shaped the geographic targeting strategy for scale-up.
Education governance, South Asia. A realist synthesis of 23 evaluations of school governance reform programmes in South Asia identified that reforms producing learning improvements shared one CMO configuration: when local government had prior capacity and community trust (C), school management committee formation (M: shared accountability) produced teacher attendance improvements and learning gains (O). Reforms in low-capacity settings produced the governance structures without activating the accountability mechanism.
Compared To
| Method | Causal Logic | Counterfactual | Primary Output |
|---|---|---|---|
| Realist Evaluation | Generative (mechanisms) | None | Middle-range theory |
| Impact Evaluation | Successionist (regularity) | Explicit | Average treatment effect |
| Process Tracing | Mechanism tracing | None | Causal chain evidence |
| Contribution Analysis | Plausible contribution | None | Contribution story |
| Developmental Evaluation | Emergent | None | Real-time learning |
Relevant Indicators
18 indicators across DFID, UNDP, and OECD-DAC frameworks. Key examples:
- Number of CMO configurations initially hypothesised versus confirmed by evaluation data
- Degree to which evaluation explains outcome variation across implementation contexts (rated 1-5)
- Proportion of evaluation recommendations that specify the context conditions necessary for replication
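The third indicator above is straightforward to compute once recommendations are coded. A minimal sketch with invented records (the coding field `specifies_context` is an assumption, not a standard indicator schema):

```python
# Hypothetical coded recommendations from an evaluation report
recommendations = [
    {"id": 1, "specifies_context": True},
    {"id": 2, "specifies_context": True},
    {"id": 3, "specifies_context": False},
    {"id": 4, "specifies_context": True},
]

# Proportion of recommendations that name the context conditions
# necessary for replication
proportion = sum(r["specifies_context"] for r in recommendations) / len(recommendations)
print(f"{proportion:.0%} of recommendations specify context conditions")  # 75%
```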
Related Tools
- Evaluation Planner: structure your CMO hypothesis development and data collection plan
- MEStudio Logic Model Builder: for building the initial programme theory that underpins CMO analysis
Related Topics
- Process Tracing: a complementary method for tracing causal mechanisms within individual cases
- Contribution Analysis: an alternative for building causal arguments without experimental design
- Mixed Methods Evaluation: realist evaluation typically requires mixed methods to test CMO configurations
- Theory of Change: the programme theory that generates the initial CMO hypotheses
- Developmental Evaluation: an alternative for highly emergent programmes where CMOs cannot be pre-specified
Further Reading
- Pawson, R. & Tilley, N. (1997). Realistic Evaluation. London: Sage. The foundational text.
- Pawson, R. (2006). Evidence-Based Policy: A Realist Perspective. London: Sage. Extends to realist synthesis.
- Blamey, A. & Mackenzie, M. (2007). "Theories of Change and Realistic Evaluation." Evaluation, 13(4), 439-455. Comparison with other theory-based approaches.
- Wong, G., Greenhalgh, T., Westhorp, G., & Pawson, R. (2012). "RAMESES Publication Standards: Realist Syntheses." BMC Medicine, 10, 21. Standards for realist synthesis reporting.