Validity (Internal & External)

Definition

Validity refers to the accuracy and trustworthiness of conclusions drawn from evaluation data. It has two distinct dimensions that practitioners must consider separately:

Internal validity asks: Did the program actually cause the observed outcomes? This is about establishing credible causal inference by ruling out alternative explanations, such as selection bias, maturation, or external events, that could have produced the same results. High internal validity means you can confidently attribute change to your intervention rather than to confounding factors.

External validity asks: Can these findings be generalized beyond this specific study? This concerns the applicability of results to other contexts, populations, or time periods. A study with strong external validity produces insights that remain useful even when program conditions differ from the evaluation setting.

These dimensions often trade off against each other: tightly controlled studies maximize internal validity but may limit generalizability, while real-world implementations offer richer contextual insights at the cost of causal clarity.

Why It Matters

Validity is the foundation of credible M&E. Without it, you cannot distinguish program success from coincidence, nor can you learn lessons that apply beyond your specific case. Practitioners face validity concerns whenever they make causal claims such as "our training improved skills" or "the intervention reduced dropout rates", and these claims drive funding decisions, program adaptations, and organizational learning.

Poor validity leads to costly mistakes: scaling programs that don't work, abandoning interventions that do, or misallocating resources based on spurious correlations. Conversely, explicit attention to validity strengthens evaluation design, clarifies what can reasonably be claimed, and builds stakeholder confidence in findings. For impact evaluations and quasi-experimental designs, validity is the primary quality criterion; without it, the evaluation cannot fulfill its purpose.

In Practice

Threats to internal validity include:

Selection bias: comparison groups differ systematically before the intervention
History: external events coinciding with the program influence outcomes
Maturation: natural changes over time mistaken for program effects
Testing effects: pre-test exposure influences post-test responses
Instrumentation: measurement changes over time create artificial effects

Addressing these requires careful design: randomization (when feasible), matched comparison groups, measurement before and after the intervention in both groups, consistent instruments across measurement rounds, and statistical controls for confounders.
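As an illustration of the first design choice, the sketch below randomly assigns participants to two groups and then checks whether the groups look similar at baseline, which is one way to screen for selection bias. It is a minimal example under stated assumptions: the baseline scores, the function names randomize and standardized_mean_difference, and the seed are all hypothetical and not taken from any particular evaluation.

```python
import random
import statistics


def randomize(unit_ids, seed=0):
    """Randomly split units into treatment and comparison groups of near-equal size."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced and audited
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def standardized_mean_difference(treated, comparison):
    """Difference in baseline means divided by the pooled standard deviation."""
    pooled_sd = ((statistics.variance(treated) + statistics.variance(comparison)) / 2) ** 0.5
    return (statistics.mean(treated) - statistics.mean(comparison)) / pooled_sd


# Hypothetical baseline test scores keyed by participant id (illustrative data only)
baseline_scores = dict(enumerate([52, 61, 47, 58, 66, 49, 55, 63, 50, 59, 44, 68]))

treatment_ids, comparison_ids = randomize(baseline_scores.keys(), seed=42)
smd = standardized_mean_difference(
    [baseline_scores[i] for i in treatment_ids],
    [baseline_scores[i] for i in comparison_ids],
)
print(f"Standardized mean difference at baseline: {smd:.2f}")
```

A value near zero suggests the groups started from similar positions on this measure. With samples as small as this one, randomization can still produce noticeable imbalance by chance, so the check is a screening aid rather than proof of comparability.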

Threats to external validity include:

Sample representativeness: study participants differ from target population
Contextual specificity: results depend on unique local conditions
Temporal limitations: findings apply only to specific time periods
Implementation fidelity: program delivered differently than intended

Strengthening external validity involves sampling that reflects the target population (or purposive sampling that deliberately spans varied contexts), documenting contextual conditions, testing across multiple sites, and being explicit about boundary conditions for generalization.

In impact evaluations (P15), internal validity is paramount: the study must establish causality before asking whether it generalizes. In quasi-experimental designs (P14), practitioners use techniques like propensity score matching or difference-in-differences to approximate the comparison that randomization would provide and so strengthen causal claims. Throughout, data quality assessment checks that measurement is reliable enough to support valid conclusions, since unreliable data cannot yield valid findings.
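The difference-in-differences logic can be shown with simple arithmetic: subtract the comparison group's change from the treatment group's change, so that trends affecting both groups (history, maturation) cancel out. The sketch below uses made-up test scores and a hypothetical function name, and it assumes that both groups would have followed parallel trends without the program; if that assumption fails, the estimate is biased.

```python
from statistics import mean


def difference_in_differences(treat_pre, treat_post, comp_pre, comp_post):
    """Estimate the program effect as (treatment change) minus (comparison change)."""
    treatment_change = mean(treat_post) - mean(treat_pre)
    comparison_change = mean(comp_post) - mean(comp_pre)
    return treatment_change - comparison_change


# Hypothetical skill-test scores (illustrative data only)
treat_pre = [50, 54, 46, 50]   # treatment group before the program, mean 50
treat_post = [62, 66, 58, 62]  # treatment group after the program, mean 62
comp_pre = [48, 52, 44, 48]    # comparison group before, mean 48
comp_post = [53, 57, 49, 53]   # comparison group after, mean 53

effect = difference_in_differences(treat_pre, treat_post, comp_pre, comp_post)
print(f"Estimated effect: {effect:.1f} points")  # prints "Estimated effect: 7.0 points"
```

A naive pre-post comparison of the treatment group alone would credit the program with all 12 points of improvement. The comparison group improved by 5 points without the program, so only 7 points are attributed to it under the parallel-trends assumption.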
