What SMART Actually Tests
SMART is five quality tests for a candidate indicator. Any indicator that fails one of the five will cause a predictable problem downstream: inconsistent data collection, missing data sources, unreachable targets, data that does not inform a decision, or reporting cycles that do not match program cycles.
| Criterion | The test | Most common failure |
|---|---|---|
| Specific | Two people reading the indicator define it the same way | Vague language ("improved", "strengthened", "awareness") |
| Measurable | A feasible data source exists with a defined collection method | No identified source, or source requires capacity you do not have |
| Achievable | The target is realistic given resources, timeframe, and baseline | Target copied from proposal without baseline data |
| Relevant | The result ties to a decision, learning question, or reporting requirement | Indicator exists for compliance only, no one uses the data |
| Time-bound | The measurement interval and reporting deadlines are defined | No collection frequency stated; "ongoing" used as a placeholder |
The five criteria are not equally failure-prone. Specific and Measurable fail most often, Achievable and Time-bound fail in specific patterns, and Relevant is the hardest to catch because compliance indicators sneak in during proposal writing and never get pruned. Apply all five in order when designing new indicators, then again during MEL plan review before submission. See the smart indicators reference entry for the formal definition and indicator framework alignment.
Specific: The Unambiguity Test
Specific is the test that fails most often in M&E practice. The failure is usually not gross vagueness; it is near-vagueness. Phrases like "households with improved water access" or "youth with strengthened skills" pass casual review because they sound concrete, but they are ambiguous enough that two enumerators will classify households differently, or two analysts will report different results from the same dataset.
The test is concrete. Write the indicator down. Hand it to someone who was not in the design meeting. Ask them to describe exactly what data they would collect and how they would count a case. If their answer differs from what you intended, the indicator is not specific enough.
Three failure patterns:
Attribute ambiguity. "Improved water access" does not define what improved means. Is it proximity (distance to source), time (minutes to collect water), quality (safely managed under JMP definitions), or reliability (functional days per month)? Each is a different indicator and a different data collection exercise. Pick one and name it.
Population ambiguity. "Youth trained" does not define youth. The program targeting 15-24 year olds and the program targeting 18-35 year olds are both training youth, but their data are not comparable. State the age range explicitly, and state whether the count includes unique individuals or total person-trainings.
Unit ambiguity. "Households reached" does not state whether a household counts once or each time program services are delivered. A household that attended three sessions could be counted as 3 or as 1. The indicator definition must name the unit of count.
A specific indicator reads like instructions an enumerator could follow without asking questions. If the indicator needs clarification in a footnote, the footnote content belongs in the indicator text.
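One way to make the three ambiguity checks mechanical is to force every indicator specification to name the attribute, population, and unit of count before it enters the plan. A minimal sketch in Python; the schema and field names are illustrative, not a standard, so adapt them to your own MEL plan template:

```python
from dataclasses import dataclass, fields

@dataclass
class IndicatorSpec:
    # Illustrative schema -- field names are assumptions, not a sector standard.
    title: str          # "Households using a safely managed drinking water source"
    attribute: str      # the measured property, named precisely: "safely managed (JMP definition)"
    population: str     # who is counted: "households in the 12 program villages"
    unit_of_count: str  # "unique households, counted once per project period"

def specific_gaps(spec: IndicatorSpec) -> list[str]:
    """List empty fields -- each is an ambiguity an enumerator would resolve on their own."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name).strip()]
```

An indicator that returns any gaps fails Specific before it ever reaches the colleague test.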
Measurable: The Data Source Test
Measurable asks whether you can actually collect the data. The test has two parts: does a data source exist, and is the collection method feasible for this program?
Data source existence is the first filter. "Percentage of households feeling more secure" is measurable only if a household perception survey is planned. If the program has no budget or capacity to run a household survey, the indicator is unmeasurable regardless of how specific it is. You do not fix a missing data source at the analysis stage; you drop the indicator or add a data collection activity.
Collection method feasibility is the second filter. A quarterly household survey across 40 villages is measurable in principle but may not be feasible for a 24-month program with a $150,000 MEL budget. An indicator becomes measurable only when paired with a data collection method the program can actually execute. This usually means naming the source in the indicator specification: "as reported in the quarterly household survey" or "as tracked in the activity attendance roster."
Three categories of data source, ranked by reliability in M&E practice:
- Administrative or activity records (attendance, distribution logs, construction records): high reliability, low cost, but limited to program activities
- Program-run surveys and assessments (baseline, endline, DQA): medium reliability, medium cost, limited to what the program measures
- Secondary data (government statistics, DHS, MICS, facility records): variable reliability, low cost, but often out of date or not at the right granularity
Using two categories is more robust than using one. Triangulating an outcome indicator with both a survey estimate and an administrative record catches data quality problems that a single source would miss. See how to conduct a DQA for how data quality assessment applies to indicator measurability.
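A sketch of what triangulation can look like in practice: compare the survey estimate against the administrative count and flag the indicator when the two sources disagree beyond a tolerance. The 10% threshold here is an assumption, not a standard; set it to whatever discrepancy your DQA process treats as actionable.

```python
def sources_agree(survey_value: float, admin_value: float, rel_tolerance: float = 0.10) -> bool:
    """True when two independent sources agree within the relative tolerance."""
    denom = max(abs(survey_value), abs(admin_value), 1e-9)  # guard against division by zero
    return abs(survey_value - admin_value) / denom <= rel_tolerance

# 1,840 households covered per the survey vs 2,600 per the distribution log:
sources_agree(1840, 2600)  # False -- a 29% gap that a single source would never surface
```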
Achievable: The Feasibility Test
Achievable is a target-level test, not an indicator-level test. The indicator (the measurement) can be perfectly specified; the target (the value to reach) can still be unreachable.
The most common Achievable failure in M&E: a target written during proposal development without baseline data, then inherited into the MEL plan. A proposal may commit to "80% of households practicing safe water storage by end of project" without ever measuring the baseline. Once the baseline comes in at 22%, the 58-point improvement target is not achievable within the program's timeframe or budget. But the target is already in the proposal, the reporting requirement is already in the contract, and the program is set up to miss its target before data collection starts.
Three checks for Achievable:
Baseline-anchored target. The target should be expressed as a change from baseline, not as an absolute value disconnected from starting conditions. "Increase from 22% to 50%" is clearer than "reach 50%" because it acknowledges the starting point.
Rate of change consistent with prior evidence. If comparable programs produced 5-10 percentage point improvements in 24 months, a proposal claiming 30 points in 18 months is not achievable without substantially more resources. Check published impact evaluations for similar interventions in similar contexts before setting the target.
Budget and capacity check. The target may be achievable in principle but not with this budget. A 15% reduction in undernutrition requires specific program activities at specific doses. Compare the target to what the program is actually delivering; targets should not exceed what the program design can plausibly produce.
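The rate-of-change check reduces to simple arithmetic: convert the target into percentage points of change per year and compare that rate to what published evaluations report. A sketch using the proposal example above, assuming a 24-month timeframe for illustration (the function name is hypothetical):

```python
def required_points_per_year(baseline_pct: float, target_pct: float, months: int) -> float:
    """Percentage points of change per 12 months implied by a target."""
    return (target_pct - baseline_pct) / (months / 12)

# The 80% proposal target against the 22% baseline, on a 24-month program:
required_points_per_year(22.0, 80.0, 24)  # 29.0 points/year
# If comparable programs achieved 5-10 points/year, this target fails Achievable.
```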
Achievable failures become politically difficult to fix after a proposal is signed. Set conservative, baseline-anchored targets from the start. See target setting for formula-based approaches.
Relevant: The Decision Test
Relevant asks whether anyone will use this indicator's data. If the answer is no, the indicator is costing money to collect without producing decisions, and it should be dropped.
Three patterns of irrelevance appear in most MEL plans:
Compliance-only indicators. The indicator was added because a donor template required it. No one internally will use the data. The program still has to collect, clean, and report it. These indicators are hard to drop but should be minimized: include the required field, collect at the minimum frequency, and do not invest capacity in analysis.
Orphan indicators. The indicator was included to measure an activity that was later cut or de-emphasized, but no one updated the MEL plan. The indicator is still being reported on, but the activity it describes is no longer meaningfully part of the program. Run an annual MEL plan review and drop orphans.
Curiosity indicators. The indicator was added because the designer was curious about the measurement, not because it informs a decision. Curiosity indicators have no reporting destination and no decision rule attached. Drop them unless the learning agenda explicitly requires them.
The Relevant test, applied to each indicator: "Who will read this data, and what decision will they make differently because of it?" If the answer is "no one" or "unclear," the indicator fails Relevant. In most MEL plans, 20-40% of indicators fail this test; pruning them saves money and frees up capacity for indicators that actually drive decisions.
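The decision test lends itself to a mechanical pruning pass over the indicator set. A minimal sketch, assuming each indicator record carries a named decision and an accountable owner (the field names are illustrative, not a template standard):

```python
indicators = [
    {"name": "HH safe-water coverage", "decision": "quarterly steering review", "owner": "MEL lead"},
    {"name": "posters printed",        "decision": None,                        "owner": None},
]

def fails_relevant(ind: dict) -> bool:
    """No named decision or no accountable owner means no one will act on the data."""
    return not ind.get("decision") or not ind.get("owner")

to_prune = [ind["name"] for ind in indicators if fails_relevant(ind)]  # ["posters printed"]
```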
Time-bound: The Interval Test
Time-bound asks whether the measurement interval is defined and whether it aligns with the reporting and decision cycles the indicator is supposed to feed.
The standard Time-bound failure is not missing timeframes; it is timeframes that do not match the decision cycle. A quarterly coverage indicator that only reports at midline and endline cannot inform quarterly adaptive management meetings. An annual outcome indicator that is meant to feed a mid-year donor report will miss the report deadline.
Two checks for Time-bound:
Interval specified. "Annually," "quarterly," "at baseline, midline, and endline" are all specified. "Ongoing" and "as needed" are not. If the indicator description says the data will be collected as needed, it means no one has decided the schedule, and the data collection will drift.
Interval matches the use. The measurement frequency should be at least as frequent as the reporting or decision cycle it feeds. If the program holds quarterly steering committee meetings that use this indicator, measure quarterly. If the indicator only feeds an annual donor report, annual collection is sufficient. Measuring more frequently than needed is a cost; measuring less frequently renders the data unusable at decision time.
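The interval-matching check can be made explicit by converting both frequencies to collections per year. A minimal sketch; the frequency table is an assumption, so extend it to whatever intervals your plan actually uses:

```python
PER_YEAR = {"monthly": 12, "quarterly": 4, "semiannual": 2, "annual": 1}

def interval_matches_use(measurement: str, decision_cycle: str) -> bool:
    """Measurement must run at least as often as the decision cycle it feeds."""
    return PER_YEAR[measurement] >= PER_YEAR[decision_cycle]

interval_matches_use("annual", "quarterly")   # False: annual data cannot feed quarterly meetings
interval_matches_use("quarterly", "annual")   # True, though the extra collections are a cost
```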
See baseline and related measurement-timing entries for how to coordinate indicator timing with baseline/midline/endline surveys.
Six Worked Revisions
Before-and-after examples of SMART failures and fixes.
1. WASH coverage indicator, East Africa
Before: Number of people with improved access to water. Fails: Specific (what does "improved" mean?), Measurable (no source named), Time-bound (no interval). After: Number of people using a safely managed drinking water source (JMP definition), as measured in the annual household survey conducted in September of each project year.
2. Education outcome indicator, South Asia
Before: Increased learning outcomes among program participants. Fails: Specific (which learning? what unit?), Measurable (no assessment specified), Achievable (no baseline or target). After: Percentage of Grade 3 students scoring at or above the national benchmark on the year-end literacy assessment, measured annually in May. Baseline (Year 0): 48%. Target (Year 3): 62%.
3. Livelihoods output indicator, West Africa
Before: Beneficiaries receiving livelihoods training. Fails: Specific (beneficiaries in what category? what training?), Relevant (counts people, not skills or income change). After: Number of adult women (18-49) who complete all five sessions of the vegetable-production training module, as recorded in the training attendance roster. Counted once per unique participant across the project period.
4. Health service delivery indicator, Central America
Before: Strengthened capacity of health workers. Fails: Specific (strengthened how?), Measurable (no source), Achievable (no target), Time-bound (no interval). After: Percentage of trained community health workers demonstrating proficiency on the 12-item post-training skills checklist, assessed at the end of each training cohort. Target: 80% of trained CHWs per cohort.
5. Gender outcome indicator, Southern Africa
Before: Improved gender equality in program communities. Fails: Specific (gender equality in what domain?), Measurable (no source), Relevant (unclear who acts on the data). After: Percentage of program women reporting participation in household decisions on child education and family expenditure in the past month, measured in the annual gender-focused household survey conducted in November.
6. Food security outcome indicator, Sahel
Before: Reduced food insecurity among target households. Fails: Specific (which measure of food insecurity?), Time-bound (no interval), Achievable (no target). After: Average Household Hunger Scale (HHS) score among program households, measured quarterly. Baseline Year 0: 2.8 (moderate). Target Year 2: 1.5 (little to none).
The pattern across all six: the before version reads fine in a proposal narrative, fails at least two of the five SMART tests, and would produce inconsistent or unusable data. The after version is longer but collectable by any trained enumerator, comparable across time periods, and tied to a specific decision.
Common Mistakes
Mistake 1: Treating SMART as a grading scale instead of a gate. Some teams score indicators on SMART (e.g., 4 out of 5) and keep the ones that pass most criteria. Wrong approach: an indicator that fails any one criterion will fail in practice. Treat SMART as a gate, not a score.
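As a sketch of the difference: a gate requires every criterion to pass, where a score would let a 4-of-5 indicator through. (The dictionary shape is illustrative.)

```python
def smart_gate(results: dict[str, bool]) -> bool:
    """SMART as a gate: one failed criterion fails the indicator, whatever the other four say."""
    criteria = {"specific", "measurable", "achievable", "relevant", "time_bound"}
    return criteria <= results.keys() and all(results[c] for c in criteria)

smart_gate({"specific": True, "measurable": True, "achievable": True,
            "relevant": False, "time_bound": True})   # False, despite scoring 4 out of 5
```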
Mistake 2: Passing Specific without the colleague test. Reading your own indicator and finding it clear is not evidence it passes Specific. You know what you meant. A colleague who was not in the design meeting should be able to read the indicator and collect the data without clarification.
Mistake 3: Declaring Measurable without identifying the data source. An indicator is measurable only if a specific source and method are named. "Household survey" is not enough; "the quarterly household survey conducted by the M&E team in September, using the KAP module" is.
Mistake 4: Copying targets from proposals without baseline data. Proposal targets written before baseline data are often unreachable. Refuse to inherit targets into the MEL plan without a baseline-adjusted review.
Mistake 5: Keeping compliance indicators past their useful life. Compliance-only indicators stay in MEL plans long after the compliance requirement is gone. Review the full indicator set annually and prune the ones nobody uses.
Mistake 6: Using "ongoing" as a measurement interval. "Ongoing" means no one has scheduled the data collection. It always drifts. Pick a specific frequency, even if it is rare.
Mistake 7: Treating SMART as a one-time exercise. SMART should be re-applied whenever program design changes, activities are cut, or data sources become unavailable. An indicator that was SMART at design stage can fail 18 months later if the data source was discontinued. See mistake too many indicators for the companion pruning discipline.
Mistake 8: Writing indicators before decisions. The Relevant test should come first in practice. Identify the decisions and reporting requirements first, then design indicators to feed them. Writing indicators first and then hunting for decisions to attach them to is the root cause of irrelevance failures.
SMART Review Checklist
Apply to every candidate indicator before it enters the MEL plan or proposal.
Specific:
- A colleague who was not in the design meeting can read the indicator and describe the data to collect
- The attribute being measured is named (quality, quantity, status) without ambiguity
- The population is defined (age range, sex, geography, program group)
- The unit of count is stated (unique individuals vs person-events, household vs individual)
Measurable:
- A specific data source is named
- The collection method is named (survey, roster, record review)
- The frequency is feasible given the program budget and capacity
- At least one data quality control is specified (enumerator training, supervisor spot checks, DQA cycle)
Achievable:
- Baseline value is known or a baseline study is planned before the target is finalized
- Target is stated as a change from baseline, not an absolute value disconnected from starting conditions
- Rate of change is consistent with published evidence from comparable programs
- Budget and program design can plausibly produce the targeted change
Relevant:
- The decision or reporting requirement the indicator feeds is named
- A person or committee has accountability for acting on the data
- The indicator is not a duplicate of another indicator already in the plan
Time-bound:
- Collection frequency is stated (not "ongoing" or "as needed")
- Reporting deadline is stated
- Measurement interval matches the decision or reporting cycle it feeds
Run the SMART Indicator Checker to validate each indicator against the full criteria before MEL plan submission. For the broader indicator design workflow, see indicator vs target vs milestone and mistake too many indicators. For an AI-assisted step-by-step workflow, see the Indicator Development playbook.