What AI Does Well and Poorly
AI is a drafting tool, not a reasoning tool. It structures information well when the input is strong; it produces plausible-looking but generic output when the input is weak. Here is how the strengths and weaknesses split in logframe writing:
| AI does well | AI does poorly |
|---|---|
| Structure the four-by-four matrix from unstructured program narrative | Test whether the vertical logic actually holds |
| Generate indicator candidates matching a sector or outcome description | Ground indicator specificity in local program context |
| Draft assumptions in the format of "if X then Y" statements | Identify which assumptions are genuinely critical |
| Align output to donor template structure (USAID PMP, FCDO logframe, EU logframe) | Verify feasibility of means of verification against actual budget |
| Produce consistent language across rows | Catch confusion between outputs and outcomes |
| Suggest disaggregation variables for standard demographic breakdowns | Recommend sample sizes (see how to choose sample size) |
The pattern: AI handles structure, volume, and consistency well. It handles judgment, local context, and feasibility poorly. Use AI for drafting; use human review for validation. See how to write a logframe for the manual workflow this AI workflow replaces.
The Five-Step Workflow
Drafting a logframe with AI assistance takes roughly 1-2 hours when the workflow is followed properly. Each step has a specific output.
| # | Step | Typical time | Output |
|---|---|---|---|
| 1 | Assemble context package | 15-30 min | Packaged inputs: theory of change, donor template, program description, indicator library constraints |
| 2 | Draft with strong prompting | 15-20 min | AI-generated logframe draft covering all four rows and four columns |
| 3 | Test the vertical logic | 15-20 min | Revised logframe with any broken causal links fixed |
| 4 | SMART-check every indicator | 20-30 min | Each AI-suggested indicator validated or replaced |
| 5 | Verify MoV feasibility | 15-20 min | Means of verification that the actual budget supports |
Skipping steps 3-5 is the most common failure mode. Teams ship AI-drafted logframes directly to proposals without validation, and reviewers catch the generic output, fabricated specifics, or broken logic. The review time is the difference between a defensible logframe and one that fails review.
Step 1: Assemble the Context Package
AI output quality is determined more by input quality than by which AI tool you use. Five documents make up the context package.
Theory of change (required). The finalized theory of change is the single most important input. It should cover the long-term outcome, intermediate outcomes, outputs, activities, and assumptions. See how to write a theory of change. If your theory of change is a one-paragraph sketch, the AI will fill the gaps with plausible-looking generic content. If it is 3-5 pages with assumptions and evidence, the AI operationalizes that structure into a coherent logframe.
Donor logframe template. This is the structural container the AI will populate. USAID PMP has specific PIRS requirements; FCDO uses a different logframe structure; the EU uses its own. Hand the AI the exact template you will submit against. If you do not have a template, a generic four-by-four matrix with columns for narrative summary, indicators, means of verification, and assumptions works.
Program description. A 1-2 page summary of what the program does, where, for whom, and at what scale. AI produces more specific output when grounded in specifics.
Indicator library constraints (optional). If you are using donor-standard indicators (USAID F, PEPFAR MER, GEF Core, JMP WASH, etc.) or internal organizational indicators, provide the relevant subset. This prevents AI from inventing indicators and keeps output aligned to your frameworks. See custom vs standard indicators.
Budget and scope constraints (optional). If you know the M&E budget, target sample size, or data collection frequency constraints, include them. This shapes the means of verification the AI suggests.
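To make the packaging concrete, here is a minimal Python sketch that concatenates the five inputs into one labeled block you can paste ahead of the drafting prompt. The file names and section labels are illustrative assumptions, not MEStudio conventions; adapt them to however your team stores these documents.

```python
from pathlib import Path

# Illustrative file names -- substitute your own documents.
CONTEXT_FILES = {
    "THEORY OF CHANGE (required)": "theory_of_change.md",
    "DONOR LOGFRAME TEMPLATE": "donor_template.md",
    "PROGRAM DESCRIPTION": "program_description.md",
    "INDICATOR LIBRARY SUBSET (optional)": "indicator_subset.md",
    "BUDGET AND SCOPE CONSTRAINTS (optional)": "budget_constraints.md",
}

def assemble_context_package(base_dir: str = ".") -> str:
    """Concatenate the context documents into one labeled block
    that can be pasted (or sent via API) ahead of the drafting prompt."""
    sections = []
    for label, filename in CONTEXT_FILES.items():
        path = Path(base_dir) / filename
        if path.exists():
            sections.append(f"=== {label} ===\n{path.read_text().strip()}")
        else:
            # Optional inputs may be absent; the label still marks the gap.
            sections.append(f"=== {label} ===\n[not provided]")
    return "\n\n".join(sections)

if __name__ == "__main__":
    print(assemble_context_package())
```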
Step 2: Draft with Strong Prompting
Paste the context package and prompt the AI for a logframe. A strong prompt has three elements.
Clear task framing. "Draft a logframe for this program aligned to the [donor] template, using the theory of change, program description, and indicator library provided. Work from long-term outcome backward (goal, outcomes, outputs, activities), then fill in indicators, means of verification, and assumptions for each row."
Explicit constraints.
- "Do not invent baseline values, targets, or specific statistics. Use placeholder text like
[baseline: TBD]for any numerical fields." - "Do not cite specific studies or research that is not in the context provided."
- "For each assumption, state it in testable form: a specific condition that can be checked during implementation."
- "For each means of verification, name a specific data source and method, not 'survey' or 'monitoring data'."
Output format. Specify the exact format you want: markdown table, structured text, or a specific template. If the donor template is a .docx, describe its structure to the AI and ask for output matching that structure.
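Putting the three elements together, a complete drafting prompt might read as follows. The bracketed fields are placeholders to fill in; this is one workable phrasing assembled from the elements above, not the only one.

```
Draft a logframe for the program described in the context package below,
aligned to the [donor] template provided. Work from the long-term outcome
backward (goal, outcomes, outputs, activities), then fill in indicators,
means of verification, and assumptions for each row.

Constraints:
- Do not invent baseline values, targets, or specific statistics. Use
  placeholder text like [baseline: TBD] for any numerical fields.
- Do not cite specific studies or research that is not in the context
  provided.
- State each assumption in testable form: a specific condition that can
  be checked during implementation.
- For each means of verification, name a specific data source and method,
  not "survey" or "monitoring data".

Output format: a markdown table with columns for narrative summary,
indicators, means of verification, and assumptions, one row per level.

[context package pasted here]
```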
What you get back is a first-draft logframe. It will have useful structure and suspect specifics. Step 3 onward is the validation work.
Step 3: Test the Vertical Logic
The vertical logic test asks: does each row plausibly produce the row above it? AI often generates logframes that look complete but have broken vertical logic, especially at the output-to-outcome link.
Walk the draft matrix bottom-up:
- Do the activities produce the outputs as written?
- Do the outputs plausibly produce the outcomes? (This is where AI most commonly fails: outputs confused with outcomes, or outputs too disconnected from the outcomes they are meant to drive.)
- Do the outcomes plausibly contribute to the goal?
Common AI-generated logic failures:
- Outputs labeled as outcomes ("Number of women trained" in the outcome row, when it belongs in the output row)
- Outcomes stated as aggregate outputs ("5,000 women reached" in the outcome row)
- Goal disconnected from the intermediate outcomes (goal at population level but outcomes at individual program-participant level, with no bridge)
- Activities that do not sum to the outputs they are meant to produce
Where logic breaks, rewrite manually. Do not prompt the AI to fix it without first articulating what is broken; AI will often produce a second draft with the same structural error in slightly different wording.
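If it helps to make the walk systematic, here is a minimal Python sketch that represents the draft as four ordered levels, prints the link question at each step, and flags output-style phrasing that has drifted into the outcome or goal rows. The marker list and draft rows are illustrative assumptions; the judgment at each link stays manual.

```python
# Levels ordered bottom-up: each should plausibly produce the next one up.
LEVELS = ["activities", "outputs", "outcomes", "goal"]

# Phrasings that usually signal output-level content (counts of delivery);
# an illustrative heuristic list, not a complete rule.
OUTPUT_STYLE_MARKERS = ("number of", "reached", "trained", "distributed")

# Illustrative draft rows -- replace with your AI draft's narrative summaries.
draft = {
    "activities": ["Train community health workers on the counseling protocol"],
    "outputs": ["Community health workers trained and equipped"],
    "outcomes": ["Number of women reached with counseling"],  # suspect
    "goal": ["Reduced HIV incidence in the target districts"],
}

for lower, upper in zip(LEVELS, LEVELS[1:]):
    print(f"Check: do the {lower} plausibly produce the {upper}?")

for level in ("outcomes", "goal"):
    for text in draft[level]:
        if any(marker in text.lower() for marker in OUTPUT_STYLE_MARKERS):
            print(f"Flag: '{text}' in the {level} row reads like an output")
```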
Step 4: SMART-Check Every Indicator
AI-generated indicators vary from good to unusable. Each indicator needs to pass the SMART test before acceptance. See SMART indicators deep-dive for the full criteria.
Budget five minutes per indicator. For each, ask:
- Specific: Could two enumerators read this and collect the data the same way?
- Measurable: Is there a feasible data source and method?
- Achievable: Can this program plausibly produce the target change? (If AI inserted specific targets, these need independent validation from baseline data, not acceptance.)
- Relevant: Does the indicator tie to a decision or reporting requirement, or is it filler?
- Time-bound: Is the measurement interval defined?
Common AI-suggested indicator problems:
- Vague verbs ("improved", "strengthened", "enhanced") without operational definitions
- Targets or baselines invented from thin air
- Indicators sourced to generic "surveys" or "monitoring systems" without specificity
- Duplicate or near-duplicate indicators at different levels
Use the SMART Indicator Checker tool to validate each indicator systematically. Revise, replace, or drop any that fail.
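A few of the mechanical failure patterns above can be pre-screened in code before the manual pass. A rough sketch, not a substitute for the SMART Indicator Checker or for human judgment: the vague-term and generic-MoV lists are illustrative heuristics, and a clean pass here does not mean an indicator is SMART.

```python
from collections import Counter

VAGUE_TERMS = ("improved", "strengthened", "enhanced")
GENERIC_MOV = ("survey", "surveys", "monitoring data", "reports")

def prescreen(indicators: list[tuple[str, str]]) -> list[str]:
    """Flag vague verbs, generic MoV, and duplicated indicators.
    Each entry is (indicator text, means of verification)."""
    flags = []
    counts = Counter(ind.lower() for ind, _ in indicators)
    for ind, mov in indicators:
        for term in VAGUE_TERMS:
            if term in ind.lower():
                flags.append(f"'{ind}': vague term '{term}', needs an operational definition")
        if mov.strip().lower() in GENERIC_MOV:
            flags.append(f"'{ind}': generic MoV '{mov}', name a source, method, and frequency")
        if counts[ind.lower()] > 1:
            flags.append(f"'{ind}': appears at more than one level")
    return flags

print("\n".join(prescreen([
    ("Improved community acceptance of HIV testing", "survey"),
    ("% of supported clinics meeting service standards", "quarterly facility audit"),
])))
```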
Step 5: Verify MoV Feasibility
The means-of-verification column is where AI output most frequently exceeds what the program can execute. AI produces feasible-sounding MoV (household surveys, administrative records, third-party data) without regard to actual budget or capacity.
For each means of verification in the draft:
- Is there budget for this data collection (see data collection budget)?
- Is there staff capacity to execute it at the specified frequency?
- Is the source actually available? (AI sometimes suggests data sources that do not exist for a given context.)
- Does the frequency match the decision cycle it feeds?
Where an MoV is not feasible, either scale back the measurement (less frequent collection, a smaller sample, a simpler method) or drop the indicator entirely. Do not ship a logframe with aspirational MoV that the program cannot execute. See means of verification for feasibility criteria.
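The budget check itself is simple arithmetic once unit costs are estimated: annual cost per MoV is rounds per year times cost per round, summed across the plan. A back-of-envelope sketch; every figure below is a hypothetical assumption for illustration, not a costing benchmark.

```python
# Hypothetical MoV plan: (means of verification, rounds/year, USD per round).
mov_plan = [
    ("Household survey, n=400", 4, 12_000),        # quarterly -- likely too costly
    ("Clinic administrative records extract", 4, 500),
    ("Annual household survey, n=400", 1, 12_000),
]
me_data_budget = 25_000  # hypothetical annual data collection budget

total = sum(rounds * cost for _, rounds, cost in mov_plan)
for name, rounds, cost in mov_plan:
    print(f"{name}: {rounds} x ${cost:,} = ${rounds * cost:,}")
print(f"Planned MoV cost: ${total:,} vs budget ${me_data_budget:,}")
if total > me_data_budget:
    print("Over budget: reduce frequency, sample, or method -- or drop indicators.")
```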
Integrating MEStudio Tools
Three MEStudio tools support the AI logframe workflow:
Indicator Library (/resources/indicators). Filter to your sector, scan indicators aligned to donor frameworks, and use those as the input to AI indicator-suggestion prompts. Prevents AI from inventing indicators when validated alternatives exist.
Logic Model Builder (/tools/logic-model). Structure your theory of change into the Builder's format, export, and feed that structured output to AI. Produces more coherent logframes than free-text theory-of-change input.
SMART Indicator Checker (/tools/smart-indicator-checker). Run every AI-suggested indicator through this tool before acceptance. Catches the vague-verb and missing-definition patterns AI commonly produces.
For the broader proposal workflow, see how to write the M&E proposal section. For the underlying MEL plan structure, see how to write a MEL plan.
Sector Examples
Health: HIV prevention program proposal, East Africa
A health program proposal required a PEPFAR MER-aligned logframe. The MEL team used Claude with a context package of a 4-page theory of change, the PEPFAR MER indicator subset for HIV prevention, the program description, and the draft budget. AI produced a first-draft logframe in 18 minutes. Validation identified three issues: two PEPFAR MER indicators were misassigned to the output row instead of the outcome row; one AI-suggested custom indicator had a vague definition ("community acceptance of HIV testing"); and the MoV for one outcome indicator assumed a quarterly population survey that the budget could not support. Revision took 40 minutes. The final logframe had 14 indicators (9 PEPFAR MER standard, 5 custom), passed internal review with one minor edit, and was submitted on time.
Education: Girls' education program, South Asia
An education proposal required a logframe aligned to FCDO's structure plus INEE standards. The MEL lead fed ChatGPT a theory of change, the FCDO template, and a program description focused on mentoring plus community engagement. The first draft had a structurally sound four-by-four matrix but three generic indicators ("community attitudes", "parental support") without operational definitions. Validation replaced those with three custom indicators operationalizing specific behaviors (mothers discussing education with daughters in last month; community meetings including girls' education as agenda topic; fathers attending school events). Total drafting + validation time: 90 minutes. Submitted draft passed donor review on the first round.
WASH: Community water sustainability proposal, West Africa
A WASH proposal drafted its logframe using Gemini with inputs of a theory of change, JMP ladder definitions, the program description, and the donor's logframe template. First draft included JMP-aligned outcome indicators appropriately. Validation caught two AI-invented specifics: a baseline value of "35% safely managed water access" that had no source (actual baseline was to be collected in program year 1, and the AI had filled a plausible but fictional value), and a reference to a country-level national household survey that did not exist at the implied frequency. Both specifics were replaced with placeholder notes. Total time: 75 minutes.
Food security: Pastoralist livelihoods proposal, Sahel
A food security proposal needed a logframe with seasonal disaggregation (transhumance vs settled) built into outcome indicators. The MEL team provided Claude with a theory of change, HFIAS and FCS indicator definitions, and a program description specifying seasonal data collection. First draft handled structure well but missed the seasonal disaggregation in most indicators. Validation added disaggregation to 6 of 10 indicators and replaced 2 AI-suggested indicators that did not match pastoralist context. Total: 85 minutes.
Common Mistakes
Mistake 1: Skipping theory of change work and feeding AI a thin program description. The AI will produce a plausible-looking generic logframe. Review will surface the gaps. Theory of change is the single most important input; write it before drafting.
Mistake 2: Accepting AI-invented baseline values and targets. AI fills numerical gaps with plausible-sounding numbers that have no source. Every baseline, target, and specific statistic needs to be from your data or clearly marked as placeholder.
Mistake 3: Shipping AI output without SMART validation. AI suggests many indicators that fail SMART criteria. Every indicator needs five minutes of validation before acceptance.
Mistake 4: Accepting vague means of verification. AI often produces "survey" or "monitoring data" as MoV. These need to be replaced with specific sources, methods, frequencies, and responsible parties before the logframe is usable.
Mistake 5: Not testing vertical logic. AI output can look coherent at the surface but have broken vertical logic. Walk the matrix bottom-up to catch logic failures.
Mistake 6: Using AI to generate the theory of change as well as the logframe. AI produces bland theory-of-change drafts that look reasonable but do not reflect your program's specific logic. The theory of change should be human-written; the logframe can be AI-drafted.
Mistake 7: Letting AI invent donor compliance requirements. If the donor has a specific logframe template, PIRS format, or mandatory indicator list, provide them to the AI. Do not let AI guess at donor conventions.
Mistake 8: Treating AI output as finished rather than a first draft. Two hours of careful validation produces a defensible logframe. Submitting AI-drafted logframes directly is the most common failure mode.
AI Logframe Review Checklist
Run through this before treating an AI-drafted logframe as final.
Context package:
- Finalized theory of change provided to AI
- Donor template structure provided
- Program description specific to this program provided
- Indicator library constraints provided (if using standard indicators)
Prompt quality:
- AI instructed not to invent baseline values or specific statistics
- AI instructed to write testable assumptions
- AI instructed to specify means of verification, not generic "survey"
Output validation:
- Vertical logic tested bottom-up (each row plausibly produces the row above)
- Every indicator passed SMART validation (see SMART indicators deep-dive)
- Outputs and outcomes not confused
- Every means of verification is feasible against the actual budget
Compliance and accuracy:
- Donor-required indicators included unmodified
- No fabricated statistics or baseline values
- Placeholders clearly marked where data will be collected later
- Logframe aligns with the theory of change and the narrative proposal
Integration:
- Indicators cross-referenced against Indicator Library where applicable
- Validated through SMART Indicator Checker
- Reviewed against the how to write a logframe checklist
For the full proposal workflow, see how to write the M&E proposal section. For the accompanying theory of change, see how to write a theory of change. For the MEL plan this logframe operationalizes, see how to write a MEL plan. For an AI-assisted step-by-step workflow, see the Theory of Change playbook.