Design a Baseline Survey with AI
A 5-step prompt workflow that takes you from indicators to a field-ready survey instrument with sampling frame, skip logic, and pilot protocol.
What you'll build
A field-ready survey instrument with indicator-linked questions, skip logic, sampling frame, and pilot testing protocol.
Before you start
- Your indicator matrix or logframe with indicators defined
- Target population description (who, where, how many)
- Data collection constraints (budget, access, timeline)
- Donor methodology requirements if applicable (e.g., BHA requires Food Consumption Score)
Start by mapping each indicator to the specific survey questions needed to measure it. This prevents the common mistake of writing questions first and trying to fit indicators afterward.
You are a senior M&E survey specialist. I need to design a baseline survey. Start by mapping my indicators to survey questions. For each indicator I provide, generate:
- 1-3 survey questions that directly measure this indicator
- Response format for each question (multiple choice, Likert scale, numeric, open-ended)
- Which validated instrument or standard methodology the question draws from (if applicable)
- Any skip logic dependencies (e.g., "only ask if respondent answered X to question Y")

Present as a table with columns: Indicator, Question Text, Response Format, Source/Methodology, Skip Logic.

Keep the total number of questions under 40 for household surveys, under 25 for individual surveys. Flag when I am approaching the limit.

My indicators are: [List your indicators here, or paste your indicator matrix]
Every question must trace back to an indicator. If a question does not measure an indicator, it does not belong in the survey. Cut it.
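If you export the mapping table to a spreadsheet, this traceability rule is easy to enforce mechanically. A minimal sketch, assuming a CSV export whose column names match the table spec in the prompt above; the file name is a placeholder:

```python
import csv

# Placeholder path: export the AI-generated mapping table to CSV first.
MAPPING_FILE = "indicator_question_map.csv"

with open(MAPPING_FILE, newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

# Every question must trace back to an indicator; flag orphans for cutting.
for row in rows:
    if not row["Indicator"].strip():
        print(f"ORPHAN (cut or re-link): {row['Question Text']}")

# Enforce the length budget from the prompt (under 40 for household surveys).
if len(rows) >= 40:
    print(f"WARNING: {len(rows)} questions is over the household survey budget.")
```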
Refine the draft questions for clarity, cultural appropriateness, and respondent burden. Poor question wording is the most common source of bad survey data.
Review all the survey questions from the previous step and improve them. For each question:
1. Simplify the language to a primary education reading level
2. Remove double-barreled questions (asking two things at once)
3. Remove leading questions (questions that suggest the "right" answer)
4. Ensure response options are mutually exclusive and collectively exhaustive
5. Add "Don't know" and "Refuse to answer" options where appropriate
6. Flag questions that may be culturally sensitive and suggest alternative phrasings

Then organize the questions into logical sections (demographics, then topic sections ordered from least to most sensitive). Add section headers and transition statements.

Output the full revised questionnaire with section headers, question numbers, question text, and response options.
Read every question aloud. Enumerators will read each question to respondents in the field, so anything that sounds awkward spoken will cause confusion.
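You can also rough-check the reading-level target from item 1 of the prompt in code. A minimal sketch using the standard Flesch-Kincaid grade formula with a crude vowel-group syllable counter; it is English-only and approximate, and the draft questions below are invented for illustration:

```python
import re

def syllables(word: str) -> int:
    # Crude vowel-group count; good enough for a rough flag, not a real score.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    # Flesch-Kincaid grade: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    words = re.findall(r"[A-Za-z']+", text)
    sentences = max(1, len(re.findall(r"[.!?]", text)))
    syl = sum(syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syl / len(words) - 15.59

# Hypothetical draft questions to screen against a primary-level target.
drafts = [
    "In the past seven days, did anyone in your household skip a meal?",
    "To what extent do you concur that the intervention ameliorated household resilience?",
]
for q in drafts:
    grade = fk_grade(q)
    flag = "  <-- simplify" if grade > 6 else ""
    print(f"grade {grade:4.1f}: {q}{flag}")
```

Treat the output as a prompt to re-read the flagged question, not as a verdict; translated questionnaires need a human check instead.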
Define the conditional logic that routes respondents through the survey based on their answers. This keeps the survey short for each respondent while collecting complete data.
For the survey instrument, define all skip logic and routing rules. Present as a routing table with columns:
- Trigger question (question number and response that activates the skip)
- Action (skip to question X, end section, or end survey)
- Affected questions (which questions are skipped)
- Reason (why this skip exists)

Then create a visual flow summary showing the main routing paths through the survey. How many questions does each respondent type answer?

Also flag any logic conflicts (e.g., circular skips, unreachable questions, questions that depend on skipped questions).
Test the skip logic by walking through the survey as 3 different respondent types. If any path leads to a dead end or a nonsensical question, the logic has a bug.
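Two of the conflicts named in the prompt (circular skips and unreachable questions) can also be caught mechanically. A minimal sketch, with invented question IDs and skip rules standing in for your real routing table:

```python
from itertools import product

# Ordered question IDs; "END" terminates the survey.
QUESTIONS = ["Q1", "Q2", "Q3", "Q4", "Q5", "Q6", "END"]
# Hypothetical skip rules: (trigger question, answer) -> jump target.
SKIPS = {("Q2", "No"): "Q5", ("Q4", "No"): "END"}

idx = {q: i for i, q in enumerate(QUESTIONS)}

# 1. Every skip must jump forward; a backward jump can trap respondents in a loop.
circular = [(s, a, d) for (s, a), d in SKIPS.items() if idx[d] <= idx[s]]
for src, ans, dst in circular:
    print(f"CIRCULAR: {src}={ans!r} jumps backward to {dst}")

# 2. If the logic is loop-free, walk every combination of skip decisions and
#    record which questions at least one respondent path actually reaches.
if not circular:
    rules = list(SKIPS.items())
    reached = set()
    for taken in product([False, True], repeat=len(rules)):
        pos = 0
        while QUESTIONS[pos] != "END":
            q = QUESTIONS[pos]
            reached.add(q)
            jumps = [dst for ((src, _), dst), t in zip(rules, taken) if t and src == q]
            pos = idx[jumps[0]] if jumps else pos + 1
    for q in QUESTIONS[:-1]:
        if q not in reached:
            print(f"UNREACHABLE: {q} is skipped on every path")
```

Walking every combination of skip decisions is the code equivalent of the respondent-type walkthrough above, just exhaustive.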
Define who you will survey, how many, and how you will select them. The sampling strategy determines whether your findings can be generalized.
Design a sampling strategy for this baseline survey. Produce:
1. **Target population**: Who is eligible to be surveyed? Define inclusion and exclusion criteria precisely.
2. **Sample size calculation**: Recommend a sample size with assumptions stated (confidence level, margin of error, design effect if clustered, expected response rate). Show the calculation.
3. **Sampling method**: Recommend the most appropriate method (simple random, stratified, cluster, multi-stage) and explain why. If cluster sampling, state the assumed design effect and ICC.
4. **Sampling procedure**: Step-by-step instructions for how enumerators will select respondents in the field. Be specific enough that someone could follow these instructions without additional guidance.
5. **Stratification**: If stratified, define the strata and the allocation per stratum.
6. **Replacement protocol**: What to do when a selected respondent is unavailable, refuses, or is ineligible.

Present sample size as a table showing the calculation with all assumptions visible.
Always verify the sample size calculation independently using a sampling calculator or the Cochran formula directly. AI models sometimes make arithmetic errors. Also: if you need to disaggregate by subgroup, your sample size must be large enough for each subgroup separately.
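The independent check is short enough to run yourself. A minimal sketch of Cochran's formula with the standard adjustments; the confidence level, design effect, response rate, and population size below are illustrative assumptions, not recommendations:

```python
import math

# Assumptions (illustrative -- replace with your survey's own parameters):
z = 1.96             # z-score for a 95% confidence level
p = 0.5              # expected proportion; 0.5 is the most conservative choice
e = 0.05             # margin of error (+/- 5 percentage points)
deff = 1.5           # design effect for cluster sampling (1.0 if simple random)
response_rate = 0.90 # expected share of selected respondents who complete
N = 8000             # population size, for the finite population correction

# Cochran's formula for an infinite population.
n0 = (z**2) * p * (1 - p) / e**2   # = 384.16 with the values above

# Finite population correction (optional; matters for small populations).
n_fpc = n0 / (1 + (n0 - 1) / N)

# Inflate for clustering and non-response.
n_final = math.ceil(n_fpc * deff / response_rate)

print(f"Base Cochran n: {n0:.1f}")
print(f"After FPC (N={N}): {n_fpc:.1f}")
print(f"Final target sample (DEFF={deff}, RR={response_rate:.0%}): {n_final}")
```

If you need subgroup disaggregation, run the same calculation per subgroup and sum the results rather than relying on the overall n.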
Plan how to test the survey before full deployment. Piloting catches problems that desk review cannot: confusing questions, incorrect skip logic, timing issues, and translation errors.
Create a pilot testing protocol for this survey. Include:
1. **Pilot sample**: How many respondents, from where, and how selected. The pilot sample should NOT be drawn from the final survey population if possible.
2. **What to test**: A checklist of specific things to evaluate during the pilot:
   - Average completion time
   - Questions that cause confusion (respondents ask for clarification)
   - Questions with high "Don't know" or refusal rates
   - Skip logic errors
   - Translation accuracy (if applicable)
   - Sensitive questions that cause discomfort
   - Response option gaps (respondents give answers not listed)
3. **Debrief protocol**: Questions to ask enumerators after the pilot about what worked and what did not.
4. **Revision criteria**: What threshold of problems triggers a question revision vs. removal? (e.g., "if >20% of respondents misunderstand a question, revise it; if >40%, remove it")
5. **Timeline**: How long the pilot takes and how long revisions take before full deployment.
Never skip the pilot. The cost of piloting (2-3 days, 20-30 respondents) is trivial compared to the cost of collecting bad data from your entire sample.
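Once pilot tallies are in, the revision thresholds from the protocol above can be applied mechanically. A minimal sketch with made-up pilot counts:

```python
# Hypothetical pilot tallies: question -> (respondents who misunderstood, total asked).
PILOT = {
    "Q3": (2, 25),
    "Q7": (6, 25),
    "Q12": (11, 25),
}

# Thresholds from the revision criteria in the protocol (>20% revise, >40% remove).
REVISE_AT, REMOVE_AT = 0.20, 0.40

for q, (misread, asked) in PILOT.items():
    rate = misread / asked
    if rate > REMOVE_AT:
        print(f"{q}: {rate:.0%} misunderstood -> REMOVE")
    elif rate > REVISE_AT:
        print(f"{q}: {rate:.0%} misunderstood -> REVISE")
    else:
        print(f"{q}: {rate:.0%} misunderstood -> keep")
```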
Use MEStudio's scoring rubric to check the quality of what you just built. Send this prompt in the same conversation to get a scored assessment with specific revision suggestions.
Open the scoring rubric

If any dimension scores below 4, go back to the relevant step and ask the AI to strengthen that section. The rubric tells you exactly what to fix.
Not sure which AI tool to use?
Try the AI Tool Selector to find the best tool for your specific M&E task, or browse 130+ M&E-specific prompts.