Survey Question Wording

AI Prompt Templates

Copy a prompt into Claude, ChatGPT, or Gemini. Paste your document at the end and run it.

Paste a document to get a quality assessment with scores, evidence, and revision priorities.

You are an expert monitoring and evaluation (M&E) survey methodologist. Score the set of survey questions I will provide using the rubric below. Diagnose each question individually, then aggregate your findings into dimension scores.

SCORING RUBRIC - Survey Question Wording
Score each dimension 1-5 using these criteria:

DIMENSION 1: Clarity
- Score 5: All elements present. Every question is phrased in a single readable sentence. No nested clauses that require rereading. No undefined acronyms or technical jargon. A respondent could answer on first read without paraphrasing.
- Score 4: All questions readable on first pass. No more than two questions contain mild structural complexity (one nested clause or one term that might require a brief gloss).
- Score 3: Half or more questions are clearly worded; the remainder have isolated clarity issues (one ambiguous phrase, one nested clause, or one undefined term). Issues are not systematic.
- Score 2: More than 20 percent of questions require rereading or contain undefined jargon, nested clauses, or syntactic ambiguity.
- Score 1: Absent or inadequate. Questions are systematically unclear, with widespread jargon, convoluted phrasing, or sentences a typical respondent would need to parse multiple times.

DIMENSION 2: Neutrality
- Score 5: All elements present. No question uses leading framing ("Don't you agree that..."), loaded terms ("wasteful", "essential"), or social-desirability cues ("As a responsible citizen, how often..."). Closed options do not signal a preferred response.
- Score 4: All questions broadly neutral. No more than two questions contain mild leading framing or a single loaded modifier unlikely to materially shift responses.
- Score 3: Half or more questions are neutral; the remainder contain isolated leading wording, value-laden adjectives, or response options weighted toward one end. Issues are not systematic.
- Score 2: More than 20 percent of questions contain leading framing, loaded terms, or social-desirability cues that would predictably bias responses.
- Score 1: Absent or inadequate. Leading or loaded wording is systematic across the set, including in core measurement items.

DIMENSION 3: Single Concept
- Score 5: All elements present. Every question asks about one concept. No "and" or "or" linking two attributes ("Are services timely AND respectful?"). No bundling of distinct time periods or subjects.
- Score 4: All questions single-concept. No more than two items combine closely related elements (e.g., "satisfied with quality and timeliness") where respondents would plausibly answer consistently.
- Score 3: Half or more questions are single-concept; the remainder bundle two related ideas. Issues are not systematic and could be split with minor rewording.
- Score 2: More than 20 percent of questions are double-barreled, asking about two or more concepts that respondents could plausibly answer differently.
- Score 1: Absent or inadequate. Double-barreled questions are widespread, including in core measurement items.

DIMENSION 4: Specificity
- Score 5: All elements present. Every question with a temporal component states a reference period ("in the last 30 days"). Every numeric question states units ("kilograms", "USD per month"). The subject (self, household, child under 5) is explicit.
- Score 4: All questions specify reference periods, units, and subjects with no more than two items where one of these is implied rather than stated.
- Score 3: Half or more questions are fully specific; the remainder leave one element (reference period, unit, or subject) implicit. Issues are not systematic.
- Score 2: More than 20 percent of questions are missing a reference period, unit, or explicit subject, creating predictable response variance.
- Score 1: Absent or inadequate. Specificity gaps are systematic; respondents are routinely left to guess the time frame, unit, or subject.

DIMENSION 5: Reading Level and Cultural Fit
- Score 5: All elements present. Vocabulary matches the literacy level documented for the target population. No idioms or metaphors that do not translate to the local context. No assumptions about household structure, occupation, or technology access that do not hold for this population.
- Score 4: Vocabulary appropriate. No more than two items use a term or idiom that might be unfamiliar to a minority of the target population.
- Score 3: Half or more questions fit the target population; the remainder include isolated terms, idioms, or assumptions that would not land for some respondents. Issues are not systematic.
- Score 2: More than 20 percent of questions use vocabulary above the documented reading level, untranslatable idioms, or assumptions inconsistent with the target population.
- Score 1: Absent or inadequate. Vocabulary and cultural assumptions are systematically misaligned with the target population.

OUTPUT FORMAT:
Return your assessment as a table followed by a summary.

| Dimension | Score (1-5) | Evidence | Priority Revision |
|-----------|-------------|----------|-------------------|
| Clarity | | | |
| Neutrality | | | |
| Single Concept | | | |
| Specificity | | | |
| Reading Level and Cultural Fit | | | |

**Total: X/25**
**Band:** Strong (22-25) / Adequate (17-21) / Needs Revision (11-16) / Substantial Revision (5-10)
**Single Most Important Revision:** [One specific sentence]

Then list every flagged question. For each: state the question number, the failing dimension, the problem type, and a corrected version.

SURVEY QUESTIONS TO SCORE:
[Paste your survey questions here]

Scoring Criteria

Clarity

- Score 5 (Excellent): All elements present. Every question is phrased in a single readable sentence. No nested clauses that require rereading. No undefined acronyms or technical jargon. A respondent could answer on first read without paraphrasing.
- Score 4 (Good): All questions readable on first pass. No more than two questions contain mild structural complexity (one nested clause or one term that might require a brief gloss).
- Score 3 (Adequate): Half or more questions are clearly worded; the remainder have isolated clarity issues (one ambiguous phrase, one nested clause, or one undefined term). Issues are not systematic.
- Score 2 (Needs Improvement): More than 20 percent of questions require rereading or contain undefined jargon, nested clauses, or syntactic ambiguity.
- Score 1 (Inadequate): Absent or inadequate. Questions are systematically unclear, with widespread jargon, convoluted phrasing, or sentences a typical respondent would need to parse multiple times.

Neutrality

- Score 5 (Excellent): All elements present. No question uses leading framing, loaded terms, or social-desirability cues. Closed options do not signal a preferred response.
- Score 4 (Good): All questions broadly neutral. No more than two questions contain mild leading framing or a single loaded modifier unlikely to materially shift responses.
- Score 3 (Adequate): Half or more questions are neutral; the remainder contain isolated leading wording, value-laden adjectives, or response options weighted toward one end. Issues are not systematic.
- Score 2 (Needs Improvement): More than 20 percent of questions contain leading framing, loaded terms, or social-desirability cues that would predictably bias responses.
- Score 1 (Inadequate): Absent or inadequate. Leading or loaded wording is systematic across the set, including in core measurement items.

Single Concept

- Score 5 (Excellent): All elements present. Every question asks about one concept. No "and" or "or" linking two attributes. No bundling of distinct time periods or subjects.
- Score 4 (Good): All questions single-concept. No more than two items combine closely related elements where respondents would plausibly answer consistently.
- Score 3 (Adequate): Half or more questions are single-concept; the remainder bundle two related ideas. Issues are not systematic and could be split with minor rewording.
- Score 2 (Needs Improvement): More than 20 percent of questions are double-barreled, asking about two or more concepts that respondents could plausibly answer differently.
- Score 1 (Inadequate): Absent or inadequate. Double-barreled questions are widespread, including in core measurement items.

Specificity

- Score 5 (Excellent): All elements present. Every question with a temporal component states a reference period. Every numeric question states units. The subject is explicit.
- Score 4 (Good): All questions specify reference periods, units, and subjects with no more than two items where one of these is implied rather than stated.
- Score 3 (Adequate): Half or more questions are fully specific; the remainder leave one element (reference period, unit, or subject) implicit. Issues are not systematic.
- Score 2 (Needs Improvement): More than 20 percent of questions are missing a reference period, unit, or explicit subject, creating predictable response variance.
- Score 1 (Inadequate): Absent or inadequate. Specificity gaps are systematic; respondents are routinely left to guess the time frame, unit, or subject.

Reading Level and Cultural Fit

- Score 5 (Excellent): All elements present. Vocabulary matches the literacy level documented for the target population. No idioms that do not translate. No assumptions about household structure, occupation, or technology access that do not hold.
- Score 4 (Good): Vocabulary appropriate. No more than two items use a term or idiom that might be unfamiliar to a minority of the target population.
- Score 3 (Adequate): Half or more questions fit the target population; the remainder include isolated terms, idioms, or assumptions that would not land for some respondents. Issues are not systematic.
- Score 2 (Needs Improvement): More than 20 percent of questions use vocabulary above the documented reading level, untranslatable idioms, or assumptions inconsistent with the target population.
- Score 1 (Inadequate): Absent or inadequate. Vocabulary and cultural assumptions are systematically misaligned with the target population.

Score Interpretation

| Total (out of 25) | Band | Next Step |
|-------------------|------|-----------|
| 22-25 | Strong | Questions are ready for the next layer of review (structure, ethics, indicator alignment). |
| 17-21 | Adequate | Address flagged dimensions. Cognitive-test the revised items with 3-5 target respondents. |
| 11-16 | Needs Revision | Substantial wording revision required. Use the Revise prompt and treat its output as your revision brief. |
| 5-10 | Substantial Revision | Rewrite from indicators, not from the current draft. Cognitive-test before fielding. |
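
The band lookup above is simple arithmetic, and a short script can apply it when many question sets are scored. The following is a minimal sketch, not part of the rubric itself: the names `BANDS`, `total_score`, and `band_for_total` are hypothetical, and the example scores are made up for illustration. Only the band boundaries come from the table.

```python
# Band boundaries copied from the Score Interpretation table (inclusive ranges).
BANDS = [
    (22, 25, "Strong"),
    (17, 21, "Adequate"),
    (11, 16, "Needs Revision"),
    (5, 10, "Substantial Revision"),
]


def total_score(dimension_scores: dict[str, int]) -> int:
    """Sum five dimension scores, rejecting anything outside 1-5."""
    if len(dimension_scores) != 5:
        raise ValueError("expected exactly five dimension scores")
    for name, score in dimension_scores.items():
        if not 1 <= score <= 5:
            raise ValueError(f"{name}: score must be 1-5, got {score}")
    return sum(dimension_scores.values())


def band_for_total(total: int) -> str:
    """Return the band whose inclusive range contains the total."""
    for low, high, band in BANDS:
        if low <= total <= high:
            return band
    raise ValueError(f"total must be between 5 and 25, got {total}")


# Illustrative scores only; they do not come from a real assessment.
example = {
    "Clarity": 4,
    "Neutrality": 3,
    "Single Concept": 5,
    "Specificity": 4,
    "Reading Level and Cultural Fit": 3,
}
total = total_score(example)
print(total, band_for_total(total))  # prints "19 Adequate"
```

Because every dimension has a minimum score of 1, the lowest possible total is 5, which is why the bottom band starts at 5 rather than 0.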

Scoring Dimensions

1. Clarity: Whether each question is phrased so respondents understand it on first read, without rereading, paraphrasing, or guessing at the intended meaning.
2. Neutrality: Whether questions are free of leading language, loaded terms, and social-desirability cues that nudge respondents toward a particular answer.
3. Single Concept: Whether each question asks about exactly one thing rather than bundling two ideas, attributes, or time periods into a single item.
4. Specificity: Whether reference periods, units of measurement, and the subject of the question are unambiguous so two respondents would interpret the question the same way.
5. Reading Level and Cultural Fit: Whether vocabulary, idioms, and concepts are appropriate for the literacy and cultural context of the target respondent population.

Each dimension is scored 1–5. Maximum total score: 25.
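
Since the prompt asks for a fixed table layout and a `Total: X/25` line, the reported total can be checked against the five dimension scores. The sketch below assumes the model reproduces that layout exactly, which is not guaranteed. `parse_scores`, `reported_total_matches`, and the sample response are hypothetical illustrations, not part of the rubric.

```python
import re

DIMENSIONS = [
    "Clarity",
    "Neutrality",
    "Single Concept",
    "Specificity",
    "Reading Level and Cultural Fit",
]


def parse_scores(response_text: str) -> dict[str, int]:
    """Read the score column from the table rows named after each dimension."""
    scores = {}
    for line in response_text.splitlines():
        # Split a pipe-delimited row into trimmed cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) >= 2 and cells[0] in DIMENSIONS:
            if re.fullmatch(r"[1-5]", cells[1]) is None:
                raise ValueError(f"{cells[0]}: score is not an integer 1-5")
            scores[cells[0]] = int(cells[1])
    missing = [name for name in DIMENSIONS if name not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    return scores


def reported_total_matches(response_text: str) -> bool:
    """Compare the 'Total: X/25' line with the sum of the parsed scores."""
    match = re.search(r"Total:\s*(\d+)\s*/\s*25", response_text)
    if match is None:
        raise ValueError("no 'Total: X/25' line found")
    return int(match.group(1)) == sum(parse_scores(response_text).values())


# Made-up response used only to exercise the parser.
sample = """
| Dimension | Score (1-5) | Evidence | Priority Revision |
|-----------|-------------|----------|-------------------|
| Clarity | 4 | Q3 has a nested clause | Split Q3 |
| Neutrality | 3 | Q5 is leading | Reword Q5 |
| Single Concept | 5 | None flagged | None |
| Specificity | 4 | Q7 lacks a reference period | Add period to Q7 |
| Reading Level and Cultural Fit | 3 | Q2 uses an idiom | Replace idiom |

**Total: 19/25**
"""
print(parse_scores(sample))
print(reported_total_matches(sample))
```

A mismatch between the reported total and the summed scores is a cue to rerun the prompt or review the table by hand before acting on the band.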

Prompts that use this rubric

Review Survey Question Wording

Review survey questions for clarity, neutrality, single-concept, specificity, and reading-level fit.
