Scoring Criteria
| Dimension | Score | Descriptor |
|---|---|---|
| Clarity | 5 | All elements present. Every question is phrased in a single readable sentence. No nested clauses that require rereading. No undefined acronyms or technical jargon. A respondent could answer on first read without paraphrasing. |
| Clarity | 4 | All questions readable on first pass. No more than two questions contain mild structural complexity (one nested clause or one term that might require a brief gloss). |
| Clarity | 3 | Half or more questions are clearly worded; the remainder have isolated clarity issues (one ambiguous phrase, one nested clause, or one undefined term). Issues are not systematic. |
| Clarity | 2 | More than 20 percent of questions require rereading or contain undefined jargon, nested clauses, or syntactic ambiguity. |
| Clarity | 1 | Absent or inadequate. Questions are systematically unclear, with widespread jargon, convoluted phrasing, or sentences a typical respondent would need to parse multiple times. |
| Neutrality | 5 | All elements present. No question uses leading framing, loaded terms, or social-desirability cues. Closed options do not signal a preferred response. |
| Neutrality | 4 | All questions broadly neutral. No more than two questions contain mild leading framing or a single loaded modifier unlikely to materially shift responses. |
| Neutrality | 3 | Half or more questions are neutral; the remainder contain isolated leading wording, value-laden adjectives, or response options weighted toward one end. Issues are not systematic. |
| Neutrality | 2 | More than 20 percent of questions contain leading framing, loaded terms, or social-desirability cues that would predictably bias responses. |
| Neutrality | 1 | Absent or inadequate. Leading or loaded wording is systematic across the set, including in core measurement items. |
| Single concept | 5 | All elements present. Every question asks about one concept. No "and" or "or" linking two attributes. No bundling of distinct time periods or subjects. |
| Single concept | 4 | All questions single-concept. No more than two items combine closely related elements where respondents would plausibly answer consistently. |
| Single concept | 3 | Half or more questions are single-concept; the remainder bundle two related ideas. Issues are not systematic and could be split with minor rewording. |
| Single concept | 2 | More than 20 percent of questions are double-barreled, asking about two or more concepts that respondents could plausibly answer differently. |
| Single concept | 1 | Absent or inadequate. Double-barreled questions are widespread, including in core measurement items. |
| Specificity | 5 | All elements present. Every question with a temporal component states a reference period. Every numeric question states units. The subject is explicit. |
| Specificity | 4 | All questions specify reference periods, units, and subjects, except for no more than two items where one of these is implied rather than stated. |
| Specificity | 3 | Half or more questions are fully specific; the remainder leave one element (reference period, unit, or subject) implicit. Issues are not systematic. |
| Specificity | 2 | More than 20 percent of questions are missing a reference period, unit, or explicit subject, creating predictable response variance. |
| Specificity | 1 | Absent or inadequate. Specificity gaps are systematic; respondents are routinely left to guess the time frame, unit, or subject. |
| Population fit | 5 | All elements present. Vocabulary matches the literacy level documented for the target population. No idioms that do not translate. No assumptions about household structure, occupation, or technology access that do not hold. |
| Population fit | 4 | Vocabulary appropriate. No more than two items use a term or idiom that might be unfamiliar to a minority of the target population. |
| Population fit | 3 | Half or more questions fit the target population; the remainder include isolated terms, idioms, or assumptions that would not land for some respondents. Issues are not systematic. |
| Population fit | 2 | More than 20 percent of questions use vocabulary above the documented reading level, untranslatable idioms, or assumptions inconsistent with the target population. |
| Population fit | 1 | Absent or inadequate. Vocabulary and cultural assumptions are systematically misaligned with the target population. |
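
Several descriptors above turn on two quantities: how many questions are flagged on a dimension ("no more than two") and what share of the set that is ("more than 20 percent", "half or more"). The following is a small Python sketch for tallying both, assuming a reviewer has already marked each question as flagged or not. The function name `flag_summary`, its input format, and its return keys are hypothetical and chosen for illustration. It reports the numbers only and does not assign a score, since several levels also depend on judgments such as whether issues are systematic.

```python
def flag_summary(flags):
    """Summarize reviewer flags for one dimension.

    flags: list of booleans, one per question, True if the question
    has an issue on this dimension (hypothetical input format).
    """
    if not flags:
        raise ValueError("need at least one question")
    flagged = sum(1 for f in flags if f)
    share = flagged / len(flags)
    return {
        "flagged": flagged,
        "share": share,
        "at_most_two": flagged <= 2,          # "no more than two" wording
        "over_20_percent": share > 0.20,      # "more than 20 percent" wording
        "half_or_more_clean": share <= 0.50,  # "half or more questions are ..." wording
    }


# Example: 3 of 12 questions flagged on one dimension.
summary = flag_summary([False] * 9 + [True] * 3)
print(summary["flagged"], summary["share"], summary["over_20_percent"])
```

The reviewer still chooses the level; the tally only makes the count and percentage explicit so the threshold wording can be applied consistently across dimensions.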
Score Interpretation
| Total (out of 25) | Band | Next Step |
|---|---|---|
| 22-25 | Strong | Questions are ready for the next layer of review (structure, ethics, indicator alignment). |
| 17-21 | Adequate | Address flagged dimensions. Cognitive-test the revised items with 3-5 target respondents. |
| 11-16 | Needs Revision | Substantial wording revision required. Use the Revise prompt and treat its output as your revision brief. |
| 5-10 | Substantial Revision | Rewrite from indicators, not from the current draft. Cognitive-test before fielding. |
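
The band lookup in the table reduces to a sum and a range check. Below is a minimal Python sketch, assuming the total is the sum of five dimension scores of 1 to 5 each (which matches the 5 to 25 range the bands cover). The names `BANDS` and `interpret_total` are hypothetical and used only for illustration.

```python
# Bands copied from the Score Interpretation table: (lowest total, highest total, band).
BANDS = [
    (22, 25, "Strong"),
    (17, 21, "Adequate"),
    (11, 16, "Needs Revision"),
    (5, 10, "Substantial Revision"),
]


def interpret_total(scores):
    """Sum per-dimension scores and return (total, band).

    Assumes five dimensions, each scored as an integer from 1 to 5.
    """
    if len(scores) != 5:
        raise ValueError("expected exactly five dimension scores")
    if any(not isinstance(s, int) or s < 1 or s > 5 for s in scores):
        raise ValueError("each score must be an integer from 1 to 5")
    total = sum(scores)
    for low, high, band in BANDS:
        if low <= total <= high:
            return total, band
    # Not reachable when the checks above pass, since 5..25 is fully covered.
    raise ValueError(f"total {total} falls outside the table")


print(interpret_total([5, 4, 4, 3, 4]))  # prints (20, 'Adequate')
```

Keeping the bands in a single list makes it easy to verify them against the table and to spot gaps or overlaps between ranges.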