Survey Translation Quality

AI Prompt Templates

Copy a prompt into Claude, ChatGPT, or Gemini. Paste your document at the bottom and run it.

Paste a document to get a scored quality assessment, with evidence and revision priorities.

You are an expert monitoring and evaluation (M&E) methodologist specializing in cross-cultural surveys. Score the translated survey instrument I will provide using the rubric below. Where a source version is provided, compare source and translation side by side. Where only the translation is provided, assess what can be assessed and explicitly flag dimensions that require the source.

SCORING RUBRIC - Survey Translation Quality
Score each dimension 1-5 using these criteria:

DIMENSION 1: Semantic Fidelity
- Score 5: All elements present. Every question preserves the intended meaning of the source. Reference periods, units, qualifiers ("usually", "ever", "in the last 30 days"), and response options are translated with no drift in scope or specificity.
- Score 4: Fidelity high. No more than two questions contain a minor semantic shift (a softened qualifier, a slightly broader unit) that would not materially change responses.
- Score 3: Half or more questions preserve meaning; the remainder contain isolated semantic shifts (a changed qualifier, a missing reference period, a reworded response option). Issues are not systematic.
- Score 2: More than 20 percent of questions show semantic drift, including changed reference periods, dropped qualifiers, or response options that no longer match the source scale.
- Score 1: Absent or inadequate. Semantic drift is systematic; the translated instrument measures a different construct than the source.

DIMENSION 2: Cultural Adaptation
- Score 5: All elements present. Concepts that would not translate directly (idioms, culturally specific examples, household structures, occupational categories) have been adapted rather than translated literally. Adaptations preserve the underlying construct while fitting the target culture.
- Score 4: Adaptation evident. No more than two items use a literal translation where a light adaptation would have been clearer, but none materially distort the construct.
- Score 3: Half or more concepts adapted appropriately; the remainder are translated literally where adaptation was needed (an idiom rendered word-for-word, a culturally specific example retained without local equivalent). Issues are isolated.
- Score 2: More than 20 percent of culturally specific items are translated literally in ways that would confuse respondents or distort the construct.
- Score 1: Absent or inadequate. The translation is word-for-word throughout, with no cultural adaptation, producing a translated text that does not function as a survey in the target culture.

DIMENSION 3: Back-Translation Evidence
- Score 5: All elements present. An independent translator (not the forward translator) produced a back-translation. The back-translation has been compared item by item with the source. Discrepancies have been reconciled, and the reconciliation is documented (which items changed, why, and who approved the change).
- Score 4: Back-translation done by an independent translator and compared with the source. Reconciliation occurred but documentation is partial (changes noted without reasoning, or reasoning without approver).
- Score 3: Back-translation done but by the same translator or without an explicit reconciliation step. Some discrepancies surfaced but resolution is not documented.
- Score 2: Back-translation mentioned but not evidenced, OR done informally without comparison to the source.
- Score 1: Absent or inadequate. No back-translation. The translation has not been checked against the source.

DIMENSION 4: Response-Option Consistency
- Score 5: All elements present. Every response scale preserves order, anchor labels, and the intended psychometric distance between options. Frequency scales ("never", "rarely", "sometimes", "often", "always") map to comparable frequencies in the target language. Numeric scales are preserved.
- Score 4: All scales preserved. No more than two items have a minor anchor-label drift (e.g., "very satisfied" rendered as "extremely satisfied") unlikely to shift the response distribution materially.
- Score 3: Half or more scales preserved; the remainder have isolated issues: a single anchor label that does not match the source psychometric distance, or a reordered option set. Issues are not systematic.
- Score 2: More than 20 percent of response scales have drift in order, anchor labels, or implied distance between options, changing the scale's measurement properties.
- Score 1: Absent or inadequate. Response scales are systematically inconsistent with the source, including reordered options, changed scale lengths, or anchor labels that do not correspond.

DIMENSION 5: Field-Testing Evidence
- Score 5: All elements present. The translated instrument was pre-tested with at least five native speakers from the target population. Cognitive testing or pilot interviews surfaced wording issues. Resulting revisions are documented (which items changed and why).
- Score 4: Pre-testing documented with native speakers. Pilot sample may not match target population exactly. At least one revision documented.
- Score 3: Pre-testing conducted and mentioned, but documentation is partial: native-speaker review is referenced without revision notes, OR revisions are listed without describing the testing.
- Score 2: Pre-testing mentioned but not documented. No evidence of native-speaker review of the translation.
- Score 1: Absent or inadequate. The translated instrument went directly from translation to field with no native-speaker review.

OUTPUT FORMAT:
Return your assessment as a table followed by a summary.

| Dimension | Score (1-5) | Evidence | Priority Revision |
|-----------|-------------|----------|-------------------|
| Semantic Fidelity | | | |
| Cultural Adaptation | | | |
| Back-Translation Evidence | | | |
| Response-Option Consistency | | | |
| Field-Testing Evidence | | | |

**Total: X/25**
**Band:** Strong (22-25) / Adequate (17-21) / Needs Revision (11-16) / Substantial Revision (5-10)
**Single Most Important Revision:** [One specific sentence]

For any dimension scored 1 or 2, list specific items with the source text, the translation, and a corrected version where applicable.

TRANSLATED INSTRUMENT TO SCORE (paired with source version where available):
[Paste your translated survey instrument here]
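
Once a model has returned the table requested in the output format, the scores can be checked before anyone relies on the stated total. The following is a minimal Python sketch, not part of the prompt itself. It assumes the reply keeps the markdown table layout and the five dimension names exactly as written in the rubric; the function name `parse_scores` is hypothetical.

```python
import re

# Dimension names exactly as they appear in the rubric's output table.
DIMENSIONS = [
    "Semantic Fidelity",
    "Cultural Adaptation",
    "Back-Translation Evidence",
    "Response-Option Consistency",
    "Field-Testing Evidence",
]


def parse_scores(reply: str) -> dict[str, int]:
    """Extract the 1-5 score for each rubric dimension from a markdown table.

    Assumption: each dimension sits on its own table row, with the score
    in the second column, as the output format requests.
    """
    scores = {}
    for line in reply.splitlines():
        # Split a row like "| Semantic Fidelity | 4 | ... | ... |" into cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) < 2 or cells[0] not in DIMENSIONS:
            continue  # header, separator, or non-table text
        match = re.match(r"[1-5]\b", cells[1])
        if match is None:
            raise ValueError(f"No valid 1-5 score for {cells[0]!r}: {cells[1]!r}")
        scores[cells[0]] = int(match.group())
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"Missing dimensions: {missing}")
    return scores
```

Summing the returned values and comparing the result with the "Total: X/25" line is a quick way to catch arithmetic slips in a model's reply.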

Scoring Criteria

Semantic Fidelity
5 - Excellent

All elements present. Every question preserves the intended meaning of the source. Reference periods, units, qualifiers, and response options are translated with no drift in scope or specificity.

4 - Good

Fidelity high. No more than two questions contain a minor semantic shift that would not materially change responses.

3 - Adequate

Half or more questions preserve meaning; the remainder contain isolated semantic shifts (a changed qualifier, a missing reference period, a reworded response option). Issues are not systematic.

2 - Needs Improvement

More than 20 percent of questions show semantic drift, including changed reference periods, dropped qualifiers, or response options that no longer match the source scale.

1 - Inadequate

Absent or inadequate. Semantic drift is systematic; the translated instrument measures a different construct than the source.

Cultural Adaptation
5 - Excellent

All elements present. Concepts that would not translate directly have been adapted rather than translated literally. Adaptations preserve the underlying construct while fitting the target culture.

4 - Good

Adaptation evident. No more than two items use a literal translation where a light adaptation would have been clearer, but none materially distort the construct.

3 - Adequate

Half or more concepts adapted appropriately; the remainder are translated literally where adaptation was needed. Issues are isolated.

2 - Needs Improvement

More than 20 percent of culturally specific items are translated literally in ways that would confuse respondents or distort the construct.

1 - Inadequate

Absent or inadequate. The translation is word-for-word throughout, with no cultural adaptation, producing a translated text that does not function as a survey in the target culture.

Back-Translation Evidence
5 - Excellent

All elements present. An independent translator produced a back-translation. The back-translation has been compared item by item with the source. Discrepancies have been reconciled, and the reconciliation is documented.

4 - Good

Back-translation done by an independent translator and compared with the source. Reconciliation occurred but documentation is partial.

3 - Adequate

Back-translation done but by the same translator or without an explicit reconciliation step. Some discrepancies surfaced but resolution is not documented.

2 - Needs Improvement

Back-translation mentioned but not evidenced, OR done informally without comparison to the source.

1 - Inadequate

Absent or inadequate. No back-translation. The translation has not been checked against the source.

Response-Option Consistency
5 - Excellent

All elements present. Every response scale preserves order, anchor labels, and the intended psychometric distance between options. Frequency scales map to comparable frequencies. Numeric scales are preserved.

4 - Good

All scales preserved. No more than two items have a minor anchor-label drift unlikely to shift the response distribution materially.

3 - Adequate

Half or more scales preserved; the remainder have isolated issues: a single anchor label that does not match the source psychometric distance, or a reordered option set. Issues are not systematic.

2 - Needs Improvement

More than 20 percent of response scales have drift in order, anchor labels, or implied distance between options, changing the scale's measurement properties.

1 - Inadequate

Absent or inadequate. Response scales are systematically inconsistent with the source.

Field-Testing Evidence
5 - Excellent

All elements present. The translated instrument was pre-tested with at least five native speakers from the target population. Cognitive testing or pilot interviews surfaced wording issues. Resulting revisions are documented.

4 - Good

Pre-testing documented with native speakers. Pilot sample may not match target population exactly. At least one revision documented.

3 - Adequate

Pre-testing conducted and mentioned, but documentation is partial: native-speaker review referenced without revision notes, OR revisions listed without describing the testing.

2 - Needs Improvement

Pre-testing mentioned but not documented. No evidence of native-speaker review of the translation.

1 - Inadequate

Absent or inadequate. The translated instrument went directly from translation to field with no native-speaker review.

Score Interpretation

| Total (out of 25) | Band | Next Step |
|-------------------|------|-----------|
| 22-25 | Strong | Translation is ready for field deployment with native-speaker briefing. |
| 17-21 | Adequate | Address flagged dimensions. Add or extend back-translation and pre-testing where missing. |
| 11-16 | Needs Revision | Substantial revision required. Use the Revise prompt and treat its output as your revision brief. Plan independent back-translation. |
| 5-10 | Substantial Revision | Restart with a forward-then-back-translation workflow. Pre-test with native speakers before fielding. |
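
The mapping from five dimension scores to a band is simple arithmetic, and a short sketch makes the cut-offs explicit. The Python below is illustrative only: the band ranges are copied from the score interpretation above, while the names `BANDS` and `band_for` are hypothetical and not part of the rubric.

```python
# (lowest total, highest total, band name), copied from the interpretation bands.
BANDS = [
    (22, 25, "Strong"),
    (17, 21, "Adequate"),
    (11, 16, "Needs Revision"),
    (5, 10, "Substantial Revision"),
]


def band_for(scores: list[int]) -> tuple[int, str]:
    """Return (total, band) for exactly five dimension scores, each 1-5."""
    if len(scores) != 5 or any(not 1 <= s <= 5 for s in scores):
        raise ValueError("Expected five scores, each between 1 and 5")
    total = sum(scores)
    for low, high, name in BANDS:
        if low <= total <= high:
            return total, name
    # Five scores of 1-5 always total 5-25, so every total falls in a band.
    raise AssertionError("total outside 5-25")


print(band_for([4, 3, 1, 5, 2]))  # (15, 'Needs Revision')
```

Because the lowest possible total is 5 (every dimension scored 1), the bands cover every reachable total with no gaps.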