Observation Checklist Quality

AI Prompt Templates

Copy a prompt into Claude, ChatGPT, or Gemini. Paste your document at the bottom and run it.

Paste a document to get a scored quality assessment, with evidence and revision priorities.

You are an expert M&E data specialist with deep experience designing and reviewing structured observation tools for site visits, classroom observation, service-delivery observation, infrastructure assessment, and similar fieldwork contexts. Score the observation checklist I will provide using the rubric below.

SCORING RUBRIC - Observation Checklist Quality
Score each dimension 1-5 using these criteria:

DIMENSION 1: Observability
- Score 5: All elements present. Every checklist item is directly observable in the time the observer will spend at the site. No items require the observer to infer attitudes, knowledge, intentions, or off-site behaviors. Items asking about a state (e.g., "facility has running water") have an observable verification path. Items asking about a behavior name a window of observation (e.g., "during the lesson observed").
- Score 4: Most elements present. Nearly all items observable; one or two require minor inference or rely on the observer asking a brief verification question.
- Score 3: Observability is uneven. Several items require inference about non-observable states (knowledge, attitudes, intentions). Verification paths are implicit.
- Score 2: Multiple items are not observable in the available time. The observer is forced to guess or to rely on respondent self-report rather than observation.
- Score 1: Absent or inadequate. Items are systematically inferential. The tool is functioning as a survey of the observer's opinions rather than a record of observation.

DIMENSION 2: Inter-Rater Clarity
- Score 5: All elements present. Response criteria are specific enough that two observers at the same site at the same time would assign the same scores. Each rating point is anchored with a definition (e.g., for a 1-3 scale: "1 = no evidence visible", "2 = partial evidence visible", "3 = full evidence visible, with example"). Edge cases are addressed in notes. Scoring units are concrete.
- Score 4: Most elements present. Anchors are clear for nearly every item; one or two items rely on the observer's judgment with thin anchoring.
- Score 3: Anchoring is uneven. Several items use unanchored adjectives ("good", "adequate", "poor") without definitions, leaving room for inter-rater drift.
- Score 2: Most items rely on subjective adjectives or yes/no with no definition of what counts. Two observers would likely score the same site differently across many items.
- Score 1: Absent or inadequate. No anchoring anywhere. Scoring depends entirely on the observer's personal judgment.

DIMENSION 3: Field Usability
- Score 5: All elements present. Length is realistic for the observation slot (an observer can complete the checklist while moving through the site and engaging with the setting). Layout supports rapid scoring (clear sections, response options aligned with items, no need to flip between pages mid-observation). Scoring complexity is appropriate (mostly simple scales rather than nested calculations). Tool is paper-friendly or device-friendly as the field context requires.
- Score 4: Most elements present. Length and layout workable; one element such as scoring complexity or section ordering creates minor friction in the field.
- Score 3: Workable but creates noticeable friction. Length is borderline for the slot, or layout requires unnecessary back-and-forth.
- Score 2: Length, layout, or scoring complexity will significantly disrupt the observation. The observer will miss items or rush the scoring.
- Score 1: Absent or inadequate. Tool is not usable in field conditions as designed. Will produce missing data or post-hoc filling-in.

DIMENSION 4: Coverage
- Score 5: All elements present. The checklist covers the full set of observable behaviors, conditions, or practices relevant to the indicators or research questions it is meant to inform. No required observable element is missing. The checklist does not pad with items irrelevant to the indicators. Mapping from items to indicators or research questions is implicit or stated.
- Score 4: Most elements present. Coverage is nearly complete; one observable element relevant to an indicator is missing or thinly covered.
- Score 3: Coverage is partial. Several required observable elements are missing. Mapping from items to indicators is unclear.
- Score 2: Significant gaps. Multiple indicators have no corresponding observation items, or the checklist is padded with items that do not feed an indicator.
- Score 1: Absent or inadequate. No traceable link between checklist items and the indicators or research questions the tool is meant to serve.

DIMENSION 5: Open Notes Space
- Score 5: All elements present. The checklist provides room for qualitative observations alongside structured scores. Open-notes prompts are placed at appropriate points (per section or per item, depending on tool design). Context fields are included (date, time, observer, site, conditions). A space for unexpected findings or anomalies is included at the end.
- Score 4: Most elements present. Notes space provided in most sections; one element such as unexpected-findings space or context fields is partial.
- Score 3: Notes space is uneven. Open-notes appear in some sections but not where they are most needed. No dedicated space for unexpected findings.
- Score 2: Notes space is minimal or missing for the bulk of the tool. Observer cannot record context, qualitative signal, or anomalies.
- Score 1: Absent or inadequate. No open-notes space anywhere. The tool captures only structured scores, with no way to record context or unexpected observations.

OUTPUT FORMAT:
Return your assessment as a table followed by a summary:

| Dimension | Score (1-5) | Evidence | Priority Revision |
|-----------|-------------|----------|-------------------|
| Observability | | | |
| Inter-Rater Clarity | | | |
| Field Usability | | | |
| Coverage | | | |
| Open Notes Space | | | |

**Total: X/25**
**Band:** Strong (22-25) / Adequate (17-21) / Needs Revision (11-16) / Substantial Revision (5-10)
**Single Most Important Revision:** [One specific sentence]

For any dimension scored 1 or 2, add a brief explanation and a concrete revised item or layout suggestion.

OBSERVATION CHECKLIST TO SCORE:
[Paste your observation checklist here]

Scoring Criteria

Observability

- Score 5 (Excellent): Every item observable in the time available. State items have verification paths. Behavior items name an observation window.
- Score 4 (Good): Nearly all items observable; one or two require minor inference or a brief verification question.
- Score 3 (Adequate): Observability uneven. Several items require inference. Verification paths implicit.
- Score 2 (Needs Improvement): Multiple items not observable in the available time. Observer forced to guess or rely on self-report.
- Score 1 (Inadequate): Items systematically inferential. Tool functions as observer-opinion survey.

Inter-Rater Clarity

- Score 5 (Excellent): Each rating point anchored with a definition. Edge cases addressed. Scoring units concrete.
- Score 4 (Good): Anchors clear for nearly every item; one or two with thin anchoring.
- Score 3 (Adequate): Anchoring uneven. Unanchored adjectives ("good", "adequate") used in several items.
- Score 2 (Needs Improvement): Most items rely on subjective adjectives or yes/no with no criteria.
- Score 1 (Inadequate): No anchoring anywhere. Scoring depends entirely on observer judgment.

Field Usability

- Score 5 (Excellent): Length realistic for the slot. Layout supports rapid scoring. Scoring complexity appropriate. Tool fits the field medium.
- Score 4 (Good): Length and layout workable; one element such as scoring complexity creates minor friction.
- Score 3 (Adequate): Workable but with noticeable friction. Length borderline or layout requires back-and-forth.
- Score 2 (Needs Improvement): Significantly disrupts observation. Observer will miss items or rush.
- Score 1 (Inadequate): Not usable in field conditions as designed.

Coverage

- Score 5 (Excellent): All required observable elements present. No padding. Mapping from items to indicators implicit or stated.
- Score 4 (Good): Coverage nearly complete; one element relevant to an indicator missing or thinly covered.
- Score 3 (Adequate): Partial. Several required elements missing. Mapping unclear.
- Score 2 (Needs Improvement): Significant gaps. Multiple indicators have no corresponding items. Padding present.
- Score 1 (Inadequate): No traceable link between items and indicators or research questions.

Open Notes Space

- Score 5 (Excellent): Open-notes prompts at appropriate points. Context fields included. Unexpected-findings space at the end.
- Score 4 (Good): Notes space in most sections; one element such as unexpected-findings space partial.
- Score 3 (Adequate): Notes space uneven. Open-notes only in some sections. No dedicated unexpected-findings space.
- Score 2 (Needs Improvement): Minimal or missing notes space across the tool.
- Score 1 (Inadequate): No open-notes space anywhere.

Score Interpretation

| Total (out of 25) | Band | Next Step |
|-------------------|------|-----------|
| 22-25 | Strong | Checklist is ready for fielding. Brief observers, run a calibration round, and proceed. |
| 17-21 | Adequate | Address flagged dimensions before fielding. Most likely fix: anchor a subjective rating scale or add open-notes space. |
| 11-16 | Needs Revision | Substantial revision required. Use the Revise prompt to fix observability, anchoring, and coverage gaps before any site visit. |
| 5-10 | Substantial Revision | Checklist will not produce reliable, comparable observations as designed. Rebuild using the Generate prompt against the indicator list, then re-score. |
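
The arithmetic behind the total and band can be sketched in a few lines of Python. This is a minimal illustration, not part of the rubric: the function name `score_band` and the constants `DIMENSIONS` and `BANDS` are hypothetical, and only the dimension names and band boundaries are taken from the rubric itself.

```python
# Dimension names and band boundaries are copied from the rubric.
# Everything else (names, structure) is an illustrative assumption.
DIMENSIONS = (
    "Observability",
    "Inter-Rater Clarity",
    "Field Usability",
    "Coverage",
    "Open Notes Space",
)

BANDS = [
    (22, 25, "Strong"),
    (17, 21, "Adequate"),
    (11, 16, "Needs Revision"),
    (5, 10, "Substantial Revision"),
]


def score_band(scores: dict[str, int]) -> tuple[int, str]:
    """Return (total, band) for one scored checklist."""
    missing = [name for name in DIMENSIONS if name not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for name in DIMENSIONS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be scored 1-5, got {scores[name]}")

    # Five dimensions scored 1-5 each, so the total always lies in 5..25.
    total = sum(scores[name] for name in DIMENSIONS)
    for low, high, band in BANDS:
        if low <= total <= high:
            return total, band
    raise AssertionError("total outside 5..25; bands should cover every case")


example = {
    "Observability": 4,
    "Inter-Rater Clarity": 3,
    "Field Usability": 4,
    "Coverage": 5,
    "Open Notes Space": 2,
}
total, band = score_band(example)
print(f"Total: {total}/25, Band: {band}")
```

Because every dimension has a floor of 1, the lowest possible total is 5, which is why the bottom band starts at 5 rather than 0. A check like this can also be used to confirm that the total and band reported by the model agree with the per-dimension scores in its table.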

Scoring Dimensions

1. Observability: Whether each checklist item is actually observable in the time available at the observation site, rather than asking the observer to infer states, attitudes, or behaviors that cannot be directly seen.
2. Inter-Rater Clarity: Whether response criteria are specific enough that two observers scoring the same site at the same time would assign the same scores, with anchored definitions for each rating point.
3. Field Usability: Whether the checklist is usable in real field conditions, with a length, layout, and scoring complexity that an observer can manage while moving through a site and engaging with the setting.
4. Coverage: Whether the checklist covers the full set of observable behaviors, conditions, or practices relevant to the indicators or research questions it is meant to inform.
5. Open Notes Space: Whether the checklist provides room for qualitative observations, context notes, and unexpected findings alongside the structured scores, so signal not captured by the closed items can still be recorded.

Each dimension scored 1–5. Maximum score: 25.
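
The Inter-Rater Clarity dimension asks whether two observers at the same site would assign the same scores. One simple way to check this during a calibration round is item-level percent agreement, sketched below. The rubric does not prescribe any agreement statistic, so the function name `percent_agreement` and the sample ratings are assumptions made purely for illustration.

```python
def percent_agreement(rater_a: list[int], rater_b: list[int]) -> tuple[float, list[int]]:
    """Compare two observers' item scores for the same site visit.

    Returns the share of items scored identically and the zero-based
    positions of the items where the two observers disagreed.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("both observers must score the same number of items")
    if not rater_a:
        raise ValueError("at least one item is required")

    disagreements = [i for i, (a, b) in enumerate(zip(rater_a, rater_b)) if a != b]
    agreement = (len(rater_a) - len(disagreements)) / len(rater_a)
    return agreement, disagreements


# Hypothetical scores from two observers on a five-item checklist (1-3 scale).
observer_a = [3, 2, 3, 1, 2]
observer_b = [3, 2, 2, 1, 2]

agreement, disagreements = percent_agreement(observer_a, observer_b)
print(f"Agreement: {agreement:.0%}, items to re-anchor: {disagreements}")
```

Percent agreement does not correct for agreement that would occur by chance, so a chance-corrected statistic such as Cohen's kappa may be a better fit for formal reliability reporting. The list of disagreeing items is often the more useful output in practice, since it points to the items whose anchors need tightening.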

Prompts Using This Rubric

Review an Observation Checklist

Review a structured observation checklist for observability, inter-rater clarity, field usability, coverage, and open-notes space.
