Observation Checklist Quality

AI Prompt Templates

Copy a prompt into Claude, ChatGPT, or Gemini. Paste your document at the end and run it.

Paste a document to get a quality assessment with scores, evidence, and revision priorities.

You are an expert M&E data specialist with deep experience designing and reviewing structured observation tools for site visits, classroom observation, service-delivery observation, infrastructure assessment, and similar fieldwork contexts. Score the observation checklist I will provide using the rubric below.

SCORING RUBRIC - Observation Checklist Quality
Score each dimension 1-5 using these criteria:

DIMENSION 1: Observability
- Score 5: All elements present. Every checklist item is directly observable in the time the observer will spend at the site. No items require the observer to infer attitudes, knowledge, intentions, or off-site behaviors. Items asking about a state (e.g., "facility has running water") have an observable verification path. Items asking about a behavior name a window of observation (e.g., "during the lesson observed").
- Score 4: Most elements present. Nearly all items observable; one or two require minor inference or rely on the observer asking a brief verification question.
- Score 3: Observability is uneven. Several items require inference about non-observable states (knowledge, attitudes, intentions). Verification paths are implicit.
- Score 2: Multiple items are not observable in the available time. The observer is forced to guess or to rely on respondent self-report rather than observation.
- Score 1: Absent or inadequate. Items are systematically inferential. The tool is functioning as a survey of the observer's opinions rather than a record of observation.

DIMENSION 2: Inter-Rater Clarity
- Score 5: All elements present. Response criteria are specific enough that two observers at the same site at the same time would assign the same scores. Each rating point is anchored with a definition (e.g., for a 1-3 scale: "1 = no evidence visible", "2 = partial evidence visible", "3 = full evidence visible, with example"). Edge cases are addressed in notes. Scoring units are concrete.
- Score 4: Most elements present. Anchors are clear for nearly every item; one or two items rely on the observer's judgment with thin anchoring.
- Score 3: Anchoring is uneven. Several items use unanchored adjectives ("good", "adequate", "poor") without definitions, leaving room for inter-rater drift.
- Score 2: Most items rely on subjective adjectives or yes/no with no definition of what counts. Two observers would likely score the same site differently across many items.
- Score 1: Absent or inadequate. No anchoring anywhere. Scoring depends entirely on the observer's personal judgment.

DIMENSION 3: Field Usability
- Score 5: All elements present. Length is realistic for the observation slot (an observer can complete the checklist while moving through the site and engaging with the setting). Layout supports rapid scoring (clear sections, response options aligned with items, no need to flip between pages mid-observation). Scoring complexity is appropriate (mostly simple scales rather than nested calculations). Tool is paper-friendly or device-friendly as the field context requires.
- Score 4: Most elements present. Length and layout workable; one element such as scoring complexity or section ordering creates minor friction in the field.
- Score 3: Workable but creates noticeable friction. Length is borderline for the slot, or layout requires unnecessary back-and-forth.
- Score 2: Length, layout, or scoring complexity will significantly disrupt the observation. The observer will miss items or rush the scoring.
- Score 1: Absent or inadequate. Tool is not usable in field conditions as designed. Will produce missing data or post-hoc filling-in.

DIMENSION 4: Coverage
- Score 5: All elements present. The checklist covers the full set of observable behaviors, conditions, or practices relevant to the indicators or research questions it is meant to inform. No required observable element is missing. The checklist does not pad with items irrelevant to the indicators. Mapping from items to indicators or research questions is implicit or stated.
- Score 4: Most elements present. Coverage is nearly complete; one observable element relevant to an indicator is missing or thinly covered.
- Score 3: Coverage is partial. Several required observable elements are missing. Mapping from items to indicators is unclear.
- Score 2: Significant gaps. Multiple indicators have no corresponding observation items, or the checklist is padded with items that do not feed an indicator.
- Score 1: Absent or inadequate. No traceable link between checklist items and the indicators or research questions the tool is meant to serve.

DIMENSION 5: Open Notes Space
- Score 5: All elements present. The checklist provides room for qualitative observations alongside structured scores. Open-notes prompts are placed at appropriate points (per section or per item, depending on tool design). Context fields are included (date, time, observer, site, conditions). A space for unexpected findings or anomalies is included at the end.
- Score 4: Most elements present. Notes space provided in most sections; one element such as unexpected-findings space or context fields is partial.
- Score 3: Notes space is uneven. Open-notes prompts appear in some sections but not where they are most needed. No dedicated space for unexpected findings.
- Score 2: Notes space is minimal or missing for the bulk of the tool. Observer cannot record context, qualitative signal, or anomalies.
- Score 1: Absent or inadequate. No open-notes space anywhere. The tool captures only structured scores, with no way to record context or unexpected observations.

OUTPUT FORMAT:
Return your assessment as a table followed by a summary:

| Dimension | Score (1-5) | Evidence | Priority Revision |
|-----------|-------------|----------|-------------------|
| Observability | | | |
| Inter-Rater Clarity | | | |
| Field Usability | | | |
| Coverage | | | |
| Open Notes Space | | | |

**Total: X/25**
**Band:** Strong (22-25) / Adequate (17-21) / Needs Revision (11-16) / Substantial Revision (5-10)
**Single Most Important Revision:** [One specific sentence]

For any dimension scored 1 or 2, add a brief explanation and a concrete revised item or layout suggestion.

OBSERVATION CHECKLIST TO SCORE:
[Paste your observation checklist here]

Scoring Criteria

Observability
5 Excellent

Every item observable in the time available. State items have verification paths. Behavior items name an observation window.

4 Good

Nearly all items observable; one or two require minor inference or a brief verification question.

3 Adequate

Observability uneven. Several items require inference. Verification paths implicit.

2 Needs Improvement

Multiple items not observable in the available time. Observer forced to guess or rely on self-report.

1 Inadequate

Items systematically inferential. Tool functions as observer-opinion survey.

Inter-Rater Clarity
5 Excellent

Each rating point anchored with a definition. Edge cases addressed. Scoring units concrete.

4 Good

Anchors clear for nearly every item; one or two with thin anchoring.

3 Adequate

Anchoring uneven. Unanchored adjectives ("good", "adequate") used in several items.

2 Needs Improvement

Most items rely on subjective adjectives or yes/no with no criteria.

1 Inadequate

No anchoring anywhere. Scoring depends entirely on observer judgment.

Field Usability
5 Excellent

Length realistic for the slot. Layout supports rapid scoring. Scoring complexity appropriate. Tool fits the field medium.

4 Good

Length and layout workable; one element such as scoring complexity creates minor friction.

3 Adequate

Workable but with noticeable friction. Length borderline or layout requires back-and-forth.

2 Needs Improvement

Significantly disrupts observation. Observer will miss items or rush.

1 Inadequate

Not usable in field conditions as designed.

Coverage
5 Excellent

All required observable elements present. No padding. Mapping from items to indicators implicit or stated.

4 Good

Coverage nearly complete; one element relevant to an indicator missing or thinly covered.

3 Adequate

Partial. Several required elements missing. Mapping unclear.

2 Needs Improvement

Significant gaps. Multiple indicators have no corresponding items. Padding present.

1 Inadequate

No traceable link between items and indicators or research questions.

Open Notes Space
5 Excellent

Open-notes prompts at appropriate points. Context fields included. Unexpected-findings space at the end.

4 Good

Notes space in most sections; one element such as unexpected-findings space partial.

3 Adequate

Notes space uneven. Open-notes only in some sections. No dedicated unexpected-findings space.

2 Needs Improvement

Minimal or missing notes space across the tool.

1 Inadequate

No open-notes space anywhere.

Score Interpretation

| Total (out of 25) | Band | Next Step |
|-------------------|------|-----------|
| 22-25 | Strong | Checklist is ready for fielding. Brief observers, run a calibration round, and proceed. |
| 17-21 | Adequate | Address flagged dimensions before fielding. Most likely fix: anchor a subjective rating scale or add open-notes space. |
| 11-16 | Needs Revision | Substantial revision required. Use the Revise prompt to fix observability, anchoring, and coverage gaps before any site visit. |
| 5-10 | Substantial Revision | Checklist will not produce reliable, comparable observations as designed. Rebuild using the Generate prompt against the indicator list, then re-score. |
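
If you collect the five dimension scores from the model's output and want to check the total and band yourself, the arithmetic can be sketched as follows. This is a minimal illustration, not part of the template: the function name `interpret_scores` and the dictionary layout are assumptions made for the example, while the band cut-offs and the "scored 1 or 2" flagging rule are taken from the rubric above.

```python
from typing import Dict, List, Tuple

# The five rubric dimensions, in the order used in the output table.
DIMENSIONS = [
    "Observability",
    "Inter-Rater Clarity",
    "Field Usability",
    "Coverage",
    "Open Notes Space",
]

# (lowest total, highest total, band name), copied from the Score Interpretation table.
BANDS = [
    (22, 25, "Strong"),
    (17, 21, "Adequate"),
    (11, 16, "Needs Revision"),
    (5, 10, "Substantial Revision"),
]


def interpret_scores(scores: Dict[str, int]) -> Tuple[int, str, List[str]]:
    """Return (total, band, flagged dimensions) for one scored checklist.

    `scores` maps each dimension name to an integer from 1 to 5.
    Flagged dimensions are those scored 1 or 2, which the prompt asks
    the model to explain and to pair with a concrete revision.
    """
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"Missing scores for: {', '.join(missing)}")

    for dimension in DIMENSIONS:
        value = scores[dimension]
        if not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"{dimension} must be an integer from 1 to 5, got {value!r}")

    total = sum(scores[d] for d in DIMENSIONS)
    # Totals always fall between 5 and 25, so exactly one band matches.
    band = next(name for low, high, name in BANDS if low <= total <= high)
    flagged = [d for d in DIMENSIONS if scores[d] <= 2]
    return total, band, flagged


if __name__ == "__main__":
    example = {
        "Observability": 4,
        "Inter-Rater Clarity": 3,
        "Field Usability": 4,
        "Coverage": 5,
        "Open Notes Space": 2,
    }
    total, band, flagged = interpret_scores(example)
    print(f"Total: {total}/25, Band: {band}, Flagged: {flagged}")
```

For the example scores shown, the total is 18, which falls in the Adequate band, and Open Notes Space is flagged for revision. A check like this is mainly useful for catching cases where the model's stated total or band does not match its own per-dimension scores.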