Hatching Solutions  ·  OD Agent System

Agent Reference Guides

Operator reference documentation for all seven agents

System: OD Agent Team CF  ·  Version: 2.0  ·  Agents: 7
Agent 01
Survey Instrument Designer
Psychometrically grounded survey instruments for organizational assessment

Purpose

The Survey Instrument Designer generates structured, psychometrically defensible survey instruments for organizational diagnostic engagements. Given a set of constructs the practitioner wants to measure — such as leadership effectiveness, psychological safety, or organizational culture — this agent produces a complete item set, organizes it into a deployable instrument, and outputs a format ready for upload to Qualtrics or equivalent platforms.

This agent eliminates the drafting burden that consumes significant time in the early stages of an OD engagement. It applies construct-level item generation logic, reverse-scoring logic, and response format conventions drawn from validated instrument development practice.

How It Works

Stage 1 — Structure Validation (Deterministic)

The agent validates that all required inputs are present and internally consistent: construct names are unique, demographic filters are recognized, and the delivery platform is supported. This stage runs without calling an LLM.
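A minimal TypeScript sketch of what this deterministic gate might look like. The input shape, the supported-platform list, and the recognized-filter list below are illustrative assumptions, not the system's actual types:

// Illustrative only — field names mirror the inputs table below, but the
// actual validation module and its types are internal to the agent.
interface SurveyDesignInput {
  diagnosticDimensions: string[];   // construct names to measure
  demographicFilters: string[];     // e.g. ["pay_grade", "directorate"]
  platform: string;                 // e.g. "Qualtrics"
}

const SUPPORTED_PLATFORMS = ["Qualtrics", "SurveyMonkey", "Microsoft Forms"]; // assumed list
const KNOWN_FILTERS = ["pay_grade", "directorate", "tenure", "location"];     // assumed list

function validateStructure(input: SurveyDesignInput): string[] {
  const errors: string[] = [];

  // Construct names must be unique.
  const seen = new Set<string>();
  for (const dim of input.diagnosticDimensions) {
    if (seen.has(dim)) errors.push(`Duplicate construct: ${dim}`);
    seen.add(dim);
  }

  // Demographic filters must be recognized.
  for (const filter of input.demographicFilters) {
    if (!KNOWN_FILTERS.includes(filter)) errors.push(`Unrecognized demographic filter: ${filter}`);
  }

  // Delivery platform must be supported.
  if (!SUPPORTED_PLATFORMS.includes(input.platform)) {
    errors.push(`Unsupported platform: ${input.platform}`);
  }

  return errors; // an empty array lets the run proceed to item generation without any LLM call
}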

Stage 2 — Item Generation (LLM)

Using Claude Haiku, the agent generates 4–6 Likert-scale items per construct, with one reverse-scored item per construct where applicable. The prompt includes the engagement context (industry, organization size, anonymity mode), reference instruments, and any practitioner-defined sensitivity flags. Output includes a full item set with construct assignments, a recommended response scale, demographic filter questions, and deployment-ready formatting guidance.
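For orientation, a hypothetical sketch of how one generated item might be represented downstream — the field names here are assumptions, not the agent's published output schema:

// Illustrative shape of a single generated item.
interface GeneratedItem {
  construct: string;        // e.g. "Psychological Safety"
  text: string;             // the Likert statement shown to respondents
  reverseScored: boolean;   // true for the reverse-keyed item in a construct
  sensitivityFlag?: string; // carried through from practitioner-defined flags
}

const example: GeneratedItem = {
  construct: "Psychological Safety",
  text: "I can raise concerns on my team without fear of negative consequences.",
  reverseScored: false,
};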

Inputs Required

Field | Type | Description
clientName | Text | Name of the client organization — used for instrument header and branding context
engagementName | Text | Engagement or project name — used for internal tracking and document headers
industryContext | Select | Sector and workforce description, e.g. "Federal Defense — GS-12 to SES"
organizationSize | Number | Approximate headcount — informs item language and sampling guidance
diagnosticDimensions | Multiselect | Constructs to measure, e.g. "Leadership Effectiveness," "Team Cohesion"
referenceInstruments | Checkboxes | Known validated instruments to align with (Denison, Gallup Q12, OCAI)
anonymityMode | Select | "Anonymous" or "Attributed" — affects item sensitivity and disclosure language
demographicFilters | Text | Demographic breakdowns desired, e.g. "pay_grade, directorate"
platform | Text | Survey delivery platform, e.g. "Qualtrics" — affects formatting conventions

Outputs Produced

Output | Format | Description
Survey Instrument | Structured doc | Complete item set organized by construct with response scale and instructions
Item Mapping | Table | Maps each item to its construct, reverse-scoring flag, and sensitivity classification
Deployment Notes | Text | Platform-specific guidance for upload, branching logic, and completion estimates
Demographic Block | Item list | Recommended demographic questions aligned to the specified filters

Theoretical Foundations

Theory / Framework | Source | Application
Classical Test Theory | Spearman (1904); Lord & Novick (1968) | Underpins item reliability logic, internal consistency targets, and reverse-scoring conventions
Construct Validity | Cronbach & Meehl (1955) | Guides item-construct alignment — items must represent the latent construct they purport to measure
Likert Scaling | Likert (1932) | Establishes the 5-point agree/disagree response format and balanced anchor phrasing
Denison Organizational Culture Survey | Denison & Mishra (1995) | Reference framework for culture-domain items covering involvement, consistency, adaptability, and mission
Gallup Q12 Employee Engagement | Buckingham & Coffman (1999) | Reference framework for engagement items — informs construct coverage for team-level dynamics
Survey Design Best Practices | Dillman, Smyth & Christian (2014) | Governs item phrasing rules: single-barreled items, behavioral anchoring, and neutral midpoint use

Practical Notes

Operator Note: Generated instruments should be reviewed by a qualified practitioner before deployment. The agent applies general best practices but cannot account for organization-specific language sensitivities, union considerations, or prior survey fatigue.
Agent 02
Interview Protocol Builder
Structured qualitative data collection protocols for organizational inquiry

Purpose

The Interview Protocol Builder develops structured and semi-structured interview protocols for qualitative data collection in OD engagements. Given the assessment dimensions, role hierarchy, and session logistics, this agent generates complete, role-differentiated interview guides — including warm-up questions, main probes, follow-up prompts, and closing sequences — calibrated to the seniority and organizational knowledge of each participant group.

This agent addresses a consistent bottleneck in mixed-methods OD work: protocol development is time-intensive, and off-the-shelf templates rarely reflect the specific constructs under investigation or the power dynamics of the audience.

How It Works

Stage 1 — Protocol Architecture (Deterministic)

The agent builds a protocol structure for each role group specified in the inputs, calculating time allocations, sequencing question types, and mapping each probe to the assessment dimensions it covers.
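A simplified TypeScript sketch of the time-allocation step, assuming fixed per-section proportions (the section names and weights are assumptions, not the agent's actual values):

// Rough sketch of the deterministic time-allocation logic.
type Section = "warmUp" | "coreProbes" | "followUps" | "close";

const SECTION_WEIGHTS: Record<Section, number> = {
  warmUp: 0.1,
  coreProbes: 0.6,
  followUps: 0.2,
  close: 0.1,
};

function allocateMinutes(maxProtocolMinutes: number): Record<Section, number> {
  const allocation = {} as Record<Section, number>;
  for (const section of Object.keys(SECTION_WEIGHTS) as Section[]) {
    allocation[section] = Math.round(maxProtocolMinutes * SECTION_WEIGHTS[section]);
  }
  return allocation; // e.g. 60 minutes -> { warmUp: 6, coreProbes: 36, followUps: 12, close: 6 }
}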

Stage 2 — Question Generation (LLM)

Using Claude Haiku, the agent generates role-appropriate questions for each session type (executive interview, focus group, listening session, stakeholder interview). Senior leaders receive strategic, forward-looking prompts. Front-line participants receive experience-focused, operationally grounded questions. The LLM is constrained to produce questions that are behaviorally anchored, open-ended, and non-leading.

Inputs Required

Field | Type | Description
clientName / engagementName | Text | Engagement context for document headers and LLM framing
organizationContext | Text | Narrative description of the organization — workforce composition, culture notes, sensitivities
assessmentDimensions | Multiselect | Constructs to probe qualitatively — must align with survey dimensions where applicable
roleHierarchy | Text | Roles from highest to lowest — used to calibrate language register and question depth
interviewTypes | Checkboxes | Session formats: executive interview, focus group, stakeholder interview, listening session
sessionCount | Number | Total number of sessions planned — informs sampling guidance in the protocol
targetRoles | Dynamic table | Role name, seniority level, and expected organizational knowledge — one row per role group
constraints | Options | Max protocol length in minutes; recording and transcription guidance flags

Outputs Produced

Output | Format | Description
Interview Guides | Per-role docs | Complete facilitator script for each role group — warm-up, core probes, follow-ups, close
Question Bank | Table | All generated questions mapped to assessment dimension and session type
Sampling Guidance | Text | Recommended participant counts and selection criteria per role group
Consent & Documentation | Text | Consent language, recording protocol, and note-taking guidance
Time Allocation Map | Table | Breakdown of minutes per protocol section for each session type

Theoretical Foundations

Theory / Framework | Source | Application
Semi-Structured Interviewing | Kvale & Brinkmann (2009) | Guides question sequencing: open openers, focused probes, hypothetical follow-ups, and member-checking prompts
Appreciative Inquiry | Cooperrider & Srivastva (1987) | Informs the affirmative question frame — protocols include strength-based probes alongside deficit-oriented ones
Organizational Sense-Making | Weick (1995) | Shapes questions that surface how participants interpret ambiguous or changing situations
Phenomenological Interviewing | Moustakas (1994) | Grounds the lived-experience question structure — asking participants to describe specific events rather than general opinions
Power-Aware Facilitation | Freire (1970); Schein (2009) | Drives role differentiation — questions for executives differ structurally from front-line staff to account for positional dynamics
Grounded Theory Sampling | Glaser & Strauss (1967) | Informs the sampling guidance — theoretical saturation logic drives recommended participant counts per group

Practical Notes

Operator Note: Protocols should be piloted with one participant before full deployment. Probes generated by the LLM are starting points — experienced facilitators should adapt language in the room.
Agent 03
Content Curation Engine
Objective-aligned learning content selection for development programs

Purpose

The Content Curation Engine selects and packages learning content from the OD system's internal content library to support leadership development programs. Given a set of learning objectives, participant characteristics, and delivery constraints, the agent scores each available content item against the program requirements and returns a curated, relevance-ranked set of materials organized by module.

This agent addresses the content sourcing problem in program design: practitioners typically spend hours searching for materials, evaluating fit, and organizing them into a coherent sequence.

How It Works

Stage 1 — Library Scoring (Deterministic)

The agent iterates through all items in the content library index and computes a relevance score for each against the specified learning objectives. Scoring considers: objective keyword alignment, content type appropriateness for the delivery modality, participant seniority calibration, and industry context relevance. Items scoring below the minimum relevance threshold are excluded.
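The scoring pass can be pictured as a weighted blend of the four factors named above. The item shape, factor weights, and penalty values in this TypeScript sketch are illustrative assumptions, not the agent's actual scoring model:

interface LibraryItem {
  id: string;
  keywords: string[];
  contentType: string;        // "article", "case study", "video", ...
  targetLevel: string;        // "Mid Manager", "Executive", ...
  industryTags: string[];
}

function relevanceScore(
  item: LibraryItem,
  objectiveKeywords: string[],
  deliveryMode: string,
  participantLevel: string,
  industryContext: string,
): number {
  // Objective keyword alignment: share of objective keywords matched by the item.
  const keywordHits = item.keywords.filter(k => objectiveKeywords.includes(k)).length;
  const keywordAlignment = Math.min(keywordHits / Math.max(objectiveKeywords.length, 1), 1);

  // Content type appropriateness, seniority calibration, and industry relevance (assumed penalties).
  const typeFit = deliveryMode === "Virtual" && item.contentType === "exercise" ? 0.7 : 1.0;
  const levelFit = item.targetLevel === participantLevel ? 1.0 : 0.6;
  const industryFit = item.industryTags.includes(industryContext) ? 1.0 : 0.7;

  // Weighted blend in [0, 1]; items below minimumRelevanceScore are excluded.
  return 0.5 * keywordAlignment + 0.2 * typeFit + 0.15 * levelFit + 0.15 * industryFit;
}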

Stage 2 — Selection and Curation (LLM)

Using Claude Haiku, the agent selects the highest-scoring items up to the requested module count, writes a brief curatorial rationale for each selected item, and drafts facilitator notes explaining how to introduce each piece of content in the context of the program.

Current Limitation: External search is architecturally supported but currently operates in degraded mode — the system curates from the internal library only (15 items). External retrieval will be activated in a future release. The degraded: true flag in metadata is expected and not a failure condition.

Inputs Required

Field | Type | Description
programName | Text | Name of the development program — used for document headers and LLM context
learningObjectives | Textarea | What participants should know, do, or value upon completion — one objective per line
industryContext | Select | Sector and audience description — used to weight content relevance for the specific context
participantLevel | Select | Emerging Leader, Mid Manager, Senior Leader, or Executive — calibrates content sophistication
contentTypes | Checkboxes | Acceptable content formats: article, case study, video, framework, exercise, assessment
deliveryMode | Select | In Person, Virtual, or Hybrid — affects content type weighting
moduleCount | Number | Number of program modules — the agent selects content to populate each module
sessionDuration | Number | Total session length in minutes — used to estimate content volume appropriateness
minimumRelevanceScore | Number | 0.0–1.0 threshold below which items are excluded. Default: 0.6

Outputs Produced

Output | Format | Description
Curated Content Package | Structured list | Selected items with relevance scores, rationale, and module assignments
Facilitator Notes | Per-item text | How to introduce and debrief each content item in the context of the program objectives
Coverage Report | Table | Which learning objectives are covered by which content items — identifies gaps
Curation Metadata | Summary | Total items evaluated, items selected, average relevance score, coverage completeness

Theoretical Foundations

Theory / Framework | Source | Application
Bloom's Taxonomy (Revised) | Anderson & Krathwohl (2001) | Content is matched to the cognitive level required: remember, understand, apply, analyze, evaluate, create
Adult Learning Theory (Andragogy) | Knowles (1980) | Content selection favors materials that are experience-based, problem-centered, and immediately applicable
70-20-10 Development Model | McCall, Lombardo & Morrison (1988) | Experiential exercises weighted above passive reading; reflection prompts included to activate social learning
Situated Learning Theory | Lave & Wenger (1991) | Case studies and context-specific examples are weighted higher than abstract frameworks when industry context is specified
Cognitive Load Theory | Sweller (1988) | Content volume recommendations are calibrated to session duration to avoid overloading participants
Transfer of Training | Baldwin & Ford (1988) | Content selection includes application exercises that bridge learning context to job context

Practical Notes

Agent 04
Quantitative Analysis
Psychometric scoring, reliability analysis, and subgroup comparison for survey data

Purpose

The Quantitative Analysis agent ingests raw survey export data from Supabase Storage, computes construct-level scores and reliability statistics, identifies statistically significant subgroup differences, and produces a narrative interpretation of the findings. It transforms a raw CSV file into a structured, analyst-ready quantitative findings package.

This agent removes the analytical bottleneck that occurs between data collection and synthesis. Practitioners no longer need to run manual SPSS or Excel calculations for standard psychometric outputs.

How It Works

Stage 1 — Data Loading and Validation (Deterministic)

The agent retrieves the survey export from the specified Supabase Storage path, parses the CSV, validates row counts and header integrity, and checks that all declared construct item IDs exist in the data. Errors at this stage halt execution with a specific, actionable error message.
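A hedged sketch of the retrieval and validation step using the Supabase JS client. The bucket name matches the inputs table below; the environment variable names, error messages, and the specific CSV checks shown are assumptions:

import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_KEY!);

async function loadSurveyExport(storagePath: string): Promise<string[][]> {
  // Download the export from the survey-exports bucket.
  const { data, error } = await supabase.storage.from("survey-exports").download(storagePath);
  if (error || !data) {
    throw new Error(`Could not retrieve export at ${storagePath}: ${error?.message}`);
  }

  // Naive CSV parse for illustration — a real parser would handle quoted commas.
  const text = await data.text();
  const rows = text.trim().split("\n").map(line => line.split(","));
  if (rows.length < 2) throw new Error("Export contains a header but no respondent rows.");

  // Header integrity: every row must have the same column count as the header.
  const headerCount = rows[0].length;
  if (rows.some(r => r.length !== headerCount)) {
    throw new Error("Row length does not match header — the CSV may be malformed.");
  }
  return rows;
}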

Stage 2 — Scoring and Statistics (Deterministic)

For each construct, the agent: (1) applies reverse scoring to flagged items, (2) computes respondent-level mean or sum scores per the specified scoring formula, (3) computes construct-level mean, median, standard deviation, min, and max, (4) estimates Cronbach's alpha as a reliability indicator, and (5) computes benchmark deltas where a benchmark set is specified.
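Of these steps, the Cronbach's alpha estimate is the least familiar to most operators. A TypeScript sketch of the standard computation, applied to one construct's response matrix (reverse scoring already applied):

// responses: one inner array per respondent, one number per item in the construct.
function cronbachAlpha(responses: number[][]): number {
  const k = responses[0].length; // number of items in the construct

  const variance = (xs: number[]): number => {
    const mean = xs.reduce((a, b) => a + b, 0) / xs.length;
    return xs.reduce((a, b) => a + (b - mean) ** 2, 0) / (xs.length - 1);
  };

  // Sum of each item's variance across respondents.
  let itemVarianceSum = 0;
  for (let i = 0; i < k; i++) {
    itemVarianceSum += variance(responses.map(r => r[i]));
  }

  // Variance of respondents' total scores across all items.
  const totalVariance = variance(responses.map(r => r.reduce((a, b) => a + b, 0)));

  return (k / (k - 1)) * (1 - itemVarianceSum / totalVariance);
}

Constructs returning a value below 0.70 are flagged as potentially unreliable, per the conventions cited in the theory table below.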

Stage 3 — Narrative Interpretation (LLM)

Using Claude Haiku, the agent produces a 2–3 sentence narrative for each construct summarizing the statistical pattern, notable subgroup differences, and any reliability concerns. These narratives feed directly into the Synthesis Report agent.

Inputs Required

Field | Type | Description
surveyExportStoragePath | Text | Path to the CSV file within the survey-exports Supabase Storage bucket
exportFormat | Select | Currently: csv only. XLSX support is planned.
constructDefinitions | Dynamic table | One row per construct: name, ID, item IDs (CSV column headers), reverse-scored items
demographicFilters | Text | Demographic fields to use for subgroup comparison, e.g. "pay_grade, directorate"
significanceThreshold | Number | p-value threshold for reporting subgroup differences. Default: 0.05
organizationalContext | Select | Federal, Corporate, Nonprofit, Healthcare — calibrates narrative interpretation language
engagementId | UUID | Links this analysis run to the engagement record in the database

Outputs Produced

Output | Format | Description
Construct Scores | Table | Mean, median, SD, min, max, Cronbach's alpha, and benchmark delta per construct
Subgroup Differences | Table | Statistically significant demographic differences with p-value, effect size, and practical significance flag
Response Quality Flags | List | Respondents flagged for straight-lining, all-extreme responding, or low completion
Narrative Summaries | Per-construct text | LLM-generated interpretation for each construct — feeds into Synthesis Report
Analysis Metadata | Summary | Total respondents, valid respondents, data quality flags, and analysis parameters used

Theoretical Foundations

Theory / Framework | Source | Application
Classical Test Theory | Lord & Novick (1968) | Foundation for item scoring, construct mean computation, and reliability estimation via Cronbach's alpha
Cronbach's Coefficient Alpha | Cronbach (1951) | Internal consistency reliability estimate — constructs below 0.70 are flagged as potentially unreliable
Cohen's Effect Size Conventions | Cohen (1988) | Interprets magnitude of subgroup differences: small (d = 0.2), medium (d = 0.5), large (d = 0.8)
Nonparametric Significance Testing | Mann & Whitney (1947); Kruskal & Wallis (1952) | Applied for small-n subgroups where normality assumptions fail — the agent selects tests automatically
Straight-Lining Detection | Meade & Craig (2012) | Response quality flagging algorithm identifies respondents who selected the same response for every item

Practical Notes

Prerequisite: This agent requires data to be pre-uploaded to the survey-exports Supabase Storage bucket before the run is initiated. The system does not accept file uploads directly through the UI.
Agent 05
Facilitation Guide Generator
Practitioner-ready facilitation guides for leadership development sessions

Purpose

The Facilitation Guide Generator produces complete, practitioner-ready facilitation guides for leadership development and organizational learning sessions. Given the program objectives, content modules, session logistics, and facilitator experience level, the agent generates a structured guide covering: session timeline, detailed activity instructions, facilitator scripts, debrief questions, contingency plans, and modality-specific notes.

This agent resolves the last-mile problem in program delivery: even when content is designed and approved, practitioners without deep facilitation experience often lack the scaffolding to run high-stakes sessions confidently.

How It Works

Stage 1 — Timeline Construction (Deterministic)

The agent computes a minute-by-minute session timeline from the module time allocations, inserting standard buffer time, breaks, and orientation blocks. The timeline is validated for mathematical completeness before LLM generation begins.
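A sketch of how that construction might look in TypeScript; the buffer, break, and orientation durations shown are assumptions rather than the agent's actual values:

interface Module { title: string; minutes: number; }
interface TimelineEntry { start: number; end: number; label: string; }

function buildTimeline(modules: Module[], sessionDuration: number): TimelineEntry[] {
  const entries: TimelineEntry[] = [];
  let cursor = 0;

  const push = (label: string, minutes: number) => {
    entries.push({ start: cursor, end: cursor + minutes, label });
    cursor += minutes;
  };

  push("Orientation and session goals", 10);
  modules.forEach((m, i) => {
    push(m.title, m.minutes);
    push("Buffer / transition", 5);
    if (i === Math.floor(modules.length / 2) - 1) push("Break", 10); // mid-session break
  });
  push("Close and commitments", 10);

  // Mathematical completeness check before any LLM generation begins.
  if (cursor > sessionDuration) {
    throw new Error(`Timeline (${cursor} min) exceeds sessionDuration (${sessionDuration} min).`);
  }
  return entries;
}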

Stage 2 — Guide Generation (LLM — Streaming)

Using Claude Sonnet with a 32,000-token output budget (streaming required), the agent generates: an overview with session goals and success criteria, a preparation checklist, a materials list, full activity detail blocks (purpose, setup, process steps, debrief questions, contingency options), facilitator tips, and appendices.

Stage 3 — Validation and Correction

A deterministic validator checks the generated guide for structural completeness: all timeline entries must have corresponding activity detail blocks, all learning objectives must be covered, and all debrief question sets must meet the minimum count. If validation fails, a correction prompt is issued to the LLM before the guide is finalized.
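Conceptually, the validator behaves like the following TypeScript sketch (the draft-guide shape and the minimum debrief count are illustrative assumptions):

interface DraftGuide {
  timeline: { label: string; isActivity: boolean }[];
  activityBlocks: { title: string; debriefQuestions: string[] }[];
  objectivesCovered: string[];
}

function validateGuide(guide: DraftGuide, programObjectives: string[], minDebrief = 3): string[] {
  const issues: string[] = [];
  const blockTitles = new Set(guide.activityBlocks.map(b => b.title));

  // Every activity on the timeline needs a corresponding detail block.
  for (const entry of guide.timeline) {
    if (entry.isActivity && !blockTitles.has(entry.label)) {
      issues.push(`No activity detail block for "${entry.label}"`);
    }
  }

  // Every program objective must be covered somewhere in the guide.
  for (const objective of programObjectives) {
    if (!guide.objectivesCovered.includes(objective)) issues.push(`Objective not covered: ${objective}`);
  }

  // Every debrief question set must meet the minimum count.
  for (const block of guide.activityBlocks) {
    if (block.debriefQuestions.length < minDebrief) issues.push(`Too few debrief questions in "${block.title}"`);
  }

  return issues; // a non-empty list triggers a correction prompt back to the LLM
}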

Stage 4 — File Export

The finalized guide is exported as a KH-branded DOCX and a Markdown file, both uploaded to the guides/ Supabase Storage bucket.

Inputs Required

Field | Type | Description
programName | Text | Name of the development program — appears in guide header and all exported documents
programObjectives | Textarea | Overall program learning objectives — one per line. Drive debrief question generation.
sessionDuration | Number | Total session length in minutes (30–480). Determines timeline architecture.
participantCount | Number | Number of participants — affects activity instructions, room setup notes, and group sizing guidance
deliveryMode | Select | In Person, Virtual, or Hybrid — generates modality-specific facilitation notes for each activity
facilitatorExperience | Select | Novice, Intermediate, or Expert — calibrates script depth and contingency guidance
organizationalContext | Select | Federal, Corporate, Nonprofit, Healthcare — calibrates language register and example selection
contentModules | Dynamic list | One card per module: title, learning objectives, time allocation, activities, summary, takeaways, discussion prompts, application exercise

Outputs Produced

Output | Format | Description
Facilitator Guide (DOCX) | KH-branded file | Complete practitioner guide with all sections — uploaded to guides/ bucket
Facilitator Guide (Markdown) | Text file | Plain-text version for digital sharing or LMS upload — uploaded to guides/ bucket
Session Timeline | Structured data | Minute-by-minute schedule with activity titles and objective mappings
Activity→Objective Map | Structured data | Feeds directly into the Evaluation Package Builder as chained input
Validation Report | Metadata | Records whether structural validation passed and any issues identified and corrected

Theoretical Foundations

Theory / Framework | Source | Application
Kolb's Experiential Learning Cycle | Kolb (1984) | Each activity block follows the Concrete Experience → Reflective Observation → Abstract Conceptualization → Active Experimentation sequence
Transformative Learning Theory | Mezirow (1991) | Debrief questions are designed to surface and challenge assumptions — the "disorienting dilemma" is deliberately built into higher-stakes activities
Psychological Safety | Edmondson (1999) | Facilitator scripts include explicit psychological safety framing at session open and after high-disclosure activities
Action Learning | Revans (1982) | Application exercise at end of each module operationalizes Revans' principle: learning requires real problems and reflective questioning
Scaffolded Instruction | Wood, Bruner & Ross (1976) | The facilitatorExperience parameter controls scaffolding depth — novice guides include more prescriptive scripts
Kirkpatrick Level 3 (Behavior) | Kirkpatrick (1959) | Discussion prompts and application exercises are forward-facing — they ask participants to commit to specific behavioral changes

Practical Notes

Performance Note: This agent produces large outputs (70KB+ for a two-module program). Generation takes 60–120 seconds. Do not navigate away from the run detail panel while the agent is running.
Agent 06
Synthesis Report
Mixed-methods triangulation and integrated OD diagnostic reporting

Purpose

The Synthesis Report agent integrates quantitative survey findings and qualitative interview findings into a unified organizational diagnostic report. It identifies convergent patterns (where both data sources agree), complementary patterns (where each source adds distinct information), and divergent patterns (where the sources are in tension), then generates validated recommendations with urgency, impact, and feasibility ratings.

This agent produces the primary client deliverable: the integrated OD assessment report. It replaces the synthesis step that practitioners typically spend the most time on — manually comparing two data sets, resolving discrepancies, and drafting a coherent narrative that holds both sources of evidence simultaneously.

How It Works

Stage 1 — Prerequisite Validation (Deterministic)

Both the quantitative findings (from Agent 04) and qualitative findings must have approvalStatus: "approved" before synthesis begins. This gate prevents synthesis on unapproved or potentially flawed upstream data.

Stage 2 — Triangulation Mapping (Deterministic)

The agent builds a triangulation map: for each assessed dimension, it classifies the relationship between quantitative and qualitative evidence as convergent, complementary, or divergent. Divergent findings trigger an interpretive note generation step.
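The classification itself reduces to a decision rule per dimension. The thresholds and sentiment encoding in this TypeScript sketch are assumptions — the agent's actual heuristics are internal:

type Pattern = "convergent" | "complementary" | "divergent";

interface DimensionEvidence {
  dimension: string;
  quantScore: number;                               // construct mean on a 1-5 scale, from Agent 04
  qualSentiment: "positive" | "mixed" | "negative"; // summarized from the qualitative findings
}

function classify(e: DimensionEvidence): Pattern {
  const quantPositive = e.quantScore >= 3.5;
  const quantNegative = e.quantScore <= 2.5;

  if ((quantPositive && e.qualSentiment === "positive") ||
      (quantNegative && e.qualSentiment === "negative")) {
    return "convergent";      // both sources tell the same story
  }
  if ((quantPositive && e.qualSentiment === "negative") ||
      (quantNegative && e.qualSentiment === "positive")) {
    return "divergent";       // triggers an interpretive note
  }
  return "complementary";     // each source adds distinct information
}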

Stage 3 — Report Generation (LLM)

Using Claude Sonnet, the agent generates three LLM outputs in sequence: (1) the integrated report narrative, (2) recommendations with ratings, and (3) the executive summary package. Separating these calls ensures each section receives adequate token budget.

Stage 4 — Validation

A claims validator checks that all recommendations are grounded in stated findings and that all divergent findings have been addressed.

Inputs Required

Field | Type | Description
clientName / industry | Text | Client identity and sector — used in report header and LLM context framing
assessmentScope | Select | Scope description for the report methodology section
clientPriorities | Textarea | Strategic priorities the client has communicated — recommendations are ranked partly on alignment with these
reportFormat | Select | "Comprehensive" (full sections) or "Executive" (condensed) — controls report depth
quantitativeFindings | Structured data | Approved output from Agent 04 — contains construct scores and narratives by dimension
qualitativeFindings | Structured data | Approved qualitative data — contains dimension-level narrative summaries from interview analysis
engagementId | UUID | Links the synthesis to the engagement record

Outputs Produced

Output | Format | Description
Triangulation Map | Structured data | Convergent, complementary, and divergent finding classifications per dimension
Findings by Dimension | Report sections | Integrated narrative for each assessed dimension — with caveats where data quality requires them
Cross-Cutting Themes | Report sections | Patterns that appeared consistently across multiple dimensions
Recommendations | Rated table | Actionable recommendations with urgency, impact, and feasibility ratings — plus rationale and implementation considerations
Executive Summary | Doc section | Priority actions and leadership implications — designed for C-suite or SES-level audience
Report Preview (UI) | Web view | Operator preview accessible from the Reports tab
Report DOCX | Download | Fully formatted Word document — download from the Reports tab

Theoretical Foundations

Theory / Framework | Source | Application
Mixed Methods Research Design | Creswell & Plano Clark (2011) | Convergent parallel design: quantitative and qualitative strands collected independently, merged at interpretation
Triangulation | Denzin (1978) | Data triangulation, methodological triangulation, and investigator triangulation applied — agent explicitly codes convergence, complementarity, and divergence
Organizational Diagnosis | Nadler & Tushman (1980) | The Congruence Model informs recommendation framing: findings are interpreted as misalignments between inputs, strategy, work, people, and structure
Force Field Analysis | Lewin (1951) | Recommendations are structured as driving forces to amplify and restraining forces to reduce
Evidence-Based OD | Rousseau (2006) | The claims validator enforces that every recommendation is explicitly grounded in stated findings

Practical Notes

Operator Note: This agent produces the primary client deliverable. Review the report preview thoroughly before downloading the DOCX — the DOCX is the document that leaves the building.
Agent 07
Evaluation Package Builder
Kirkpatrick-aligned evaluation instruments for leadership development programs

Purpose

The Evaluation Package Builder generates a complete set of evaluation instruments for leadership development programs. Anchored to the program's learning objectives and facilitated activities, the agent produces pre-session baseline instruments, post-session learning gain instruments, session reaction surveys, and facilitator observation checklists — all aligned to the specific objectives of the program rather than generic course evaluation templates.

This agent solves a persistent gap in program evaluation practice: most organizations deploy generic "smile sheets" that measure satisfaction rather than learning. The agent generates instruments that directly trace back to program objectives, enabling practitioners to demonstrate learning gain and support Kirkpatrick Level 2 and Level 3 evaluation claims.

How It Works

Stage 1 — Objective Mapping (Deterministic)

The agent validates that the facilitation guide input has approvalStatus: "approved" and builds an objective map from the activityObjectiveMap structure — linking each program objective to the activities designed to develop it. Every instrument item is traceable to a specific objective.
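A TypeScript sketch of the gate and the map construction, assuming activityObjectiveMap arrives as a list of activity-to-objectives pairs (the exact structure handed over from Agent 05 may differ):

interface ActivityObjectiveEntry { activity: string; objectives: string[]; }
interface FacilitationGuideInput {
  approvalStatus: string;
  activityObjectiveMap: ActivityObjectiveEntry[];
}

function buildObjectiveMap(guide: FacilitationGuideInput): Map<string, string[]> {
  // Prerequisite gate: unapproved guides never reach instrument generation.
  if (guide.approvalStatus !== "approved") {
    throw new Error("Facilitation guide must be approved before evaluation instruments are generated.");
  }

  // Invert the activity-to-objectives pairs so each objective lists the activities designed to develop it.
  const byObjective = new Map<string, string[]>();
  for (const entry of guide.activityObjectiveMap) {
    for (const objective of entry.objectives) {
      const activities = byObjective.get(objective) ?? [];
      activities.push(entry.activity);
      byObjective.set(objective, activities);
    }
  }
  return byObjective; // every instrument item is later traced back to a key in this map
}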

Stage 2 — Instrument Generation (LLM)

Using Claude Sonnet, the agent generates four instruments simultaneously: (1) pre-session baseline, (2) post-session outcome instrument, (3) session reaction survey, and (4) facilitator observation checklist.

Stage 3 — Validation

A coverage validator checks that every program objective has at least one item in both the pre and post instruments. A structure validator checks item formatting, response type consistency, and instruction completeness.
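The coverage check reduces to a membership test per objective, as in this illustrative TypeScript sketch (the item shape is an assumption):

interface InstrumentItem { text: string; objective: string; }

function checkCoverage(
  objectives: string[],
  preItems: InstrumentItem[],
  postItems: InstrumentItem[],
): string[] {
  const gaps: string[] = [];
  const covered = (items: InstrumentItem[], objective: string) =>
    items.some(item => item.objective === objective);

  for (const objective of objectives) {
    if (!covered(preItems, objective)) gaps.push(`Pre-session instrument missing: ${objective}`);
    if (!covered(postItems, objective)) gaps.push(`Post-session instrument missing: ${objective}`);
  }
  return gaps; // any entries here fail coverage validation and block export
}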

Stage 4 — File Export

The evaluation package is exported as a KH-branded DOCX (all four instruments in one document) and an XLSX file (all instruments as separate tabs), both uploaded to the reports/ Supabase Storage bucket.

Inputs Required

Field | Type | Description
facilitationGuide | Structured data | Approved output from Agent 05 — provides program name, objectives, delivery mode, participant count, and activityObjectiveMap
evaluationUse | Checkboxes | Intended evaluation purposes: Learning Gain, Participant Reaction, Application Intent
organizationalContext | Select | Federal, Corporate, Nonprofit, Healthcare — calibrates language register and item framing
engagementId | UUID | Links the evaluation package to the engagement record

Outputs Produced

Output | Format | Description
Pre-Session Instrument | Survey doc | Baseline knowledge and attitude items — administered before the session begins
Post-Session Instrument | Survey doc | Parallel items to pre-session — computes learning gain by differencing responses
Session Reaction Survey | Survey doc | Participant experience items: relevance, facilitator effectiveness, environment, and overall value
Facilitator Observation Checklist | Checklist doc | Behavioral indicators the facilitator or observer monitors during delivery
Evaluation Package DOCX | KH-branded file | All four instruments in a single formatted document — uploaded to reports/ bucket
Evaluation Package XLSX | Spreadsheet | All four instruments as separate tabs — ready for data collection and analysis
Strategy Summary | Text | Narrative describing the evaluation approach, instrument rationale, and scoring guidance

Theoretical Foundations

Theory / Framework | Source | Application
Kirkpatrick's Four Levels | Kirkpatrick (1959); Kirkpatrick & Kirkpatrick (2016) | Package measures Level 1 (Reaction), Level 2 (Learning), and lays groundwork for Level 3 (Behavior)
Pre-Post Quasi-Experimental Design | Campbell & Stanley (1963) | Pre and post instruments are structurally parallel to enable learning gain calculation. Design limitation (no control group) noted in strategy summary.
Transfer of Training | Baldwin & Ford (1988) | Application intent items operationalize the motivation-to-transfer construct: "I intend to use X within Y weeks" format
Objective-Referenced Assessment | Popham (1978) | Every item in the pre/post instruments maps directly to a stated learning objective
Brinkerhoff's Success Case Method | Brinkerhoff (2003) | The facilitator observation checklist surfaces best-case and worst-case behavioral indicators during delivery

Practical Notes

Prerequisite: The Evaluation Package Builder is designed to be run after the Facilitation Guide is approved. Its activityObjectiveMap is the critical linking structure — without accurate activity-to-objective mappings, coverage validation will fail.