Give an AI coding agent a real LIS task, watch it plan and loop, then reflect on where it needed supervision — and where you'd trust it to run alone.
Points: 2
Grading Criteria
Grading focuses on the reflection — depth of observation about how the agent worked (plan, loop moment, failure, trust question), not the polish of the final artifact. A messy result with a sharp reflection scores higher than a polished output with a thin one.
Core Competencies
- Skilling and Productivity (primary)
- Critical Analysis, Fluency, and Evaluation
- Technical Understanding