Take 02: Reinforcement Learning from Checklist Feedback


I’m going to preface this by saying I haven’t completely thought this through.

I read a paper, a partnership between Apple and Carnegie Mellon, about a different way to train AI. The typical approach is reinforcement learning from human feedback: the AI gets a prompt, gives a response, and a human scores it. What this paper proposes instead is reinforcement learning from checklist feedback. The AI builds an atomic checklist of strictly yes-or-no items for a task, then works through it until everything’s done. No vague criteria. Just: did this happen or not.

That got me thinking about students.

The obvious version is: students create their own rubric and get graded on it. There’s already research on that. What I think is more interesting is the atomic checklist part specifically. Not a rubric with categories and point values, but a list of yes/no steps that actually have to be completed to finish the project. To make that list, you have to genuinely understand the task. You have to be able to do it.
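To make the contrast with a rubric concrete, here is a minimal sketch in Python of what an atomic checklist could look like as data. Everything in it is hypothetical: the `ChecklistItem` and `Checklist` names, the sample capstone items, and the completion rule are an illustration only, not anything taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ChecklistItem:
    """One atomic step: it either happened or it did not."""
    description: str
    done: bool = False


@dataclass
class Checklist:
    """A hypothetical student-authored checklist for one project."""
    task: str
    items: list[ChecklistItem] = field(default_factory=list)

    def add(self, description: str) -> None:
        # Every new item starts as "not done"; there is no partial credit.
        self.items.append(ChecklistItem(description))

    def mark_done(self, description: str) -> None:
        # Flip the first matching item to "done"; fail loudly on a typo.
        for item in self.items:
            if item.description == description:
                item.done = True
                return
        raise KeyError(description)

    def remaining(self) -> list[str]:
        # The steps still answering "no".
        return [item.description for item in self.items if not item.done]

    def is_complete(self) -> bool:
        # Complete only when the list is non-empty and every answer is "yes".
        return bool(self.items) and all(item.done for item in self.items)


# Hypothetical capstone example (items invented for illustration).
capstone = Checklist("Capstone research poster")
capstone.add("Research question is stated in one sentence")
capstone.add("At least three peer-reviewed sources are cited")
capstone.add("Poster includes one chart built from the project's own data")

capstone.mark_done("Research question is stated in one sentence")
print(capstone.remaining())
print(capstone.is_complete())
```

Notice there are no point values or categories anywhere. The only question the structure can answer is whether each step happened, which is the part a student would have to think through in order to write the items at all.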

My theory is that this builds buy-in. Students struggle to care about assessments. Faculty struggle to explain why their assessments matter. If the student makes the checklist, they’ve already done part of the thinking. Maybe that changes something.

I don’t know what this looks like in practice. It’s probably different for every course, every assessment. Might only work in capstone situations. Might create more work than it’s worth. I genuinely don’t know. I just think the atomic checklist part is interesting enough to keep stewing on.

If you’ve tried something like this, or you know of research that already covers it, correct me if I’m wrong. I’d actually like to hear about it.

Articles mentioned