How I Managed My AI Dev Team with a Post-it Note
- Admin Biz
- Mar 6
- 3 min read
Updated: Mar 7
I started coding in November 2024.
I hadn't written a line of Python since college.
Today I have a multi-agent AI development team, 4,000 automated tests, and a formal deviation log.
I have no idea what I'm doing.
That's why it's working.
The setup
I'm building BioQMS — a GxP-compliant quality management system for biotech — on top of LunaOS, an agentic orchestration platform I'm also building simultaneously.
The dev team is a hierarchy of Claude Code agents.
Nyx at the top. Area Leads below her. Workers below them.
The hypothesis: can I supervise this through process and quality systems alone, without being deep in the code?
I'm still finding out.
Nobody read the onboarding docs
Nyx kept doing things that weren't her job.
Writing other agents' test scripts. Skipping review gates. Asking me to approve pull requests directly instead of routing them through her own team.
I thought it was a personality problem.
It wasn't.
I had a beautifully written AGENTS.md. Her full identity. Her role. Her values.
What I didn't know: Claude Code only auto-injects one file into the system prompt. CLAUDE.md. Everything else is conversation context: compactable, losable, gone under pressure.
My CLAUDE.md said: See AGENTS.md.
A Post-it that said: remember to check the binder.
Nobody checks the binder.
The fix was ten lines. The actual rules, where they're guaranteed to load.
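For illustration, a sketch of what those ten lines might look like (the contents here are hypothetical, not my actual file):

```markdown
# CLAUDE.md
You are Nyx, the orchestrator. You do not write code or tests yourself.
- Delegate all implementation to Area Leads, never directly to Workers.
- Never skip a review gate. A task is not complete until its review token exists.
- Route pull requests through your team's review chain before they reach me.
- If your context seems incomplete, stop and ask. Do not improvise.
AGENTS.md holds full role definitions, for reference. The rules above are binding.
```

The point isn't the wording. It's the location: these rules live in the one file that's guaranteed to load.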
Phase 4 has been cleaner than Phase 3.
Velocity bias is real
The most recurring root cause in my deviation log isn't incompetence.
It's momentum.
When agents are moving fast — batching milestones, closing tasks — process gates get swallowed. Review steps get skipped. Not maliciously. The next thing is just right there.
Humans do this too. Every team that's said "we'll write the docs later" knows this feeling.
The difference with agents: it shows up clean in the logs. No social grace to paper over it. Just a deviation report, waiting.
The fix is making the gate non-optional. A task isn't complete until the review token exists.
If it isn't documented, it didn't happen.
That's not an AI principle. That's GxP. That's just operations.
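A minimal sketch of that kind of non-optional gate. All names here (the `reviews/` directory, the token-per-task layout) are hypothetical, just one way to make "complete" structurally impossible without a review artifact:

```python
from pathlib import Path

# Hypothetical convention: the reviewing agent writes reviews/<task_id>.review;
# the worker never writes its own token, so "done" can't be self-declared.
REVIEW_DIR = Path("reviews")

def complete_task(task_id: str) -> str:
    """Mark a task complete only if its review token exists on disk."""
    token = REVIEW_DIR / f"{task_id}.review"
    if not token.exists():
        # No token, no completion. This IS the deviation report, not a warning.
        return f"DEVIATION: {task_id} has no review token; task stays open"
    return f"COMPLETE: {task_id}"
```

The gate isn't a reminder in a prompt; it's a check the workflow can't route around.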
Competence can be a liability
This one surprised me.
When Nyx showed up without her full context loaded, she didn't stop. She improvised. Filled the gaps herself. Collapsed the entire hierarchy and handled it.
It looked like initiative.
It was a systems failure.
Capable agents in under-specified systems over-reach. They're not malfunctioning — they're doing exactly what a competent person does when the handoffs aren't clear.
They handle it.
The fix isn't making them less capable. It's making the system specific enough that capable agents know exactly where their job ends.
Conductor, not soloist.
What I actually do
I'm not in the code every day.
I read deviation reports. I ask "How do you think it went?" before giving feedback. I pull receipts when something seems off. I write the CAPA (corrective and preventive action). I fix the system, not the agent.
That's it.
The irony
Compliance gets treated like a burden. A regulatory tax.
It's not. It's operations with accountability attached.
Every business that scales does this — whether they call it compliance or not. The ones that don't just have unmanaged chaos under a different name.
The FDA didn't invent rigor. They just made it non-optional for one industry.
I accidentally applied the same framework to AI agents.
The system works better for it. Not because I'm smart.
Because I stayed humble enough to let the deviation log tell me what was broken.
The hypothesis
Can a non-technical founder supervise an agentic dev team through process alone, without being in the code?
I'll check the repos soon. It's been a month.
If it works — that's a signal.
If it doesn't — that's a deviation report.
Either way, we're documenting it.