The Bottleneck Was Never Intelligence
Every Monday for six months, I started from zero.

01
I’d open a new Claude session, type three paragraphs explaining what my company builds, what we decided last week, which branch of code has the bug I need fixed, and what our psychologist advisor said about legal liability in California. By the time I finished re-explaining the context, I’d lost thirty minutes and most of my momentum.
My co-founder and I are building Mauralink — a two-person B2B SaaS startup that measures manager-employee compatibility using psychometric assessments. The product has eight behavioral scales, a narrative generation engine powered by LLM calls, 443 pre-authored content atoms extracted from 100+ research papers, and a customer discovery pipeline with 23 interviews across enterprise and mid-market segments. Every week, I’m context-switching between backend architecture, React components, customer interview synthesis, marketing copy, legal questions, and sprint planning.
And every week, the AI that was supposed to help me build faster couldn’t remember any of it.
This is the problem nobody talks about when they talk about AI-assisted development. The bottleneck isn’t the model’s intelligence. It’s context. Every conversation starts from a blank slate. Long sessions compress and degrade. The AI hallucinates things you never said. You spend more time re-establishing context than doing actual work.
In late 2024, I stopped accepting this. I built a system — a folder of markdown files that Claude reads on startup and writes to throughout the day. Six orchestration files that give an AI agent persistent state across sessions. A daily rhythm of logging, consolidating, and pruning that keeps the whole thing clean.
It took me four years to get here. This is what I learned.
02
Each phase required abandoning the mental model from the phase before it.
2022
Copy-paste. ChatGPT 3.5 as a smarter search engine. Ask, receive, fix the bugs yourself. Zero persistence. The AI was a tool you picked up and put down.
2023
Conversation. Longer sessions, system prompts, few-shot examples. The quality of what you put in started to determine what came out. But still single-session, still ephemeral. Every Monday felt like day one.
2024
System. Claude Code. Filesystem access. Project-level instruction files. The AI could read your code and write to your files. One day I tried something primitive: I gave Claude a file to read at the start of every session and asked it to write to that file at the end. Suddenly, sessions had continuity. Not great continuity — but continuity.
2025–Now
Context engineering. The system I use now. An Obsidian vault as a shared knowledge base. Six orchestration files that auto-load on startup. Custom skills that automate session logging, daily consolidation, and multi-persona debates. AI as a persistent team member with structured memory and multi-session continuity.
You can’t build persistent context if you still think of AI as a search engine. You can’t do context engineering if you still treat every conversation as disposable.
03
It’s not prompt engineering. It’s not vibe coding.
Context engineering is controlling the information environment your AI operates in so its output improves every session.
The implementation is embarrassingly simple: a folder of markdown files.
I use Obsidian because it treats a folder of .md files as a knowledge base with bidirectional links, search, and a graph view. No proprietary format, no database lock-in. Claude reads and writes markdown natively. So the vault becomes a shared filesystem — the human and the AI working on the same source of truth.
The vault is organized by domain: product architecture, customer discovery findings, marketing brand script, legal advisor notes, research papers, engineering specs. But the part that changes everything is six small files in a directory called _Claude/:
status.md: Current state of every workstream. What’s done, what’s next, what’s blocked.
tasks.md: Prioritized task list with tags.
decisions.md: Every significant decision with the evidence behind it.
context-map.md: Which vault files to load for each type of work.
scratchpad.md: Working memory, cleared daily.
learnings.md: Operational knowledge: SDK quirks, CLI bugs, Supabase JWT signing quirks. The tribal knowledge that, on a real team, lives in someone’s head.

These auto-load on every session start. Claude reads them, orients itself, and picks up work without re-explanation. When a decision happens mid-session, it’s written to decisions.md immediately. When context compresses (every long session hits this), Claude writes working state to scratchpad.md and status.md first — so the session survives even after earlier conversation gets summarized away.
That’s it. Files on disk. No RAG pipeline. No vector database. No AI memory product. Just structured markdown that both the human and the AI can read, edit, and trust.
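The auto-load step is simple enough to sketch in a few lines. This is my illustration, not the author's actual implementation: a function that reads the six orchestration files in a fixed order and concatenates them into a preamble for a new session. The function name and the tolerance for missing files are assumptions.

```python
from pathlib import Path

# The six orchestration files named in the article, loaded in a fixed order.
ORCHESTRATION_FILES = [
    "status.md", "tasks.md", "decisions.md",
    "context-map.md", "scratchpad.md", "learnings.md",
]

def build_session_preamble(vault: Path) -> str:
    """Concatenate the _Claude/ orchestration files into one block of text
    that can be prepended to the first message of a new session."""
    sections = []
    for name in ORCHESTRATION_FILES:
        path = vault / "_Claude" / name
        if path.exists():  # tolerate a missing file rather than failing startup
            sections.append(f"## {name}\n{path.read_text()}")
    return "\n\n".join(sections)
```

The fixed ordering matters: the agent always sees current state (status.md) before the task list, so stale tasks get read against fresh context.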
04
Nine personas. Two rounds. One decision brief.
Last month I needed to redesign our product’s main page. Instead of asking Claude what to do, I ran /debate, a custom skill that spawns multiple AI sub-agents as distinct personas to argue a decision.
Nine personas. A Customer Voice grounded in real interview quotes from a VP at Stryker and a CPO at Techstars. A Devil’s Advocate who cited that zero out of 23 customer conversations had mentioned the page in question. An Information Architect who pulled perceptual science research on why radar charts rank last in human comprehension accuracy. Three personas synthesized from actual customer interviews — speaking in their actual language and priorities.
Two rounds of structured debate. Challenges, concessions, evidence citations. The output was a decision brief with a recommendation, dissenting opinions, and implementation spec. I read it, agreed, overruled one detail, and we shipped that day.
I’ve run 24+ of these debates. Not just product design — knowledge base architecture, framework migration decisions (we cut 34 transitive dependencies), sprint roadmap prioritization. Every brief gets saved to the vault. Every subsequent session can reference the reasoning.
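The mechanics of a multi-persona debate can be sketched as a loop over personas with a pluggable model call. Everything here is illustrative: the persona briefs, the round count, and the `ask` callable are placeholders for however the actual skill invokes sub-agents, not its real interface.

```python
from typing import Callable

def run_debate(question: str,
               personas: dict[str, str],        # persona name -> persona brief
               ask: Callable[[str, str], str],  # (brief, transcript) -> reply
               rounds: int = 2) -> str:
    """Rounds of structured debate: each persona sees the running transcript
    and responds in turn. The result is a transcript a human reviews to
    produce the final decision brief."""
    transcript = f"Question: {question}"
    for rnd in range(1, rounds + 1):
        for name, brief in personas.items():
            reply = ask(brief, transcript)
            transcript += f"\n[Round {rnd}] {name}: {reply}"
    # The human reads the full transcript and makes the final call.
    return transcript
```

Because every persona responds to the accumulated transcript rather than the bare question, round two naturally produces the challenges and concessions the article describes.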
Two Saturdays ago at 2 AM, I was debugging an issue where personality profiles were coming out inverted. People scored as egalitarian were getting descriptions of hierarchical managers. Claude traced the root cause through three files — the atom direction filter was checking the wrong pole. Fixed it, then spun up a four-agent audit team to verify no other inversion paths existed in the pipeline. All eight scales came back clean.
The handoff note it wrote meant Sunday morning, I picked up exactly where we left off.
The daily rhythm holds it all together. Session starts: Claude loads the orchestration files and yesterday’s handoff. Session ends: /session-log snapshots everything — files touched, decisions made, thinking that evolved. End of day: /end-of-day reads all session logs, distills a daily summary, prunes completed tasks, archives absorbed decisions, clears the scratchpad. Monday morning, the vault is clean and current.
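A minimal sketch of the end-of-day consolidation, under assumptions: the directory layout (a sessions/ folder of logs inside _Claude/), the summary file name, and the "archive by deletion" shortcut are all mine. The real skill also prunes tasks and archives decisions, which this sketch omits.

```python
from pathlib import Path
from datetime import date

def end_of_day(claude_dir: Path) -> Path:
    """Distill today's session logs into a daily summary, then clear the
    scratchpad so tomorrow starts clean. Paths are illustrative."""
    logs = sorted((claude_dir / "sessions").glob("*.md"))
    summary = "\n\n".join(p.read_text() for p in logs)
    out = claude_dir / f"daily-{date.today().isoformat()}.md"
    out.write_text(f"# Daily summary\n\n{summary}")
    for p in logs:
        p.unlink()  # in the real system the logs would be archived, not deleted
    (claude_dir / "scratchpad.md").write_text("")  # working memory, cleared daily
    return out
```

The key property is that raw logs are disposable once distilled: the next session reads one short summary instead of every transcript, which is what keeps the context budget flat over months.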
05
I want to be precise about the division of labor.
I’m the orchestrator. I conduct the customer discovery interviews. I sit in the advisory meeting with the I-O psychologist and understand why hiring assessments create legal liability in California. I read every transcript and decide which insights reshape the product. I set sprint priorities based on what closes the first 10 customers. I review every line of code and reject what doesn’t meet the bar.
Claude is the execution partner. It builds what I design, researches what I’m curious about, and pushes back when my thinking has holes. When I said we should rip out Google’s framework and use raw Anthropic SDK calls, Claude didn’t just do it — it asked whether I’d considered the implications for our structured output pipeline, pointed out that output_config works differently than Instructor, and suggested running the full test suite before merging. That pushback is the point. I want a partner who pressure-tests my ideas, not one who rubber-stamps them.
Claude never decides what to build. It never sets product direction. It never chooses which customer insight matters most. Those are mine. Claude provides the throughput to execute those decisions at a pace that would normally require a team of five.
06
Here’s what I don’t do anymore.
My co-founder and I are building a product with a psychometric engine, an AI narrative pipeline, a knowledge base, an admin dashboard, a comparison tool, a marketing site, and active customer discovery across enterprise and mid-market. Pre-revenue. Two people.
Two people can carry that scope not because AI is doing our thinking. It’s because AI eliminated the bottleneck between having a thought and executing it.
07
Forget the skills, the debates, the daily loop. Those came later. Start here:
Create a file called status.md. Write down what you’re working on, what’s done, and what’s blocked. Tell your AI agent to read it at the start of every session and update it at the end.
That single file is the difference between an AI that starts from zero and an AI that starts from where you left off. Everything else I’ve described — the vault, the orchestration layer, the debate system — grew from that one habit.
The model is the same for everyone. The context is everything.