New agentic coding features drop every week, and Claude Code keeps adding capabilities that catch my attention. But three MCP servers have been constants in my workflow for months now.
The Joy of Agentic Coding
I have to say upfront: agentic coding with AI tools like Claude Code is genuinely joyful. Watching Claude implement features, fix bugs, and even write tests feels almost magical. There's something deeply satisfying about describing what you want and seeing working code emerge.
But here's what I learned the hard way - when you completely let go of the reins and expect everything to just work without any effort on your part, that's when frustration hits. You're not writing code anymore, but you're definitely still working. The question is: where exactly do you fit in the loop?
My Test-Driven Development Experiment
When I first started with Claude Code, I tried following Anthropic's early advice about running eight terminal windows for test-driven development. I was actually excited about this - I genuinely loved TDD when I was coding manually.
The setup was elegant in theory: one terminal for writing tests, another for implementation, a third for running tests, and so on. Claude would write failing tests first, then implement features to make them pass.
What happened was... educational. I ended up with hundreds of tests, but as I kept adding features, development moved ahead while the tests stopped reflecting the evolving codebase. I fell into the classic TDD trap: writing tests at the wrong abstraction level, so they ended up constraining development rather than enabling it.
Looking back, I think I skipped or poorly handled the refactoring phases. The tests became anchors instead of safety nets.
Two Approaches: Short Iterations vs Long Agentic Sessions
Through experimentation, I've identified two distinct approaches to using Claude Code:
Short Iterations - Tightly Supervised: You actively monitor what Claude is doing, interrupt when needed, and course-correct frequently. This feels safer but can be slower.
Long Agentic Coding Sessions - Trust and Verify: Let Claude run for extended periods, only stepping in when it specifically asks for help or feedback. Much faster when it works, but requires better upfront setup.
I've gravitated toward the longer sessions, but with a crucial realization: the human-in-the-loop moments are absolutely critical in agentic coding. It's not about micromanaging the code - it's about being strategic about where you add value.
My Human-in-the-Loop Workflow Design
After months of iteration, here's where I've found I add the most value:
1. Requirements and Feature Ideas (Human-Led)
I define what I want to build, but I don't jump straight into implementation. Instead, I use the Linear MCP server to generate detailed tickets with user stories, acceptance criteria, and technical guidance. Claude analyzes the current codebase and provides implementation direction, but I make the strategic decisions about what features matter.
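To make that concrete, the prompt that kicks off a ticket looks something like this (the feature and project names here are placeholders, not a real ticket):

```text
Look at the current codebase, then create a Linear ticket in the "MyApp" project
for a password-reset flow. Include a user story, acceptance criteria, and technical
guidance on which existing modules to touch. Don't write any implementation code yet.
```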
2. Implementation (Claude-Led with Strategic Tools)
This is where the MCP servers shine. Claude handles the actual coding using:
- Context7 MCP: For up-to-date documentation and best practices
- Playwright MCP: For browser testing and validation
- Linear MCP: For referencing requirements and updating progress
I've learned that letting Claude do web searches alongside Context7 really boosts implementation accuracy. Yes, this introduces some prompt injection risk, but I catch those issues during validation.
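When it's time to build, the kickoff prompt is usually just a pointer at the ticket plus instructions to lean on the tools. A sketch of what that looks like (the ticket ID and stack details are made up):

```text
Implement LIN-123 from Linear. use context7 for the current Next.js App Router docs
before writing code, verify the finished flow in the browser with Playwright, and
update the ticket status once the acceptance criteria pass.
```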
3. Validation and Pull Requests (Human-Led)
This is where I invest my time, but not in line-by-line code review. Instead, I pull the code, run it, test it manually, and check whether it actually solves the problem I defined. The GitHub Actions run automatically; I focus on whether the feature works as intended.
The Magic of Playwright Testing
The Playwright MCP server creates something I didn't expect - a genuine closed-loop development cycle. When Claude can control a real browser, fill forms, click buttons, and read console errors, it can debug and iterate without me.
This enables those longer agentic sessions I mentioned. Instead of me manually testing and reporting back what's broken, Claude can test its own work and fix issues autonomously. It's actually faster than the old feedback loop.
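The MCP server does all of this through tool calls rather than a checked-in test file, but the actions map directly onto the regular Playwright API. Roughly, a single validation pass looks like the sketch below (the URL, form labels, and success condition are placeholders, not from my actual project):

```typescript
// Rough equivalent of what a Playwright MCP session does via tool calls.
// URL, labels, and the success check are illustrative placeholders.
import { chromium } from 'playwright';

async function checkSignupFlow() {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  // Collect console errors so failures can be reported back into the session.
  const consoleErrors: string[] = [];
  page.on('console', (msg) => {
    if (msg.type() === 'error') consoleErrors.push(msg.text());
  });

  // Drive the app like a user: navigate, fill the form, submit.
  await page.goto('http://localhost:3000/signup');
  await page.getByLabel('Email').fill('test@example.com');
  await page.getByLabel('Password').fill('correct horse battery staple');
  await page.getByRole('button', { name: 'Sign up' }).click();

  // Close the loop: the success criterion plus any console errors feed the next iteration.
  await page.waitForURL('**/welcome');
  console.log('Console errors:', consoleErrors);

  await browser.close();
}

checkSignupFlow().catch(console.error);
```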
What I Use and Why
Context7: Eliminates the guesswork about current library APIs and patterns. When I say "use Context7" in my prompts, Claude gets real, up-to-date documentation instead of making educated guesses.
Playwright MCP: For web applications (which I typically build), this handles the testing and validation that would otherwise require constant manual intervention.
Linear MCP: Forces me to think through features before building them. The structured approach has made my projects more focused and less prone to feature creep.
My .mcp.json:

```json
{
  "mcpServers": {
    "context7": {
      "type": "sse",
      "url": "https://mcp.context7.com/sse"
    },
    "linear": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://mcp.linear.app/sse"
      ]
    },
    "playwright": {
      "command": "npx",
      "args": [
        "-y",
        "@executeautomation/playwright-mcp-server"
      ]
    }
  }
}
```
Areas I'm Still Exploring
Sub-agents: I've experimented with parallel development using separate backend and frontend agents. When they work, they're fantastic for research tasks since they can run simultaneously. The challenge is getting them to not over-engineer everything. I think this comes down to better prompting, which I'm still figuring out.
Larger Codebases: My experience is mostly with MVP-style projects. I honestly don't know how well this workflow scales to complex, enterprise-level applications.
The Broader MCP Ecosystem
Looking at what others are building gives me confidence this agentic coding approach has legs. IndyDevDan is betting his next decade on agentic software. Microsoft shipped an official Playwright MCP server. Linear built their own MCP integration. Upstash's Context7 is being adopted across multiple AI platforms.
The Model Context Protocol is solving something fundamental - instead of every AI tool needing custom integrations for every service, we get universal connectors. One Linear integration works with Claude, Cursor, Windsurf, and whatever launches next month.
What's Working for Me
The sweet spot I've found is being very intentional about requirements upfront, letting Claude handle implementation with good tooling, then being thorough about validation. I'm not trying to control every line of code - I'm designing a system where Claude can succeed. This is the new job of a developer.
Most importantly, I've stopped expecting perfection without effort. Agentic coding is incredibly powerful, but it's still development. The joy comes from the collaboration, not from complete automation.
Still Learning
This workflow probably has tons of room for optimization. I'm definitely not doing everything optimally, and I'm sure there are obvious improvements I'm missing. But for the first time, I feel like I have a sustainable approach to AI-assisted development that actually ships working software.
What patterns are you finding in your own AI development workflow? I'd love to hear what's working for others in this space.