i built a 30,000-line product in a month. here's what that actually looked like.
not a victory lap. not a vibe-coding essay. a diary of four weeks of building mazle with claude code, and where the real leverage actually showed up.
four weeks ago i had a spec, a domain, and a pretty clear idea of what mazle needed to do. i had no engineering team yet. i opened a terminal, started claude code, and just started building.
by week four there were about 30,000 lines of code in the repo and a working product on the other side of the screen. that's the headline. the interesting stuff is what happened in between.
week 1: the setup
the first week was not about shipping features. it was about setting up a loop i could iterate in.
i picked a boring stack. next.js 14, postgres, prisma, tailwind, vercel. nothing exotic. the goal was to pick the stack that the model had seen the most of, so it would make the fewest weird suggestions.
the main decision was about how i worked, not what i built. i settled on three rules pretty quickly:
- write a detailed spec before every meaningful feature. not a ticket. a document. what it does, what the edge cases are, what the data model looks like, what it explicitly will not do.
- never accept a diff i don't understand. if claude writes something i can't read line-by-line, i reject it and rewrite the spec until the output is readable.
- commit every 30 minutes. because i was going to want to revert, and i was going to regret it if i couldn't.
week one was slow. i shipped maybe four small features. but the loop was there.
week 2: the multiplier moment
the thing no one tells you about building with an ai coding agent is that the velocity is not linear. it's discontinuous.
for about ten days, it felt like i was building at roughly the speed of a competent solo engineer who already knew the codebase. fast, but not magic. i'd type a spec, claude would produce a diff, i'd review it, push back two or three times, merge it. 90 minutes for a feature that would take me a day.
then one afternoon in week two i wrote a single spec for the interview capture pipeline. audio ingest, transcription, diarization, structured extraction, storage, redis queue, the whole thing. four files, 900 lines of implementation, two tests passing. about 40 minutes.
that sprint, by hand, would have been a week. it was the first time i actually paused and thought: oh. this is different.
the multiplier wasn't the writing speed. it was the fact that i could hold the entire system in my head while the agent filled in the scaffolding underneath.
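for concreteness, the shape of that pipeline spec looked roughly like this. a sketch, not mazle's real code: the stage names come from the description above, but every type and implementation here is a placeholder i made up; the real stages call out to transcription and diarization services and hand off through the redis queue rather than awaiting in-process.

```typescript
// hypothetical shapes for the capture pipeline's stages
type Audio = { bytes: Uint8Array };
type Transcript = { text: string };
type Diarized = { turns: { speaker: string; text: string }[] };
type Extracted = { fields: Record<string, string> };

// stand-in for a speech-to-text call
const transcribe = async (a: Audio): Promise<Transcript> =>
  ({ text: "hello from speaker one" });

// stand-in for diarization (who said what)
const diarize = async (t: Transcript): Promise<Diarized> =>
  ({ turns: [{ speaker: "S1", text: t.text }] });

// stand-in for structured extraction over the diarized turns
const extract = async (d: Diarized): Promise<Extracted> =>
  ({ fields: { firstSpeaker: d.turns[0].speaker } });

// run the stages in order; in the real system each hop is a queue job,
// not a direct await
async function capture(a: Audio): Promise<Extracted> {
  return extract(await diarize(await transcribe(a)));
}
```

the point of speccing it as one linear chain was that each stage's output type is the next stage's input type, so the agent had no room to improvise the interfaces.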
week 3: where it broke
week three is the week every "ai-built a startup" essay leaves out.
auth broke. not in the code, exactly. in the model. i had two overlapping mental models for what a "user" was (one from the billing side, one from the team-invite side) and claude dutifully implemented both of them, halfway each. the bug surfaced three days later when someone invited a teammate and the teammate got another org's data in their dashboard.
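the bug class is easy to show in miniature. this is a hypothetical reconstruction, not mazle's code, and every name and record in it is invented: two parallel notions of "user" (billing's and the invite flow's), and a dashboard that resolves the org from the wrong one.

```typescript
// two overlapping models of "user", each with its own idea of org membership
type BillingUser = { email: string; orgId: string }; // billing side
type Member = { email: string; orgId: string };      // team-invite side

const billing: BillingUser[] = [{ email: "a@x.com", orgId: "org-1" }];
const members: Member[] = [{ email: "a@x.com", orgId: "org-2" }]; // invited into org-2

// buggy: the dashboard scopes data by the billing record's org,
// so an invited teammate sees org-1's data instead of org-2's
function orgForDashboardBuggy(email: string): string {
  return billing.find((b) => b.email === email)!.orgId;
}

// fix: one canonical membership model decides all org scoping
function orgForDashboard(email: string): string {
  return members.find((m) => m.email === email)!.orgId;
}
```

both functions are individually "correct." the bug only exists in the gap between the two models, which is exactly why the agent implemented both halves without flinching.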
that was a bad day.
state management got weird. we had optimistic ui in a few places where it shouldn't have been, and no optimistic ui in places where it should have been, because i'd never written a consistent spec for how the frontend should handle latency. so every feature had its own vibe.
the subtle bug that took me a full day to find was in the contradiction-detection logic. it was "working" the whole time. the outputs looked reasonable. but the scoring was drifting because of a cumulative rounding error in how we merged confidence scores across passes. no test caught it. claude certainly didn't catch it. i found it by staring at 300 rows of debug output and noticing a number that didn't match what it should have been.
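the failure mode is easy to reproduce in miniature. this is a hypothetical reconstruction of the bug class, not the actual scoring code: merging per-pass confidence scores multiplicatively, but rounding after every merge instead of once at the end. each individual rounding looks harmless; across many passes the error compounds.

```typescript
// bug pattern: round to 2 decimals after *every* pass
function mergeRoundingEachPass(scores: number[]): number {
  let merged = 1.0;
  for (const s of scores) {
    merged = Math.round(merged * s * 100) / 100;
  }
  return merged;
}

// correct: merge at full precision, round once for display
function mergeRoundingOnce(scores: number[]): number {
  const merged = scores.reduce((acc, s) => acc * s, 1.0);
  return Math.round(merged * 100) / 100;
}

const passes = Array(50).fill(0.998);
console.log(mergeRoundingEachPass(passes)); // 1   — every pass rounds back up
console.log(mergeRoundingOnce(passes));     // 0.9 — the real merged confidence
```

every intermediate value in the buggy version "looks reasonable," which is why no output eyeballing catches it; only comparing against a full-precision reference does.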
this is the part that the "agents will replace engineers" people don't seem to have lived through. the agent is great at writing code. it is not great at holding a skeptical, adversarial stance toward its own output. that's still your job.
week 4: what i had to rewrite by hand
by week four i had a pretty good sense of what claude was good at and what i had to do myself.
what it did well: boilerplate. api handlers. form components. migrations. anything where the shape was obvious and there was a canonical pattern. data transformations. tests, once i wrote the first one by hand and it had a model to imitate.
what i rewrote by hand: the pricing logic, because the implementation was correct but didn't match how i actually wanted to charge. the transcript reconciliation code, because the agent kept choosing a cleaner abstraction that cost us 300ms per call. the prompts themselves, because claude is unsurprisingly terrible at writing prompts for claude.
i'd say about 85% of the code in the repo was written by the agent, 10% was heavily edited agent output, and 5% was written by hand. but the 5% is the 5% that makes the product work.
the real lesson
the meme is "ai will replace engineers." that's the wrong frame.
the actual shift is closer to this: a founder with clear taste and the discipline to write a good spec can now ship what used to require a team of four. the engineer doesn't disappear. the bar moves. the person who used to be a "senior engineer" is now doing the work of a "staff engineer plus tech lead plus product manager," and the agent does everything below that line.
this is great news if you're a senior engineer. you just got a very powerful set of hands. it's terrifying news if you're a junior engineer whose job was "be the extra pair of hands." that tier is disappearing. fast.
for founders, the practical implication is: you can now build much more of your own product than you used to be able to. whether you should is a different question. the answer depends on how much you trust your own taste. mine was, apparently, good enough for a month. ask me again in a year.
the number
30,143 lines of typescript. 6,221 lines of sql migrations. 1,407 lines of tests. 4 configs. 1 founder. 1 claude.
would i do it again? yes, but differently. i'd write the auth spec twice as long. i'd commit every 20 minutes, not every 30. i'd write the first integration test before the first feature, not after. and i'd trust my skeptical instincts faster when something smelled wrong.
the code works. the product ships. that's the receipt.
but the actual work, the part that mattered, was all the stuff the agent couldn't do.