Coding Agents Love God Components

A pattern

One predominant anti-pattern that I've tended to witness using coding agents is that, even given strong initial planning documents, over enough iterations of conversation and implementation, they tend to converge on God Components, i.e. single large files that own too much data and too much responsibility. This happened to me in brandmodal, the AI agent team for online presence platform I'm building with Kukesh Kodeswaran. One component started out just walking new users through onboarding. A dozen "actually the flow needs this too" requests later, the same file owned the wizard state, brand capture, provisioning, completion, resume logic... every feature in the one place that was already open. That one took time to refactor and deepen, but the codebase is much stronger for it.

The refactor only treated the symptom. The real cost had been accumulating silently the whole time. Each new request adds a little more to the file that's already open, and along the way basic programming principles, especially the single-responsibility principle (SRP) and testability, quietly get abandoned. It is (hopefully) not a conscious decision; the file just accretes, one iteration and one feature at a time. And a strong planning doc doesn't seem to get you out of it. A recent analysis of LLM- and agent-driven development found that as models get more capable they generate "increasingly bloated and coupled code," and that "neither functional correctness nor detailed prompting mitigates this decay" (Zhu et al., AI-Generated Smells, 2026). That tracks with my experience: the better the spec, the more it surprises you when the God Component turns up anyway.

I felt all of this long before I could name it. I'd kept hitting the same wall when quickly whipping something together with Claude Code or Codex, as the codebase increased in complexity, without a clear way to fix it. These models are verbose, and the output is often hard to audit, especially on frontend work, coming from a background mainly in Python. I then stumbled onto Matt Pocock's /skills, started using them, and only afterwards watched his talk, Software Fundamentals Matter More Than Ever. It put words to exactly the discomfort I'd been feeling, and added a lot I hadn't worked out yet. If you're building with coding agents, I can't recommend the talk and his skills enough. A fair bit of this piece is me thinking through what it crystallised.

Why it happens

It appears that the path of least resistance for a coding model is to add to what's already existing, not to carve a new seam. In doing so it directly violates the principle of encapsulation, which can be framed from first principles as: "An object alone is responsible for its own data and methods." A God Component is the inverse, becoming responsible for practically everything's data and methods.

And the agent has little incentive to do otherwise. It optimises locally, against the immediate ask (in essence, "make the feature work" or "make the test pass"), without a running model of the architecture as something to be preserved or expanded upon. The AI-Generated Smells study names the bias directly: AI engineering is "often measured by functional correctness, overlooking the critical issue of long-term maintainability". The agent optimises what it's graded on (read: benchmaxxing), and that's not typically maintainability. Creating the new necessary module is a cost it has to pay now for a benefit it won't be measured on, but appending to the file already in context is free and invisible. So it appends by default, and the file grows.

The industry data points the same way: across 211 million changed lines, the share of code associated with refactoring fell from around 25% in 2021 to under 10% in 2024, while copy-pasted code rose to overtake moved (reused) lines for the first time (GitClear, 2025). Zhu et al. saw the same inline-duplication habit up close, agents rewriting logic in place rather than pulling it into a helper. Agents are good at adding lines; reuse and restructuring are what slip.

Abstraction, on the other hand, is "hiding the complex reality while exposing only the necessary parts" and the industry name for a module that does this well is a deep module — a simple interface over a deep implementation — coined by Ousterhout in A Philosophy of Software Design. A God Component is the shape agents tend to reach for instead: a shallow module, where the interface sprawls as wide as what's behind it, because every new responsibility is bolted onto the same surface rather than hidden behind a new one.

Illustration labelled Deep Module and Shallow Module: the Deep Module has a compact control surface over a large hidden implementation, while the Shallow Module has a broad exposed control surface over thin internals.

And it can be subtler than one bloated file. Zhu et al. found agents will split work across several files and still miss the point in a "modular mirage" where "file separation does not equate to logical separation." The directory looks clean while the responsibilities stay tangled.

Why it actually costs you

At scale, the dominant failure Zhu et al. measured was exactly this shape, in which agents "centralize complex logic into singular 'manager' classes rather than delegating behavior, resulting in brittle, tightly coupled components that are difficult to test". The more an agent wrote, the worse it got: code volume was the single strongest predictor of architectural decay (ρ = 0.94).

Among other effects, the impacts of God Components are:

Component entanglement: errors from all the components are intertwined and difficult to disentangle their individual impact. One thing breaks and you can't say which concern caused it, because they all share the same blast radius.
Change amplification: software engineering has named this failure mode for fifty years, and a God Component triggers every version of it. Structurally it's tight coupling (Stevens et al., 1974); under maintenance it surfaces as the ripple effect, where one change propagates unpredictably into code you weren't touching (Yau et al., 1978); and Ousterhout names the symptom directly: change amplification, where a change that should be simple instead demands edits in many places. The sharp end of it is that the effect isn't even monotonic: because the concerns are entangled, improving one part can frequently make the whole worse rather than better. One file, no boundaries, so nothing stays local. (The machine-learning world rediscovered the same effect recently enough to give it a snappier name, as ML/AI research tends to do: the CACE principle, "Changing Anything Changes Everything" (Sculley et al., 2015), which describes how the hidden technical debt of ML systems — the inextricable entanglement of inputs — is what necessitates rigorous MLOps.)
Blame becomes impossible to assign: Modular infrastructure makes debugging and assigning blame easier, as each component has its own specification. A God Component erases that: when the bug report lands there's no boundary to point at, because the spec for "this part" is the spec for the whole file.

A fix

In his talk Matt Pocock makes the case that vibe coding doesn't have to mean abandoned principles, just that the guardrails have to be encoded as repeatable steps rather than one-off cleanups you remember to ask for. His /skills do exactly that, and through iteration I landed on a loop that chains a few of them around each piece of work:

Grill the plan before any code. /grill-with-docs interrogates the plan against the existing domain model, refines the terminology and hardens it into a detailed, unambiguous planning doc. The ambiguity you leave in can certainly become the ambiguity the agent fills with a God Component.
Break the doc into tracked units. Turn it into discrete Linear projects and items, so the work has seams before the code does; each item one vertical slice, not "go build the feature."
Pin each overarching feature to its tests. Set a /goal on the Linear item and the doc, with the tests that define done, so the agent is aiming at a spec, not just "make it work."
Deepen after every feature/project is completed. Run /improve-codebase-architecture: it looks for where the design has gone shallow and proposes ranked ways to deepen it, so you decide which seam to cut rather than whether to go looking. Doing this earlier rather than later keeps each refactor small enough to actually land.

Repeated running of Step 4 is the point, as none of this is a one-time solution. The God-Component shape tends to creep back whenever larger feature additions impinge on existing architecture, so it has to be a standing part of the loop instead of a huge cleanup you finally do once individual components/files are too big to read. Concretely: in brandmodal that onboarding component (an OnboardingProvider) went from a ~3,000-line God Component to a ~1,200-line orchestrator, with state, resume, transitions, provisioning and account-completion each extracted into its own focused module, backed by an ADR recording the decision.

That loop is mostly preventive as it keeps boundaries from collapsing as you build. However, one could presume that at this point, most codebases already have a God Component or two buried in them, and the modular mirage means you often can't spot them by reading the directory tree alone. graphify is a good complement on the detection side. It turns a codebase (or docs, or almost anything) into a knowledge graph and surfaces its god nodes: the most-connected components that everything flows through, ranked by connectivity. Because it maps the actual relationships rather than the file layout, it sees past the mirage, flagging a node as central even when its responsibilities are scattered across a tidy-looking set of files. So one line of attack could be to find the god node first, then point the deepening loop at it.

A deep module = simple interface, complex implementation. Steer the agent toward that on purpose, in the prompt, before it reaches for the open file. I can't claim this is a full fix (after all, as the AI-Generated Smells study showed, the models are pretty much trained to write code this way), but I've found it certainly helps, both improving the codebase and keeping me (relatively more) sane while working on it!

Takeaways

Agents add; they rarely subtract. Left unsupervised they keep collapsing boundaries until changing anything changes everything. The architecture is one of the things they won't have the taste to decide for you yet, so you will need to help them get it.

References

Stevens, W. P., Myers, G. J., & Constantine, L. L. (1974). "Structured Design." IBM Systems Journal 13(2), 115–139.
Yau, S. S., Collofello, J. S., & MacGregor, T. M. (1978). "Ripple Effect Analysis of Software Maintenance." Proc. COMPSAC '78, 60–65.
Ousterhout, J. K. (2018). A Philosophy of Software Design.
Sculley, D., et al. (2015). "Hidden Technical Debt in Machine Learning Systems." Advances in Neural Information Processing Systems 28 (NeurIPS 2015).
GitClear. (2025). AI Copilot Code Quality: 2025 Data Suggests 4x Growth in Code Clones. Industry report; 211M changed lines analysed.
Zhu, Y. C., Tsantalis, N., & Rigby, P. C. (2026). AI-Generated Smells: An Analysis of Code and Architecture in LLM- and Agent-Driven Development. arXiv:2605.02741. Beyond the findings quoted above, the paper reports a "Volume–Quality Inverse Law" (code volume predicts architectural smells, ρ = 0.94), that more capable models tend to produce denser, more bloated code, and that requirement specificity had no statistically significant effect on smell counts (p > 0.8).