Even More On Coding Agents


You might have been wondering what has happened to my blogging enthusiasm. Have I run out of steam after the initial spree? The backlog of ideas to write about is full, so the answer is simple: I’ve been experimenting with coding agents and, I have to say, I’m having a blast!

Previous posts were focused on technical and security aspects of using coding agents. This time I’m going to share my perspective after using the technology extensively for several days.

First, second, and last reason this post exists. Also gives you deadline to write it for free.

Way Different Coding Experience

A brief bit of background about me: it’s been several years since programming was my day-to-day focus. While I’ve been fortunate to keep in touch with current technologies, my hands-on coding skills have inevitably become a bit rusty, as my daily agenda was mostly driven by calendar meetings [1].

So, naturally, I was really excited to get back to the keyboard, thinking about the good times when the code runs smoothly and the frustrating moments of chasing down bugs caused by a wrong conditional. I somehow skipped the phase of researching how people are using coding agents, decided on a project, and just got to work.

After some initial cautious steps, reading and manually accepting every code suggestion from the agent, I settled on using it in “auto-accept” mode for file changes. Quite quickly it became clear that working with it would be nothing like what I was used to.

The best framing I’ve found: you become the code reviewer, the agent becomes the developer [2]. It boils down to how much you enjoy doing code reviews. Since receiving and giving code reviews was one of the best parts of the developer job, adapting to this workflow was natural for me. I can also understand why folks who find joy in hand-wiring everything themselves might object.

Even though counting lines of code is always a bad idea, I couldn’t resist and checked the stats:

Date         Lines Added   Lines Removed
2025-10-07          2011             165
2025-10-08          1527             558
2025-10-10          1997             793
2025-10-11          1246             120
2025-10-12          1046             342
2025-10-13          2803            1155
2025-10-14          1946            1210
2025-10-15          2083             503

These numbers are misleading in predictable ways. The early project phase inflates additions—laying foundation, establishing structure, writing initial tests.

Still, the added lines are mostly reviewed code that passes tests and works [3]. The numbers confirmed what I felt: consistent, non-trivial daily output.
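For the curious, per-day numbers like the ones above can be pulled straight from git history. This is a minimal sketch, assuming the stats come from `git log --numstat`; the function names `sum_numstat` and `daily_line_stats` are made up for illustration, and this is not necessarily how the table was produced:

```shell
#!/bin/bash

# Sum numstat lines per date. Expects "@YYYY-MM-DD" header lines followed by
# "added<TAB>removed<TAB>path" lines on stdin. Binary files are reported
# as "-" by git and count as zero here.
sum_numstat() {
  awk '
    /^@/    { day = substr($0, 2); next }
    NF >= 3 { added[day] += $1; removed[day] += $2 }
    END     { for (d in added) printf "%s %d %d\n", d, added[d], removed[d] }
  ' | sort
}

# Per-day totals for the current repository. Extra arguments are passed
# through to git log, e.g. daily_line_stats --since=2025-10-07
daily_line_stats() {
  git log --numstat --date=short --pretty=format:'@%ad' "$@" | sum_numstat
}
```

Each output line has the form `date added removed`. Generated files and lock files inflate the totals unless they are excluded with a pathspec.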

Sharing this isn’t a subtle attempt to convince you just how productive a coding agent can make you—we’ll get to that later. My motivation was to form a genuine opinion about how useful it really is for day-to-day work.

Let me share the details about how I’ve been using it.

Actual Workflow

Firstly, I’m doing everything locally, as it feels like it vastly improves the speed of iteration. Since you are doing code review all the time, you need a good way to read diffs and navigate around the codebase. I think that pushing to a remote and reviewing in a browser UI gives you a much worse experience, and I highly recommend finding a way to avoid it.

Having a good setup for code navigation helps a lot. Usually, I do many iterations of quick back-and-forth, asking the agent to update the behavior and jumping around the code to check that the changes align. Besides the go-to keyboard shortcuts, a quick way to show only the modified files gives you an even bigger boost.

The essential part of iterating effectively is finding a way to review the incremental changes made by the agent. I’m using git interactive staging via lazygit to isolate the code already reviewed and marked as “good enough” from the usually wild intermediate result.
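The same idea works with plain git when lazygit is not at hand: the index holds what has been reviewed, the working tree holds whatever the agent did since. This is a rough sketch; the helper names `unreviewed_files` and `reviewed_files` are hypothetical:

```shell
#!/bin/bash

# Files with changes that have not been accepted yet (working tree vs. index),
# including brand-new untracked files.
unreviewed_files() {
  { git diff --name-only; git ls-files --others --exclude-standard; } | sort -u
}

# Files whose changes are already staged, i.e. reviewed (index vs. HEAD).
reviewed_files() {
  git diff --cached --name-only
}

# Typical loop:
#   git add -p            # read each hunk, stage the ones that look good enough
#   unreviewed_files      # what the agent touched since the last review pass
#   git diff              # only the not-yet-reviewed hunks
#   git diff --cached     # everything accepted so far
```

Because `git diff` compares the working tree against the index, every staged hunk disappears from the next review pass, which keeps each pass small.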

Whenever it seems like the code changes might be working, I run the tests or the service itself in a dedicated development container, so I do not have to worry about accidentally executing malicious code.

Modern IDE - neovim, lazygit and command line is all you need

Besides the technical tooling side, it seems useful to explicitly keep track of what we are working on and what’s ahead of us. I regularly ask the agent to write down the session progress to worklog.md, which is shared across sessions. The task list is part of the file as a simple checkbox list, with no fancy issue tracker for now.
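A plain-text task list is also easy to query from the command line. The sketch below assumes the checkboxes use the common `- [ ]` / `- [x]` notation, which is a guess about the file layout, and `open_tasks` is a made-up helper name:

```shell
#!/bin/bash

# Print the text of every unchecked task in a worklog file.
# Defaults to worklog.md in the current directory.
open_tasks() {
  grep -E '^[[:space:]]*- \[ \]' "${1:-worklog.md}" |
    sed -E 's/^[[:space:]]*- \[ \] //'
}

# Example worklog.md content this works with:
#   ## Tasks
#   - [x] set up repo
#   - [ ] add health check
```

Running `open_tasks` at the start of a session gives a quick view of what is still ahead.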

Lastly, I often use the agent to discuss the architecture first. I usually prepare at least a short draft focused on the principles I would like to follow, then ask the agent to suggest a solution and write it down as a decision record. We iterate on it for a while, usually with me asking clarifying questions to get a grasp of the proposal. Once the document is ready, I use it as the starting point when asking for the implementation itself.

Bring The Skill, Get The Thrill

The agent’s usefulness is proportional to your domain knowledge. I learned this the hard way.

Inspired by a podcast about coding agents, I decided to try Pulumi for infrastructure management without actually knowing anything about it! Two days of running in circles followed. The agent and I crafted detailed decision records, discussed trade-offs, implemented solutions. Everything felt productive, except that the implementation still wasn’t working.

This period ended once I finally got fed up, read the actual docs, and realized our carefully crafted approach was fundamentally wrong. As an example, the agent had argued confidently that also using kustomize and bare kubectl on top of Pulumi made perfect sense. To my great surprise, such a setup is completely redundant, as Pulumi handles all of that.

Contrast that with earlier in the week. I was implementing a service I’d built variations of many times before. It was a blaze. I knew precisely what I wanted and how it should be structured. The agent delivered exactly what I needed, and the code worked.

The pattern is clear: when you know what good looks like, you can get good output out of it. When you don’t, it produces plausible-sounding nonsense you can’t evaluate. The default output right now is mediocre at best: verbose documentation stuffed with bullet points, like we’re gunning for promotion in a big corp.

Code quality is usually acceptable for scripts but needs explicit guidance to avoid ugly patterns. The key competence isn’t writing code, as that’s cheap and quick; it’s managing the complexity of what’s being produced and recognizing when it’s gone off the rails.

Stack mediocre solutions on top of each other and you hit the complexity breaking point. It’s technical debt accumulation, now exponentially faster.

This is great news for senior people. The agentic workflow supercharges you when you know the drill. Instead of spending precious time writing code, you can think—about high-level architecture, about what to do next, about how features fit the bigger picture. You review, you guide, you design.

But it’s likely terrible news for newcomers. There’s still no shortcut to understanding how things actually work. The trap is real: skip the hard part and you’ll build a house of cards without realizing it. The tool can help you learn—use it for tailored examples, for exploring concepts—but resist the temptation to leap ahead of your understanding.

The agent can babble a lot, prune!

On Productivity Gains

100x or even 10x overall productivity promises are bullshit. Yet there are specific tasks where the gains are real and I hope to never write them by myself again.

If you find yourself:

  • writing a quick automation script
  • complaining about missing documentation instead of addressing gaps
  • or manually creating structures for encoding and decoding

then you are wasting your time.

Another amazing thing is the ability to experiment fearlessly. I vividly remember planning discussions where we were forced to make an educated guess because developing a prototype to test the idea simply took too long. What a joy it is that I can now ask the agent to wire up the prototype, take a look at the ugly yet working version, tinker with it, and then make a much better decision.

Moreover, it is much easier to automate things. Whenever I catch myself doing something over and over, I ask the agent to create a script to do it for me, with a hard stop to abandon the attempt if it’s not done in a couple of minutes. For example, I’m using a convention that a commit message starts with the part of the repository it affects [4]. After being annoyed for a while, I asked the agent to write me a git hook to prepare the title.

All written by the agent. I might even start to like bash

The hook inspects staged files, extracts directory names from the monorepo structure (services/tools/packages), and auto-prefixes commit messages with the affected component names.

#!/bin/bash

COMMIT_MSG_FILE=$1
COMMIT_SOURCE=$2

# Skip for merge, squash, and amend commits
# (git passes "commit" as the source for --amend, -c and -C)
if [ "$COMMIT_SOURCE" = "merge" ] || [ "$COMMIT_SOURCE" = "squash" ] || [ "$COMMIT_SOURCE" = "commit" ]; then
  exit 0
fi

# Get the list of changed files (staged files)
CHANGED_FILES=$(git diff --cached --name-only)

# Extract directories from changed files
DIRS=$(echo "$CHANGED_FILES" | grep -E '^(services|tools|packages)/' | cut -d'/' -f1,2 | sort -u)

# Count how many different subdirectories are affected
DIR_COUNT=$(echo "$DIRS" | grep -v '^$' | wc -l)

if [ "$DIR_COUNT" -eq 0 ]; then
  # No files in services/tools/packages, exit without modifying
  exit 0
fi

# Get current commit message
CURRENT_MSG=$(cat "$COMMIT_MSG_FILE")

# Skip if message is empty or just comments
if [ -z "$(echo "$CURRENT_MSG" | grep -v '^#')" ]; then
  exit 0
fi

# If there's only one subdirectory, use its name
if [ "$DIR_COUNT" -eq 1 ]; then
  SUBDIR=$(echo "$DIRS" | cut -d'/' -f2)
  PREFIX="$SUBDIR:"
else
  # Multiple subdirectories
  SUBDIRS=$(echo "$DIRS" | cut -d'/' -f2 | tr '\n' ',' | sed 's/,$//')
  PREFIX="$SUBDIRS:"
fi

# Check if prefix is already in the message (ignore comment lines).
# A literal match via case avoids treating the prefix as a regex.
FIRST_LINE=$(echo "$CURRENT_MSG" | grep -v '^#' | head -n1)
case "$FIRST_LINE" in
  "$PREFIX"*) ;; # Already prefixed, leave the message untouched
  *) printf '%s %s\n' "$PREFIX" "$CURRENT_MSG" >"$COMMIT_MSG_FILE" ;;
esac
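To see what the hook above would produce for a given set of staged files, the prefix logic can be tried on its own. This is a small sketch that reuses the same grep/cut/sort pipeline; the function name `commit_prefix` and the sample paths are made up for illustration:

```shell
#!/bin/bash

# Read newline-separated paths on stdin and print the commit prefix the hook
# would use. Returns non-zero when no path is under services/, tools/ or packages/.
commit_prefix() {
  local dirs
  dirs=$(grep -E '^(services|tools|packages)/' | cut -d'/' -f1,2 | sort -u)
  [ -n "$dirs" ] || return 1
  printf '%s:\n' "$(printf '%s\n' "$dirs" | cut -d'/' -f2 | paste -s -d, -)"
}

# Example: two components touched, one unrelated file ignored.
printf '%s\n' services/api/main.go tools/lint/run.sh README.md | commit_prefix
# prints "api,lint:"
```

Since the script reads the message file and the commit source from its first two arguments, it presumably runs as git's `prepare-commit-msg` hook, that is, as an executable file at `.git/hooks/prepare-commit-msg`.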

As already mentioned, productivity goes out of the window when you work in areas where you lack the knowledge. The gains are also questionable in niche areas like low-level performance optimization.

Currently, there are two bottlenecks.

I hit the session usage limit regularly. Yes, it can be solved easily by throwing more money at it. But usually I use the time to either do more reviewing and testing or simply take a coffee break. Funnily enough, the biggest bottleneck seems to be me.

Reviewing code from morning till evening is really draining. On a couple of days I just felt exhausted and called it a day. It is no joke to seriously design a third decision record in a row. I could have asked the agent to do more on those days, but it would not have been productive, as the generated code would just be waiting for my review the next day.

I’m not really sure how people are running multiple agents in parallel. It feels like you would end up in a situation where the whole team is yelling at you, waiting for your code review. At that point you either give up and LGTM patches with +/-5000 changed lines, or put your code review hat on and resolve them one by one.

Let me finish with some speculation. I think this is a great time for people looking to create. If you are working in a big team, my suggestion would be to split into smaller units that can move independently. You can iterate and explore faster, build end-to-end solutions more easily, deliver bigger scopes.

The coding agent workflow rewards autonomy and punishes coordination overhead. Every approval gate, every handoff, every coordination point kills the advantage.

It’s never been more feasible to do so much with so few people.

Footnotes

  1. In hindsight, I should have pushed back harder

  2. And it turned out I’m not the only one! As Steve puts it:

    And so like I don’t mind the idea of like writing specifications for six AI agents to go write some code and then have me code review it. Like people like man don’t you love code writing code and hate doing code review? And I’m like I kind of like planning stuff and doing code review.

  3. As a long-term Gerrit and stacked-diffs workflow fanboy.

  4. To help with navigation in a codebase organized as a monorepo.