MarkdownEditor-1.0.6.dmg looked like a release. It was signed, packaged, and sitting exactly where a release artifact should sit. Then I installed it and saw the old app.

A harmless request started the loop: make the Markdown viewer feel less like a build artifact and more like a Mac app. Hide the Markdown markers in reading view. Borrow some restraint from Tailwind Typography. Give the document a page treatment. Build a signed DMG. Ship the thing a person would actually install.

Progress reports arrived in the expected shape. Files changed. Version numbers moved. The agent had commands, diffs, and confidence. On paper, the work smelled finished.

Then the screen disagreed.

At first, that looked like a design failure. Maybe the new styling was too subtle. Maybe the preview path was broken. Maybe macOS was launching an old copy from /Applications. Maybe I had failed to quit the running process before replacing it, one of those tiny rituals that can make an otherwise competent person feel like an apprentice clerk.

Each explanation was plausible. None survived contact with the build.

Once we opened the loop, compile errors were sitting underneath the calm packaging step. The release build had been failing, while the packaging loop continued to produce DMGs containing the previous binary. We were not iterating on the app. We were iterating on the ceremony around the app: fresh labels, stale contents.

In an agent workflow, that distinction matters more than it used to. A human who says “I built the DMG” has a familiar range of failure modes: optimism, skipped steps, a bad assumption about what the command did. An agent can fail more politely. It may have run a command, found a file in dist/, signed a wrapper, and produced something official-looking without ever proving that the installed application contains the code under discussion.

Release engineering has always offered hiding places for stale state: old app bundles, DerivedData, staging directories, cached dependencies, version strings updated in one file but not another, signing and notarization steps that validate the wrapper rather than the user’s experience. Agent speed just makes those hiding places easier to pass at a run.

Externally, the signed artifact had all the markers of legitimacy. The screenshot did not.

So the workflow changed. We forced a clean rebuild, bumped the version again, and added a visible status-bar version marker so the running app could testify on its own behalf. When MarkdownEditor-1.0.6.dmg was rebuilt and notarized, the evidence chain included the thing a user could see: the version label, the marker toggle, the card-on-canvas treatment, the changed preview.

Validation moved from pipeline to artifact. A command finishing is not a release. A file existing is not a working installer. A notarized DMG is not a changed application. For UI work, the screenshot is not decorative feedback after the work; it is part of the work.

An aesthetic request exposed the release bug precisely because visual work has an oracle. Does it look different? Is the preview alive? Can the installed app identify itself? UI work is often treated as subjective, but in this case it cut through every false positive.

Build said yes. Installer said yes. The screen said no.

Developing with agents is not frictionless when it is going well. It is closer to working with a tireless assistant who can handle several errands at once and still send the final package to the wrong address if nobody checks the label.

Decide what counts as evidence before the work begins. For a library, that may be the test suite. For an API, a real request and response. For a UI, a browser or app window under your eyes. For a release, the installed artifact launched from the place a user launches it, carrying a visible marker that proves which build is running.

An artifact can lie less if it carries its papers.

A painterly editorial collage for The DMG lied, showing the concrete objects and system relationships around dmg package, version marker, and screenshot check.
DMG package, version marker, and screenshot check.

From that session, the most useful validation artifact was not a log. It was the installed application telling us its own version. That tiny marker turned “I still do not see it” into a release check.

Delete the old package before building the new one. Install the package before believing it. Make the running app identify itself. Then look at the screen.

Chris Chabot · January 2026