Using AI to Turn Sentry Errors Into Pull Requests Every Hour
For a long time, going through Sentry felt like spring cleaning a garage that never stays tidy. You open it, see the pile in front of you, and set out to clear it. Eventually, you close it again until next year, having only organized or donated a few things.
With Sentry, it wasn’t that the issues were particularly complicated or difficult to address. In fact, most of the errors Sentry tends to surface are small UI glitches, edge cases, null pointer errors, or incorrect assumptions in one branch of a handler. None of these are particularly dramatic, and each one would probably take ten or twenty minutes to fix.
But the problem is, none of them are urgent on their own.
And when something bigger is in progress, those small issues just sit there and accumulate. While Sentry keeps collecting them faithfully, the backlog continues growing. You tell yourself you will deal with it when things slow down.
But that is the fundamental problem of being part of a small engineering team: things never slow down.
With AI becoming a more integral part of my job, I realized that this problem was the perfect testing ground for automation.
It was:
Highly repetitive
Well scoped and structured
A problem of limited time rather than complexity
And with autonomous AI agents becoming more powerful and reliable, I wanted to see if I could find a way to give this process to AI.
What I Built
I wanted to build a system that would do three things:
Fetch the latest Sentry issues from the last sixty minutes.
For each issue, clone the repository and find the root cause.
Generate a proposed fix and open a pull request in GitHub.
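The first of those steps, picking out the issues seen in the last sixty minutes, can be sketched as a small pure function. This is a minimal Python sketch, not the actual workflow: it assumes each issue is a dict with an ISO-8601 `lastSeen` timestamp, and the function names and sample data are invented for illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO-8601 timestamp, accepting a trailing 'Z' for UTC."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def issues_seen_within(issues, now, window=timedelta(minutes=60)):
    """Return the issues whose lastSeen falls inside [now - window, now]."""
    cutoff = now - window
    return [
        issue for issue in issues
        if cutoff <= parse_timestamp(issue["lastSeen"]) <= now
    ]


# Hypothetical sample data; real payloads carry many more fields.
sample = [
    {"id": "101", "title": "TypeError in checkout handler", "lastSeen": "2024-05-01T11:45:00Z"},
    {"id": "102", "title": "Null user in profile view", "lastSeen": "2024-05-01T09:10:00Z"},
]
now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
recent = issues_seen_within(sample, now)
print([issue["id"] for issue in recent])  # prints ['101']
```

Passing `now` in as an argument, rather than reading the clock inside the function, keeps the filter easy to test and makes each hourly run reproducible.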
For this to work, two things needed to be true.
First, the code quality had to be good enough.
To be honest, this wasn’t something that worried me.
Modern LLMs are very capable of producing code at the level of a junior engineer. Entire companies have been built on that premise. If you give a model enough context about your repository and your conventions, it will usually produce something reasonable.
In our case, once we connected our tools to Claude Code and fed it our patterns (via Skills), it was producing surprisingly solid results almost immediately.
Second, the results had to be reliable.
There are two very different ways to approach this problem.
One approach is to hand an agent a broad instruction: go check Sentry, fix issues, open pull requests. In that world, the agent decides how to fetch data, how to track state, and how to handle retries. It might succeed one time and fail the next. That is the non-deterministic nature of LLMs, and unfortunately, it is not something you want when running repeatable work that demands consistency.
The other approach is to separate concerns more aggressively.
Use LLMs for the discovery and solution parts of the automation. For example, let them analyze the error, or propose a fix. That is the creative, emergent work that LLMs are good at.
But put everything else that should be predictable and repeatable on rails.
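A minimal sketch of that split, in Python: the loop, the skip logic, and the retry policy are ordinary code, and the model is reached only through two injected functions. The names `run_once`, `analyze`, and `propose_fix` are placeholders for illustration, not Friday's or Claude Code's API, and marking failed issues as processed is one possible policy rather than the one described here.

```python
from typing import Callable, Iterable


def run_once(
    issues: Iterable[dict],
    already_processed: set,
    analyze: Callable[[dict], str],
    propose_fix: Callable[[dict, str], str],
    max_attempts: int = 2,
) -> list:
    """Process each new issue in a fixed order; only the two callables are LLM-backed."""
    results = []
    for issue in issues:
        if issue["id"] in already_processed:
            continue  # deterministic skip, never decided by the model
        for attempt in range(1, max_attempts + 1):
            try:
                root_cause = analyze(issue)             # creative step
                patch = propose_fix(issue, root_cause)  # creative step
            except Exception as error:
                if attempt == max_attempts:
                    results.append({"id": issue["id"], "status": "failed", "error": str(error)})
                continue  # retry, or fall out of the loop after the last attempt
            results.append({"id": issue["id"], "status": "proposed", "patch": patch})
            break
        # Assumed policy: mark the issue either way so a failing one is not retried every hour.
        already_processed.add(issue["id"])
    return results


# Stand-ins for the model calls, so the sketch runs without any API access.
def fake_analyze(issue: dict) -> str:
    return f"unchecked None in {issue['title']}"


def fake_propose(issue: dict, cause: str) -> str:
    return f"guard against: {cause}"


seen = {"101"}
out = run_once(
    [{"id": "101", "title": "checkout"}, {"id": "102", "title": "profile"}],
    seen,
    fake_analyze,
    fake_propose,
)
print(out)
```

Because the model calls are plain function arguments, the rails can be tested with stand-ins like the ones above, and the retry count is a fixed number rather than something the agent improvises.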
To be transparent: as a founder, I used our own tool, Friday AI, to automate this. You could absolutely wire something similar together with cron jobs, a workflow engine, or a stack of scripts. I used Friday because it made it much easier for me to pull that orchestration together.
In practice, what that means for me is that the workflow itself is defined ahead of time. There is a configuration that outlines:
When the workflow runs.
How it fetches Sentry issues.
How it records which issues have already been processed.
How information is passed from one step to the next.
Which agent is responsible for which task.
The LLM does not decide the shape of the workflow at runtime. It operates inside a predefined structure.
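To make the five items above concrete, here is one hypothetical shape such a configuration could take, written as Python dataclasses. This is not Friday's configuration format, which this post does not show; every field name, the cron string, the file name, and the agent names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    agent: str          # which agent is responsible for this task
    reads: tuple = ()   # named outputs of earlier steps that this step consumes
    writes: tuple = ()  # named outputs that this step hands to later steps


@dataclass(frozen=True)
class Workflow:
    schedule: str       # when the workflow runs (cron syntax assumed)
    state_store: str    # where processed issue IDs are recorded
    steps: tuple = ()

    def validate(self) -> None:
        """Fail fast if a step reads something that no earlier step wrote."""
        available = set()
        for step in self.steps:
            missing = set(step.reads) - available
            if missing:
                raise ValueError(f"step {step.name!r} is missing inputs: {sorted(missing)}")
            available.update(step.writes)


workflow = Workflow(
    schedule="0 * * * *",  # top of every hour
    state_store="processed_issues.db",
    steps=(
        Step("fetch_issues", agent="sentry_fetcher", writes=("issues",)),
        Step("find_root_cause", agent="analysis_agent", reads=("issues",), writes=("root_cause",)),
        Step("propose_fix", agent="coding_agent", reads=("issues", "root_cause"), writes=("patch",)),
        Step("open_pull_request", agent="github_agent", reads=("patch",)),
    ),
)
workflow.validate()  # raises ValueError if a handoff is broken
```

The point of the `reads` and `writes` fields is that handoffs are declared up front, so a broken chain is caught when the workflow is defined rather than halfway through an hourly run.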
In Friday, that structure is generated from natural language prompts (e.g., “Build me a system that will take Sentry issues, analyze them for errors, and propose fixes in PRs.”), but once it is generated, it is no longer probabilistic. It becomes a defined sequence of steps with explicit handoffs and a fixed order. The state tracking is deterministic.
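Deterministic state tracking can be as small as a table of issue IDs. Here is a minimal sketch using Python's built-in `sqlite3` module, assuming issues are identified by a string ID. The table and function names are hypothetical, and this post does not describe how Friday actually stores this state.

```python
import sqlite3


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the ledger of issue IDs that have already been handled."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_issues ("
        "issue_id TEXT PRIMARY KEY, "
        "processed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def claim_issue(conn: sqlite3.Connection, issue_id: str) -> bool:
    """Record the issue; return True only the first time this ID is seen."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO processed_issues (issue_id) VALUES (?)",
            (issue_id,),
        )
    return cursor.rowcount == 1


ledger = open_ledger()
print(claim_issue(ledger, "101"))  # True: first time, so process it
print(claim_issue(ledger, "101"))  # False: already handled, so skip it
```

Relying on the primary key to reject duplicates means the "have we seen this?" check and the "remember this" write happen in one statement, so two overlapping runs cannot both claim the same issue.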
What It Looks Like When It’s Running
Once the workspace was set up, Friday would generate around three to four pull requests per day.
Some were straightforward wins that I could merge immediately. Others were close, but missing context. A few were technically correct but not aligned with our preferred patterns.
For example, if the system does not have an explicit preference encoded, it will choose the most common path. A classic case is database access. Raw SQL can be a completely valid solution, but if your codebase has invested in type-safe query tooling, raw SQL is the riskier path. So it’s not wrong; it just raises the chance of subtle mistakes.
This was an important lesson for me: once you solve orchestration, the next thing you notice is consistency.
Not just “does it do the work,” but “does this match how we want to write code here.”
I’ve improved the system’s ability to understand and follow these preferences over time by giving it additional context through Skills in Claude Code.
Where We Are Now
This automation now generates a few pull requests a day.
Some get merged immediately, and others need small tweaks. I’d say about half of them get closed out. That is expected.
What this automation unlocked for me is the follow-up and review of the long tail of small issues. It guarantees that they are processed steadily, without getting in the way of the big, important work.
The bottleneck moved from “do we have time to look at this?” to “is this worth merging?” And I think that’s the better bottleneck.
I still review every change the same way I would review any human teammate’s code. Does it fix the root cause? What assumptions is it making? Does it follow our conventions and processes?
The true impact of this work is less about “automation for the sake of automation,” and more about getting things off my plate that I’ve never had the time to get done.
You can try building this flow yourself on Friday AI.


