What's the point of retrying an API call?

Published on January 5, 2026

Reading time: 4 minutes

Monday morning, we were all still recovering from the weekend. My coworker called everyone over to show a bug he had found and how it was breaking our application. He asked me to open our retry implementation, and then he said:

I kid you not, I was in shock.

How do you argue against a point that diverges from common sense? I wasn't prepared for this. That's not what retry mechanisms are for, and at the same time, what are retry mechanisms even for?

The most obvious reason is that HTTP over a real network is really not that reliable. Failed requests caused by infrastructure issues make up the majority of our errors. It's nothing we can control, and running infrastructure under heavy load shows that 99% uptime is not that much: it still leaves room for about 3.65 days of downtime a year.

So even if you have a very stable application, the database connection, the ingress controller, the load balancer, the gateway, all those pieces of software that make up our modern infrastructure introduce chaos into the system.
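To put a rough number on that chaos, here is a back-of-the-envelope sketch. It assumes a hypothetical chain of five components, each available 99% of the time and failing independently, which real systems only approximate. The component list and the function names are made up for illustration.

```typescript
// Availability of a request path where every component must be up.
// Assumes independent failures, which is a simplification.
function chainAvailability(componentAvailabilities: number[]): number {
  return componentAvailabilities.reduce((total, a) => total * a, 1);
}

// Expected downtime in days over a 365-day year for a given availability.
function downtimeDaysPerYear(availability: number): number {
  return (1 - availability) * 365;
}

// Hypothetical chain: gateway, load balancer, ingress, application, database.
const chain = [0.99, 0.99, 0.99, 0.99, 0.99];
const combined = chainAvailability(chain);

console.log((combined * 100).toFixed(2) + "%"); // 95.10%
console.log(downtimeDaysPerYear(0.99).toFixed(2)); // 3.65
console.log(downtimeDaysPerYear(combined).toFixed(2)); // 17.89
```

Under those assumptions, five "99%" pieces in a row behave like a single piece that is up only about 95% of the time.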

How can we control that chaos and make an application more reliable? The simplest solution, one we have been relying on for ages, is to retry. What is the better option here: your application taking 3 seconds longer to load, or showing an error page to the user?
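For reference, "simply retry" can be as small as the sketch below. This is not our production implementation, just a minimal illustration. The function name, the default of 3 attempts, and the 1 second pause are all assumptions.

```typescript
// Minimal retry helper: runs `operation` up to `maxAttempts` times,
// waiting `delayMs` between failed attempts. Names and defaults are illustrative.
async function retry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Only wait if another attempt is coming.
      if (attempt < maxAttempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

When wrapping a `fetch` call with something like this, keep in mind that `fetch` only rejects on network-level failures, so an HTTP error status has to be turned into a thrown error inside `operation` for the retry to kick in.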

Well, that's my understanding of the problem, my view of it. But how do you explain it in a discussion where the argument is that New Relic is not finding errors, so why would we need to retry? I tried to argue that we don't see them precisely because we DO retry right now: a request that fails once and succeeds on the second attempt never shows up as an error.
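One way to make that argument with data is to count what the retry is hiding. The sketch below is an assumption about how that could look, not something we had in place. The counter names are hypothetical, the delay between attempts is left out to keep the focus on the counters, and forwarding the numbers to a monitoring tool is left to whatever that tool provides.

```typescript
// Counters a monitoring integration could export. Names are illustrative.
interface RetryStats {
  firstTrySuccesses: number;
  recoveredByRetry: number; // failed at least once, then succeeded
  exhausted: number; // failed on every attempt
}

const stats: RetryStats = { firstTrySuccesses: 0, recoveredByRetry: 0, exhausted: 0 };

async function retryWithStats<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const result = await operation();
      if (attempt === 1) {
        stats.firstTrySuccesses++;
      } else {
        stats.recoveredByRetry++;
      }
      return result;
    } catch (error) {
      lastError = error;
      // Log every failed attempt, even if a later one succeeds.
      console.warn(`attempt ${attempt}/${maxAttempts} failed:`, error);
    }
  }
  stats.exhausted++;
  throw lastError;
}
```

If `recoveredByRetry` is far from zero, that is the number of errors users would have seen without the retry.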

By the way, this is even harder to track when your UI is making the requests. Your users might be facing issues that you are completely unaware of.

Having retries in the frontend is paramount, and it's one reason why useQuery is growing in popularity: it handles caching, retries, and request state out of the box. It's the difference between a high-level application and a side project, that little detail that makes all the difference ✨

Such an incredible piece of software.
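As a rough idea of what that looks like, and assuming the hook in question is TanStack Query's `useQuery`, the retry behavior is driven by two options, `retry` and `retryDelay`. The sketch below writes them as a plain object so it runs without the library installed. The variable name is made up, and the values are, to my knowledge, in line with the documented defaults.

```typescript
// Options in the shape TanStack Query accepts for `retry` and `retryDelay`.
// Plain object on purpose, so this snippet has no dependency on the library.
const queryRetryOptions = {
  // Retry a failed query up to 3 times before surfacing the error.
  retry: 3,
  // Exponential backoff: 1s, 2s, 4s, ... capped at 30s.
  retryDelay: (attemptIndex: number): number =>
    Math.min(1000 * 2 ** attemptIndex, 30000),
};

console.log(queryRetryOptions.retryDelay(0)); // 1000
console.log(queryRetryOptions.retryDelay(2)); // 4000
console.log(queryRetryOptions.retryDelay(10)); // 30000
```

With the library installed, the same object could be spread into a `useQuery({ queryKey, queryFn, ...queryRetryOptions })` call, or set once under `defaultOptions.queries` on the `QueryClient`.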

Currently, the main pain in our project is connecting to the database. Somehow Azure is not the best at it (??), so most of the time a retry fixes the problem: we can ignore the error and move along, a few seconds slower but still kicking.

Again, how do you measure it? Could it be a skill issue? Definitely. It should not have been so hard to track, but sometimes you need to be reasonable about where you spend your money, or credits. I'm not here to track something that did not fail.

So I researched reasons not to remove one of the most common patterns in software development. What struck me was an article saying: "not all errors deserve to be retried". That's a point I was overlooking: simply retrying is not enough. I've applied some rules to our retry.

These are simple changes that keep us from attacking our own server, and good practices in general. I really like the idea of challenging established concepts in our industry, but only when we are open to discussing them, like in How I fixed TDD and how you can too.
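To make "not all errors deserve to be retried" concrete, here is a sketch of the kind of rules that idea usually leads to. These are not necessarily the exact rules from our codebase. They assume errors carry an optional numeric `status` field, and every name and number here (`isRetryable`, `backoffDelayMs`, `retryWithRules`, the 200 ms base, the 5 second cap) is hypothetical.

```typescript
// Reads an HTTP status from an error, if one was attached (assumed shape).
function statusOf(error: unknown): number | undefined {
  if (typeof error === "object" && error !== null && "status" in error) {
    const status = (error as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

// Rule 1: only retry failures that are plausibly transient.
function isRetryable(error: unknown): boolean {
  const status = statusOf(error);
  // No status: treat it as a network-level failure and retry.
  if (status === undefined) return true;
  // 408 (timeout), 429 (rate limited) and 5xx may succeed later.
  // Other 4xx responses (400, 401, 404, ...) would fail the same way again.
  return status === 408 || status === 429 || status >= 500;
}

// Rule 2: exponential backoff with jitter, capped, so that many clients
// do not hammer a struggling server in lockstep.
function backoffDelayMs(
  attempt: number,
  baseMs = 200,
  capMs = 5000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** (attempt - 1));
  return Math.floor(random() * ceiling);
}

// Rule 3: a hard cap on the number of attempts.
async function retryWithRules<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up immediately on permanent errors or when attempts run out.
      if (attempt >= maxAttempts || !isRetryable(error)) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

The `random` and `sleep` parameters are only there so the behavior can be tested without real waiting. In normal use the defaults apply.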

Now I know that we can make it even better with circuit breakers and other microservice resilience patterns, but how to implement them is a subject for another post.
