Why I built a $5/month uptime monitor as a solo dev


Over the course of my career I have spent a lot of time inside enterprise observability tools. The big ones — Dynatrace, SolarWinds, Site24x7. They are serious software. APM, root-cause analysis, AI-driven anomaly detection, the works. They do things a $5 per month tool will never do. They are also overkill for 90% of the people who buy them, and dramatic overkill for any solo dev or small team running a side project.

Spending enough time inside enterprise monitoring gives you a clear view of what these tools actually cost to use. Not the bill — the cognitive overhead. Setting up monitors that work. Tuning alerts so they are not noise. Reading dashboards that surface what you actually need. Onboarding a new engineer into a system that takes a week to understand. For a team of 50 with a dedicated SRE function, that overhead is fine. For a solo dev who just wants to know if their site is up, it is absurd.

So I built Upwatch. One product, one job, $5 per month. No AI. No correlation engine. No 30-step onboarding. Here is your data — your site is up, or it is not — and you move on.

I am Keith. I have spent my career on the operational side of monitoring, mostly watching what actually works and what does not when systems are under load at 2am. I built Upwatch on the side, alone, because I wanted to see what monitoring looks like when it is not trying to be everything. If you want to follow along or get in touch, I am on GitHub.

The first frustration is feature bloat. The big observability platforms started as one thing and grew into 30. APM bolted onto uptime. Logs bolted onto APM. AI bolted onto logs. Mobile telemetry bolted onto everything. Each addition is individually defensible — a board slide somewhere justified every one. Combined, you end up with a product that takes a full quarter to deploy and another quarter before anyone on the team can confidently change its config. The product is no longer for the user. It is for the procurement cycle that buys it.

The second is configuration overhead. I have watched teams spend an entire afternoon setting up a single useful alert in a tool one of them had been using for 2 years. The alert needed three conditions, two notification channels, and a suppression rule for the deploy window. The documentation was 40 pages of conceptual prose before the first concrete example. By the time the alert was wired up correctly, two of the people in the room had forgotten what they were trying to detect in the first place. For a solo dev running a side project at night, that path is a non-starter. The tool that requires an afternoon to configure correctly does not get configured at all.

The third is the pricing experience. Per host. Per seat. Per integration. Per data point ingested. Per dashboard. Quotes instead of prices. “Contact sales” on the things that should be a checkbox, and a 5-day SLA on getting an answer back. By the time a small team has finished pricing one of these platforms they have spent 6 hours of calendar time they will never get back, and the quote ends up at $400 a month for what is functionally “tell me when my site is down.”

What I built instead is a series of trade-offs, not a list of features. I check from 4 regions, not 12. Four regions catch roughly 95% of real-world failure patterns — single-region outages, ISP routing problems, a botched deploy that breaks one geo — and 12 regions doubles the infrastructure cost for a long tail of edge cases that almost nobody in my audience hits. If you need a 12-region monitoring net, you are not the person I built this for. You are running something with revenue per minute, and you should be paying for one of the big tools.

The minimum check interval is 30 seconds, not 5. Faster intervals do not improve signal quality in any way I have been able to measure across the last year of running this thing. What they do is amplify noise — transient connection blips, slow DNS, a remote endpoint having a 2-second hiccup that resolves before the next check. A 30-second interval gives you end-to-end detection inside a minute after consensus, which is well inside the “fast enough to matter” window for almost every internet service that is not a payment processor.
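
The arithmetic behind "inside a minute" is worth spelling out. Here is a back-of-envelope sketch in Go, using my own assumptions (unsynchronized checks, one confirming pass from the other regions) rather than anything from Upwatch's internals:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Back-of-envelope worst case, assuming unsynchronized 30-second
	// checks and one confirming pass from the other regions. These are
	// illustrative numbers, not measured Upwatch internals.
	interval := 30 * time.Second

	firstRedCheck := interval // failure lands just after a check; wait for the next one
	consensusWait := interval // remaining regions confirm on their own next checks

	fmt.Println(firstRedCheck + consensusWait) // 1m0s — "inside a minute"
}
```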

The trade-off I care most about is consensus alerting. The worst monitoring experience is not missing an outage — it is getting paged at 3am because a single region had a flaky network for 90 seconds and the tool fired a P1. Upwatch will not open an incident on a single failing region by default. It waits for agreement across the regions you have configured, on the principle that the only thing worse than no alert is the wrong alert at the wrong hour. Once you have lived through enough false 3am pages, you cannot go back to a tool that does not work this way.
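
To make the mechanism concrete, here is a minimal sketch of consensus-gated alerting in Go. The `CheckResult` type and the all-regions-must-agree rule are mine for illustration; Upwatch's real quorum logic and defaults may differ:

```go
package main

import "fmt"

// CheckResult is one region's latest probe of an endpoint.
// (Hypothetical type for illustration, not Upwatch's real API.)
type CheckResult struct {
	Region string
	Up     bool
}

// shouldOpenIncident returns true only when every configured region
// agrees the endpoint is down: one healthy region vetoes the page.
func shouldOpenIncident(results []CheckResult) bool {
	if len(results) == 0 {
		return false // nothing to go on; do not page
	}
	for _, r := range results {
		if r.Up {
			return false // at least one region still reaches the endpoint
		}
	}
	return true
}

func main() {
	// One flaky region at 3am: no incident.
	flaky := []CheckResult{
		{"us-east", false}, {"eu-west", true}, {"ap-south", true}, {"us-west", true},
	}
	fmt.Println(shouldOpenIncident(flaky)) // false

	// All regions agree: open the incident.
	down := []CheckResult{
		{"us-east", false}, {"eu-west", false}, {"ap-south", false}, {"us-west", false},
	}
	fmt.Println(shouldOpenIncident(down)) // true
}
```

The design point is the veto: one healthy region is enough to hold the page, because a single-region failure is far more likely to be a network blip than a real outage.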

Here is what Upwatch does not do. It does not do APM. It does not do log management. It does not do tracing. It does not do AI-driven anything. It does not auto-discover your infrastructure. It does not draw service maps. It checks your endpoints, opens incidents when they fail across regions, and alerts you on the one or two channels you actually read. If you need the rest, you need a different tool, and you will pay 50 to 100 times more for it. If you just want to know your site is up, this is built for you.

Pricing is $5 a month for unlimited monitors. The free plan is genuinely free for 10 monitors, forever, no trial countdown. The price is possible because the product is small, the infrastructure is cheap to run, and I have no investors to please. If it works for you, that is what you pay. If it does not, the free plan covers most side projects without you ever needing to put in a card.

Upwatch is still small — a few hundred users, no team, no funding, one person doing the work. I am building it in public, posting changes and the occasional postmortem as they happen. If any of this resonates and you want to follow along, I am on GitHub. If you want to try it, the free plan is genuinely free. Either way, thanks for reading.