Link Decay Prediction in Affiliate Marketing: Turning “Alive” URLs into a Time Series Monitoring Problem

Author(s): Hernan M
Originally published on Towards AI.

Key Takeaways

- Affiliate link health isn't binary, even if most dashboards force it into green/red. A link can be "up" and still be quietly losing a third of your traffic.
- The strongest signal I've found is landing page arrival rate: across repeated tests, what fraction of attempts actually land on the intended page. Logged over time, that's a time series with recognizable pre-failure shapes: gradual drift, volatility spikes, and geo-specific rot.
- Those shapes imply different features: slope for gradual decay, rolling variance for pre-collapse, and segmentation for geo rot. This starts to look like condition monitoring, not classic URL classification.
- The industry impact is non-trivial: Trackonomics estimates link rot costs affiliate marketing at least $160M annually, and reports it affects "almost half" of pages with affiliate links (impact.com).
- What's missing is the predictive layer, and the hard parts are labels, domain shift, and false-positive cost.

Same metric, different failure shapes: why one-size-fits-all "broken link" checks keep losing the plot.

The failure nobody catches in time: when the offer URL is "up" but the funnel is quietly dying

Last quarter I noticed a pattern that still bugs me. A publisher I know spent two weeks optimising creative, placements, and audience targeting on a high-volume affiliate campaign. Everything looked fine in monitoring. The offer URL returned a valid status, the redirect chain "worked," and the uptime checks were green. But the money didn't match the effort.

When we pulled the test logs and computed landing page arrival rate, it had slid from 94% to 61% over 11 days. That drop wasn't a single outage. It was a slow leak, one that a binary "alive" check will never flag.

What I didn't expect was how long this can run before anyone calls it. Teams will blame creative fatigue, auction pressure, or "bad traffic" long before they suspect the link itself is degrading.
And the economics are already ugly without this nuance. Trackonomics' scan across 7,000+ pages and 25 major publishers found link rot affects almost half of pages with affiliate links, with 3–10% of live affiliate links affected, and an estimated $160M in annual commissions lost (impact.com). So yes, broken links matter. But the failure mode I'm seeing now is worse: links that are "working" while the funnel quietly dies.

Turning redirects into a measurable signal: attempts in, landings out; everything else is just excuses.

Why reactive link monitoring has a ceiling (and why "HTTP 200" is the wrong success metric)

Reactive monitoring is mature. It's good at telling you what's broken right now. Most link checkers will catch obvious failures: bad status codes, SSL errors, redirect loops, and hard geo-blocks. They'll also validate permanent redirects, which is useful because permanent redirects can lock in mistakes and create long-lived chains that are painful to unwind (findredirect.com). That tooling is necessary. I'm not arguing otherwise.

The ceiling shows up when "success" is defined as "the request didn't error." Affiliate links fail partially: throttling, intermittent bot defenses, inconsistent redirect behavior, or a destination that sometimes resolves and sometimes doesn't. Picture a link sitting at a 73% landing page arrival rate and falling. Every binary check still passes, so the dashboard stays green, and the publisher keeps sending traffic.

Tools like LinksTest log the full history of tests per URL over time: the raw infrastructure that makes a predictive layer theoretically possible, even if that layer doesn't exist yet. The rub is that reactive checks treat each test as independent. But decay is a trajectory. If you've built any time series monitoring, you know where this is going.
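To make "HTTP 200 is the wrong success metric" concrete, here is a minimal sketch of what a single probe should record: not "did the request error?" but "did the redirect chain end on the intended page?" All names (`normalize`, `landed`) and the URL-comparison rules are hypothetical illustrations, not any particular tool's implementation; in a real checker the final URL and status would come from following the chain with an HTTP client.

```python
from urllib.parse import urlsplit

def normalize(url: str) -> tuple:
    """Reduce a URL to (host, path) for comparison; ignores scheme,
    query string, a leading "www.", and trailing slashes, all of which
    redirects commonly rewrite."""
    parts = urlsplit(url)
    return (parts.netloc.lower().removeprefix("www."),
            parts.path.rstrip("/") or "/")

def landed(final_url: str, status: int, intended_url: str) -> bool:
    """Bernoulli outcome for one test: did the chain actually arrive?
    A 200 on an interstitial, parked domain, or homepage still counts
    as a failed arrival."""
    return status == 200 and normalize(final_url) == normalize(intended_url)

# In practice final_url/status come from following the chain, e.g. with
# requests: r = requests.get(offer_url, allow_redirects=True, timeout=10)
# and then landed(r.url, r.status_code, intended_url).
print(landed("https://www.shop.example/deal/summer/", 200,
             "http://shop.example/deal/summer"))   # same page: arrival
print(landed("https://shop.example/", 200,
             "https://shop.example/deal/summer"))  # bounced to homepage
```

The point of the sketch is only that the per-test outcome is binary and page-aware; everything predictive happens when you aggregate these outcomes over time.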
The question isn't "did it work once?" but "is the system's behavior changing?" To get there, you need to stop treating a link test as a one-off probe and start treating it as repeated trials you can aggregate into a time series, then look for the kinds of anomalies that only show up in sequences.

HTTP 200 is a weak success metric; time-series features are how you catch the slow leaks.

The signal: landing page arrival rate as a time series (three decay patterns that look "healthy" until they don't)

Here's the core technical move. Each test run against an offer URL is a Bernoulli outcome: did the full redirect chain resolve to the intended landing page, yes or no? Run that test N times per day, across agents, geos, and times, and you can compute a daily arrival rate. Now you've got a time series.

And this is where the anomaly detection framing fits better than "broken link detection." A lot of link decay is a collective anomaly or interval anomaly: sequences that are wrong in aggregate even if individual points don't look crazy (Anomalo). In practice, what I found is that three decay patterns show up repeatedly. They look similar to anyone staring at a line chart, but they demand different features and different operational responses.

Pattern A: gradual linear decay (the slope is the early warning)

This is the slow bleed. Arrival rate drops 2–3% per day for a week, and nobody panics because "it's still mostly fine." Operationally, it often correlates with staged throttling, progressive geo restrictions, or a redirect dependency getting slower and flakier. Not always, but often enough that I treat it as a first-class pattern.

The feature that matters is simple: the slope of a rolling 7-day linear regression on arrival rate. You don't need deep learning to get value here; a slope threshold plus a persistence rule catches a lot. If you want a more formal baseline, change-point detection is a good mental model.
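The slope-plus-persistence idea above can be sketched in a few lines: aggregate per-day successes into rates, fit a least-squares slope over a rolling 7-day window, and alert only when the slope stays below a threshold for consecutive days. The function names and the thresholds (`-0.02` per day, 2-day persistence) are illustrative assumptions, not tuned values.

```python
def arrival_rates(successes: list, attempts: list) -> list:
    """Daily arrival rate from per-day Bernoulli trial counts."""
    return [s / n for s, n in zip(successes, attempts)]

def ols_slope(y: list) -> float:
    """Least-squares slope of y against x = 0..len(y)-1 (rate change/day)."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((x - x_mean) * (v - y_mean) for x, v in enumerate(y))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den

def decay_alerts(rates: list, window: int = 7,
                 slope_thresh: float = -0.02, persist: int = 2) -> list:
    """Flag the days where the rolling-window slope has been below
    slope_thresh (losing >2 pts/day) for `persist` consecutive windows."""
    slopes = [ols_slope(rates[t - window + 1: t + 1])
              for t in range(window - 1, len(rates))]
    alerts, run = [], 0
    for i, s in enumerate(slopes):
        run = run + 1 if s < slope_thresh else 0
        if run >= persist:
            alerts.append(i + window - 1)  # index back into `rates`
    return alerts

# Hypothetical 11-day slide mirroring the 94% -> 61% drop in the text:
rates = [0.94, 0.93, 0.91, 0.88, 0.85, 0.82, 0.78, 0.74, 0.70, 0.65, 0.61]
print(decay_alerts(rates))
```

On this series the first full window already slopes at roughly -2.7 points/day, so the persistence rule fires on day 8 of 11, days before the rate bottoms out, which is the whole value proposition of treating the tests as a sequence.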
CUSUM-style detectors are explicitly designed to accumulate small deviations until they become statistically meaningful, which maps nicely to "death by a thousand cuts" degradation (MDPI).

A concrete (hypothetical) example: imagine 200 tests/day. Day 1: 94% (188 successes). Day 7: 78% (156). Day 11: 61% (122). That's not noise; it's a trend.

The publisher implication is straightforward. If you can […]
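For readers who want the CUSUM baseline spelled out: a one-sided CUSUM for downward drift keeps a running sum of daily shortfalls below a healthy target and alarms when the sum crosses a threshold. This is a minimal sketch; the `target`, `slack`, and `threshold` values are illustrative assumptions, and a production detector would estimate them from the link's own history.

```python
def cusum_down(rates: list, target: float = 0.94,
               slack: float = 0.01, threshold: float = 0.10):
    """One-sided CUSUM for downward drift in arrival rate.
    Accumulates shortfalls below (target - slack); returns the first
    day index where the cumulative shortfall crosses `threshold`,
    or None if the series stays healthy."""
    s = 0.0
    for t, r in enumerate(rates):
        s = max(0.0, s + (target - slack) - r)  # small daily deficits add up
        if s > threshold:
            return t
    return None

# The article's hypothetical slide, 94% on day 1 down to 61% on day 11:
rates = [0.94, 0.93, 0.91, 0.88, 0.85, 0.82, 0.78, 0.74, 0.70, 0.65, 0.61]
print(cusum_down(rates))  # alarms mid-slide, while the rate is still ~85%
```

The design choice mirrors the "death by a thousand cuts" point in the text: no single day's deficit is alarming, but the accumulated deviation is, so the detector fires while the rate still looks "mostly fine."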
