The Day the Internet Is No Longer Built for Humans: When AI Tokens Become the ‘New Oil’
On May 28, 2026, I opened my laptop and saw three headlines that rewired my brain. I don’t mean that metaphorically. I read them, closed all my tabs, stared at the wall for about five minutes, and then pulled up a blank document.
Three simultaneous announcements reveal AI tokens are becoming a tradeable commodity, the internet is being rebuilt for machine-to-machine traffic, and agents are getting their own payment rails. It’s a structural shift.
The first one: Reuters reported that China’s Shanghai Futures Exchange is designing a derivatives market for AI tokens [Source]. The second: Cloudflare told TechCrunch that non-human internet traffic will exceed human traffic by the first half of 2027 — bots already account for 31% of all HTTP requests [Source]. The third: Visa announced an investment in Replit, accompanied by something called the “Trusted Agent Protocol” — a system that lets AI agents verify their identity and complete payments autonomously [Source].
Individually, any one of these is a solid news day. Together, they told me something I hadn’t fully processed: the digital economy is being restructured from the ground up, and almost nobody outside of infrastructure teams is paying attention.
Tokens Are Becoming a Commodity — And Wall Street Noticed
Here’s a number that stopped me cold: DeepSeek V4 Pro charges $0.003625 per million tokens for cache reads. OpenAI’s GPT-5.5 and Anthropic’s Claude Sonnet? Over $0.31 per million on the same metric. That’s an 87x gap [Source]. And here’s the punchline: 80 to 90 percent of tokens consumed by real-world AI agents are cache-read tokens, according to Val Bercovici, Chief AI Officer at WEKA, a company that provides high-speed storage for exactly this kind of workload [Source].
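To make that gap concrete, here is a minimal sketch of the arithmetic. The two prices come from the figures quoted above; the monthly token volume and the 85 percent cache share (the midpoint of the 80 to 90 percent range) are my own illustrative assumptions, not data from any vendor. Note that $0.31 is a floor ("over $0.31"), so the ratio computed here lands near 85.5x; the quoted 87x implies a price of roughly $0.315.

```python
# Prices quoted in the article, in USD per million cache-read tokens.
DEEPSEEK_CACHE_READ = 0.003625
WESTERN_CACHE_READ = 0.31  # the article says "over $0.31", so treat this as a floor


def price_ratio(expensive: float, cheap: float) -> float:
    """How many times more expensive one price is than another."""
    return expensive / cheap


def cache_read_cost(total_tokens_millions: float, cache_share: float,
                    price_per_million: float) -> float:
    """USD spent on the cache-read portion of a workload only."""
    return total_tokens_millions * cache_share * price_per_million


# Hypothetical workload (assumption): 1,000 million tokens a month,
# 85% of them cache reads.
MONTHLY_TOKENS_MILLIONS = 1_000.0
CACHE_SHARE = 0.85

ratio = price_ratio(WESTERN_CACHE_READ, DEEPSEEK_CACHE_READ)
deepseek_bill = cache_read_cost(MONTHLY_TOKENS_MILLIONS, CACHE_SHARE, DEEPSEEK_CACHE_READ)
western_bill = cache_read_cost(MONTHLY_TOKENS_MILLIONS, CACHE_SHARE, WESTERN_CACHE_READ)

print(f"price ratio at the $0.31 floor: {ratio:.1f}x")
print(f"cache-read bill, DeepSeek price: ${deepseek_bill:.2f}")
print(f"cache-read bill, Western price:  ${western_bill:.2f}")
```

The absolute dollar amounts depend entirely on the assumed volume. The ratio between the two bills is what carries over to any workload.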
The reason isn’t just pricing strategy. DeepSeek’s architecture is genuinely built differently. They introduced Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) to slash KV-cache usage by 90 percent across their 1-million-token context window. They use Multi-head Latent Attention (MLA) to offload heavy data payloads from GPU memory into cheaper storage tiers. A 1.6-trillion-parameter model needs just 5.48 GB of high-bandwidth memory to hold a million-token context. A comparable Western architecture chokes at 89 GB for the same load, roughly 16 times as much.
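A rough way to see why the memory footprint matters: count how many million-token contexts fit in a fixed pool of high-bandwidth memory. The two per-context figures are the ones quoted above. The 640 GB budget is a hypothetical I picked for illustration, and the sketch deliberately ignores model weights, activations, and fragmentation, so treat the counts as an upper bound rather than a capacity plan.

```python
import math

# Per-context HBM footprints quoted in the article (GB per million-token context).
DEEPSEEK_GB_PER_CONTEXT = 5.48
COMPARABLE_GB_PER_CONTEXT = 89.0

# Hypothetical budget (assumption): 640 GB of HBM reserved purely for context cache.
HBM_BUDGET_GB = 640.0


def contexts_that_fit(hbm_budget_gb: float, gb_per_context: float) -> int:
    """Whole million-token contexts that fit in the budget, cache only."""
    return math.floor(hbm_budget_gb / gb_per_context)


memory_ratio = COMPARABLE_GB_PER_CONTEXT / DEEPSEEK_GB_PER_CONTEXT
deepseek_contexts = contexts_that_fit(HBM_BUDGET_GB, DEEPSEEK_GB_PER_CONTEXT)
comparable_contexts = contexts_that_fit(HBM_BUDGET_GB, COMPARABLE_GB_PER_CONTEXT)

print(f"memory ratio: {memory_ratio:.1f}x")
print(f"contexts in {HBM_BUDGET_GB:.0f} GB: {deepseek_contexts} vs {comparable_contexts}")
```

More concurrent contexts per unit of memory is one plausible route from an architectural choice to a lower cache-read price, though the article's sources do not spell out the exact cost model.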
I started this piece thinking the story was about Chinese AI catching up. I ended up finding something stranger: DeepSeek’s architecture was shaped by U.S. export controls that cut them off from Nvidia’s best GPUs. They didn’t just find a workaround — the workaround produced a cost structure so superior that it’s now forcing the entire industry to rethink how tokens are priced, served, and traded.
And “traded” is no longer a metaphor. The Shanghai Futures Exchange is designing derivatives contracts for AI tokens. CME Group — yes, the Chicago Mercantile Exchange — is working on GPU compute futures. ICE, the company that owns the New York Stock Exchange, is doing the same [Source].
Think about what that means. A token — the atomic unit of AI computation — is being transformed from a metered service into a tradeable financial asset. Businesses will be able to hedge their AI costs the way airlines hedge jet fuel. Hedge funds will be able to go long on GPT-5.5’s API price while shorting open-source alternatives. This is electrification all over again. Electricity wasn’t a commodity when Edison built Pearl Street Station. It became one once the grid standardized it and futures markets gave it a price discovery mechanism. AI tokens are on the exact same trajectory, compressed into two years instead of fifty.
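The jet-fuel comparison can be sketched in a few lines. Everything here is hypothetical: no token futures contract has published specifications that I have seen, so the contract unit (one million tokens), the prices, and the 80 percent hedge ratio are assumptions chosen only to show the mechanics of a long hedge.

```python
def effective_cost(tokens_millions: float, spot_price: float,
                   locked_price: float, hedged_fraction: float) -> float:
    """Total spend when part of expected usage was locked in via long futures.

    The buyer pays spot for all usage, then receives (or pays) the difference
    between spot and the locked price on the hedged share.
    """
    spot_bill = tokens_millions * spot_price
    futures_pnl = tokens_millions * hedged_fraction * (spot_price - locked_price)
    return spot_bill - futures_pnl


# Hypothetical numbers (assumptions): 500 million tokens of planned usage,
# futures bought at $10 per million, 80% of usage hedged.
USAGE = 500.0
LOCKED = 10.0
HEDGE = 0.8

cost_if_price_rises = effective_cost(USAGE, 14.0, LOCKED, HEDGE)
cost_if_price_falls = effective_cost(USAGE, 6.0, LOCKED, HEDGE)
cost_unhedged_rise = effective_cost(USAGE, 14.0, LOCKED, 0.0)

print(f"hedged, price rises to $14:   ${cost_if_price_rises:,.0f}")
print(f"hedged, price falls to $6:    ${cost_if_price_falls:,.0f}")
print(f"unhedged, price rises to $14: ${cost_unhedged_rise:,.0f}")
```

The hedge narrows the range of outcomes in both directions: the buyer gives up some of the benefit of falling prices in exchange for protection against rising ones, which is the same trade an airline makes on fuel.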

Fig 1: API Token Pricing, Cache Read Cost per Million Tokens (USD). [Figure: DeepSeek V4 Pro at $0.0036 per million cache-read tokens, an 87x gap to GPT-5.5 and Claude Sonnet; output prices shown as $0.87/M for DeepSeek, $30/M for GPT-5.5, and $15/M for Claude Sonnet. Callouts: 80–90% of agent tokens are cache reads, which DeepSeek’s architecture targets; CSA/HCA compression plus MLA memory offloading; 7–17x cheaper; 87x on cache; MIT license. Sources: DeepSeek V4 Technical Report, VentureBeat, OpenRouter rankings, May 2026.]
The Internet Is Being Rebuilt for Machines, Not Humans
I remember sitting in a conference room in 2019, listening to a cloud architect explain why they needed reservation-based pricing for server instances. The reasoning was straightforward: human traffic is predictable. People sleep. People have work hours. You can forecast demand six months out.
AI agents violate every assumption in that model. They don’t sleep. They don’t request one thing at a time. A single agent task can spawn dozens of sub-agents, each querying hundreds of databases, calling dozens of APIs, and reading megabytes of context — all within seconds, and then vanishing without a trace. Tia White, the general manager for Amazon OpenSearch Service, put it plainly to TechCrunch: “They spike without warning, they go idle without notice, and enterprise needs search that keeps up without paying for empty or idle compute” [Source].
AWS’s response was to decouple compute from storage in OpenSearch Serverless. The old architecture meant you always had at least one instance running — like paying for a parking spot whether or not your car is there. The new one scales compute up in seconds when an agent triggers a task and back down to zero when it’s done. You pay for exactly what you use, down to the second.
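The parking-spot analogy is easy to put numbers on. This is a sketch under made-up rates, not AWS pricing: the hourly rate, the per-second rate (set at three times the hourly equivalent to reflect a plausible premium for elasticity), and the burst pattern are all assumptions.

```python
SECONDS_PER_HOUR = 3600


def always_on_cost(hours_in_period: float, hourly_rate: float) -> float:
    """Cost of keeping one instance running for the whole period."""
    return hours_in_period * hourly_rate


def scale_to_zero_cost(burst_durations_s: list[float], per_second_rate: float) -> float:
    """Cost when compute is billed only while bursts are running."""
    return sum(burst_durations_s) * per_second_rate


def break_even_utilization(hourly_rate: float, per_second_rate: float) -> float:
    """Busy fraction of the period above which always-on becomes cheaper."""
    return (hourly_rate / SECONDS_PER_HOUR) / per_second_rate


# Hypothetical rates and workload (assumptions, not vendor pricing).
HOURLY_RATE = 0.24          # USD per instance-hour
PER_SECOND_RATE = 0.0002    # USD per compute-second, about 3x the hourly equivalent
HOURS_IN_MONTH = 720.0
bursts = [30.0] * 2000      # 2,000 agent bursts of 30 seconds each

reserved = always_on_cost(HOURS_IN_MONTH, HOURLY_RATE)
elastic = scale_to_zero_cost(bursts, PER_SECOND_RATE)
threshold = break_even_utilization(HOURLY_RATE, PER_SECOND_RATE)

print(f"always-on: ${reserved:.2f}, scale-to-zero: ${elastic:.2f}")
print(f"always-on only wins above {threshold:.0%} utilization")
```

Under these assumptions the bursty workload is busy for well under 3 percent of the month, far below the break-even point, which is the situation the scale-to-zero design is meant for. A steady, heavily used service would tip the comparison the other way.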
Cloudflare’s numbers make this concrete: bots accounted for 31 percent of all HTTP traffic in the last six months. AI crawlers, search engines, and assistants made up roughly a quarter of all bot requests. Cloudflare senior product manager Lai Yi Ohlsen told TechCrunch point-blank: “Non-human traffic will exceed human traffic sometime in the first half of 2027” [Source].
The same pattern is playing out across the industry. Databricks and Snowflake are repositioning as AI memory and retrieval systems. Microsoft Azure is shipping updates specifically for agent traffic bursts and inter-agent memory sharing. Cloudflare launched Agent Cloud. The internet’s plumbing — originally designed for a world where one human makes one request and gets one response — is being ripped out and replaced with infrastructure that treats agents as first-class citizens.

Fig 2: Non-Human Internet Traffic Trajectory, 2023 to 2027. [Figure: human versus non-human share of traffic, with non-human traffic at 31% in 2026 and above 50% in the first half of 2027. Infrastructure rebuild callouts: AWS OpenSearch Serverless (compute/storage decoupling), Cloudflare Agent Cloud, Azure agent-aware updates, Databricks/Snowflake repositioning. Sources: Cloudflare Radar, TechCrunch, AWS Official Blog, May 2026.]
Agents Aren’t Just Using the Internet — They’re Getting Their Own Economy
This is the piece I almost missed. I read the Visa-Replit announcement and initially filed it under “corporate partnership.” Then I re-read it.
Visa isn’t just investing in Replit. They’re building something called the Trusted Agent Protocol — a system that lets AI agents present verifiable credentials, declare their intent, and attach relevant customer information when making transactions [Source]. That’s not a payments integration. That’s a digital identity layer for non-human economic actors.
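To show what "verifiable credentials plus declared intent" could look like in practice, here is a toy sketch. It is emphatically not Visa's protocol: the field names, the shared-secret HMAC scheme, and the five-minute freshness window are all my own assumptions, and a production system would more likely use public-key signatures so the verifier never holds the agent's signing key.

```python
import hashlib
import hmac
import json


def _canonical(payload: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash the same bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_agent_request(secret: bytes, agent_id: str, intent: str,
                       customer_ref: str, timestamp: int) -> dict:
    """Bundle identity, declared intent, and customer reference with a signature."""
    payload = {
        "agent_id": agent_id,          # hypothetical field names throughout
        "intent": intent,
        "customer_ref": customer_ref,
        "timestamp": timestamp,
    }
    signature = hmac.new(secret, _canonical(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_agent_request(secret: bytes, request: dict, now: int,
                         max_age_seconds: int = 300) -> bool:
    """Accept only untampered requests that are recent enough."""
    payload = request["payload"]
    expected = hmac.new(secret, _canonical(payload), hashlib.sha256).hexdigest()
    fresh = 0 <= now - payload["timestamp"] <= max_age_seconds
    return fresh and hmac.compare_digest(expected, request["signature"])


SECRET = b"demo-only-secret"  # assumption: a pre-shared key, for illustration only
request = sign_agent_request(SECRET, "agent-123", "purchase:hosting-plan",
                             "customer-456", timestamp=1_000_000)

print(verify_agent_request(SECRET, request, now=1_000_060))
```

The point of the sketch is the shape of the message, not the cryptography: a merchant receiving it can check who the agent claims to be, what it says it is trying to do, and on whose behalf, before any payment is attempted.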
And it’s not just Visa. That same week, Robinhood launched agent-powered stock trading. Google announced an AI-powered universal shopping cart that follows your journey across the internet. Stripe deepened its partnership with OpenRouter, the token aggregator that just raised $113 million from Snowflake Ventures, Databricks Ventures, Nvidia’s NVentures, and Google’s CapitalG [Source].
I started seeing the architecture: a three-layer stack forming in real time. The bottom layer is tokens — becoming commoditized, priced, and traded. The middle layer is the network — being rebuilt to handle agent traffic patterns. The top layer is commerce — agents getting identity, payment rails, and the ability to transact.
Every layer reinforces the others. Cheaper tokens mean more agents. More agents mean more pressure to rebuild infrastructure. Better infrastructure means more use cases for agent payments. More agent payments create more demand for token derivatives as hedging instruments. It’s a flywheel, and it’s already spinning.
I don’t think I’m overstating this. The last time three layers of the digital economy restructured simultaneously was the early 1990s — when telecom minutes were deregulated, fiber networks were laid, and billing systems standardized. That restructuring created the substrate on which the consumer internet was built. What’s happening now with tokens, agent-native infrastructure, and agent payments is the 1990s all over again, except the primary user isn’t a person clicking a link — it’s an agent executing a multi-step workflow.

What I Don’t Know
I started this thinking I’d found a clean narrative about AI infrastructure investing. I ended up finding three open questions that genuinely trouble me.
First: can you actually standardize a token futures contract? Oil works because WTI crude is WTI crude. But a GPT-5.5 output token and a DeepSeek V4 Pro output token represent fundamentally different units of “intelligence.” How do you write a derivatives contract when the underlying asset’s quality is non-fungible? The exchanges are clearly working on this — but I haven’t seen a clean answer yet.
Second: there’s a tension between the growth of agent traffic and the collapse of unit pricing. If DeepSeek keeps driving cache-read prices toward zero, and if infrastructure companies are spending billions to handle more agent traffic — at some point the revenue model for “agent-native cloud” has to come from somewhere other than volume. Nobody has figured out where yet.
Third: geopolitics. DeepSeek’s 87x cost advantage is, at its root, a product of U.S. export controls that forced their engineers to build around hardware constraints. If sanctions tighten, or if Western compliance boards rule DeepSeek’s MIT-licensed weights off-limits, the entire pricing floor shifts again. The token derivatives market would have to price in not just supply and demand, but the risk of a model being sanctioned.
A token futures contract that has to account for geopolitics. I’ll be honest — I find that both fascinating and unnerving.
I spent most of my career watching software eat the world. This is different. This is the world rebuilding itself so software can have its own economy, its own infrastructure, and its own financial instruments. The biggest question isn’t whether it’s happening. It’s whether anyone is building the right things for the world that’s arriving.
Sources
- Ram Iyer — “Just like gold and oil, we’ll soon be able to trade AI token futures” — https://techcrunch.com/2026/05/28/just-like-gold-and-oil-well-soon-be-able-to-trade-ai-token-futures/
- Rebecca Bellan — “The internet is being rebuilt for machines” — https://techcrunch.com/2026/05/28/the-internet-is-being-rebuilt-for-machines/
- Ivan Mehta — “Visa invests in Replit to power agentic payments” — https://techcrunch.com/2026/05/28/visa-invests-in-replit-to-power-agentic-payments-for-developers/
- Matt Marshall — “How DeepSeek’s radical architecture is shattering Silicon Valley’s token moat” — https://venturebeat.com/infrastructure/how-deepseeks-radical-architecture-is-shattering-silicon-valleys-token-moat/
- Reuters — “China works on AI token futures market, sources say” — https://www.reuters.com/world/china/china-works-ai-token-futures-market-sources-say-race-with-us-2026-05-28/
- CME Group — “CME Group and Silicon Data partner to launch first compute futures” — https://www.cmegroup.com/media-room/press-releases/2026/5/12/cme_group_and_silicondatapartnertolaunchfirstcomputefutures.html
- ICE — “ICE and Ornn to Launch GPU Compute Futures Contracts” — https://ir.theice.com/press/news-details/2026/ICE-and-Ornn-to-Launch-GPU-Compute-Futures-Contracts/default.aspx
- DeepSeek V4 Technical Report — https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro
- SiliconAngle — “OpenRouter raises $113M to bring order to enterprise AI inference routing” — https://siliconangle.com/2026/05/26/openrouter-raises-113m-bring-order-enterprise-ai-inference-routing/
- Stripe — “Stripe partners with OpenRouter” — https://stripe.com/newsroom/news/openrouter-and-stripe
- AWS — “Introducing next-gen OpenSearch Serverless for agentic AI” — https://aws.amazon.com/blogs/aws/introducing-the-next-generation-of-amazon-opensearch-serverless-for-building-your-agentic-ai-applications/
Originally published in Towards AI on Medium.