9 Comments
Tzur Vaich's avatar

Switching to lower-cost LLMs may be part of the solution, but it is not that straightforward. A weaker model can easily end up costing more once you factor in rework, extra supervision, longer workflows, and lower-quality outputs.

I believe the real path is different. The “send a task and forget it” approach, especially with an unlimited budget, does not work — and the same applies to multi-agent setups. Breaking work into sub-tasks, giving only a high-level task description, and leaving AI agents to sort everything out by themselves can quickly become inefficient and expensive if there is no clear process behind it.

It is like having an empty closet at home — you will always find a way to fill it. The same applies to token budgets: people and tools will use whatever capacity they are given.

Without proper processes, clear task design, and a real understanding of which tool, LLM, thinking level, and task granularity are efficient for each type of work, there will always be waste. Users, tools, and models will improve, but the process is at the heart of making AI cost-effective.

Personally, I spend around £100/month of my own money on AI tools, and in my AI engineering work it gives me at least a 5x efficiency gain. That is the kind of ROI companies should be aiming to understand and measure.

KJD's avatar

I wouldn’t be surprised if the RIFs labeled as “AI” were just an easier message (and market signal) than “these layoffs were part of the normal employment randomness that happens every year in large corporations.”

Expect all layoffs in 2026 to come with the AI label attached. And in 2027 it will be “AI is pushing us to hire more people.” #AIsignaling

Darren Barrett's avatar

Have the environmental costs been factored into the equation?

The Macro Scale: Water Infrastructure Costs: Relying on billions of gallons of municipal water to keep AI hardware cool creates a massive infrastructural deficit.

National Water Infrastructure Tax: Academic studies led by UC Riverside and Caltech reveal that expanding local waterworks to handle the spikes in peak data center usage will cost between $10 billion and $58 billion. A substantial portion of these multi-billion-dollar upgrades threatens to fall onto local taxpayers.

Tech Enterprise Expenditures: Annual capital and operational spending purely on data center water infrastructure is projected to hit $797.1 million, tracking toward a cumulative $4.1 billion total over the second half of the decade.

Pushed too Far's avatar

The amount of pressure to use AI at big companies is astounding, especially considering a lot of the tools aren’t proven. This creates wasted cycles and churn, just as incompetent colleagues do. Today AI use is way more expensive than people, but I assume that as models improve, AI will eventually be cheaper.

Tor Magnus Rakvåg's avatar

I don’t know where Goldman got that number for daily coding agent cost, but it’s completely ridiculous. It misses by at least 5x, but probably more for most developers.

If we use Anthropic’s pricing, $13 is about what you can expect a single 15-minute turn of ordinary semi-complex coding to cost if you use their Opus model. It costs slightly less if you use the more error-prone Sonnet model.

Scenarica's avatar

A salary is a fixed cost you can predict. Inference is a variable one you can't. So "we replaced workers with AI" often means a company swapped a capped cost for an uncapped one, and the market clapped for the layoff before the compute bill arrived. Efficiency got priced as a story long before anyone measured it as a number. What looks like AI failing is just the measuring finally happening.

Conversations From The Cloud's avatar

Question, Scott: is the only reason these companies are pushing to go public that the VC well is dry and profitability targets are coming due?

Conversations From The Cloud's avatar

Tech CEOs on the next earnings call: “We’ve spent 10 billion this quarter on AI buildout and tokens, but… we’re on pace to save 2 million in labor costs this year… You’re welcome.”

Utsav Murudi's avatar

There will come a reckoning when companies look around to see who is the most efficient with their token usage: who does the most work for the fewest tokens.