Saturday, 25 April 2026

Purpose vs Task: The Lens That Tells You Which Jobs AI Actually Eats

Ten years ago, radiology was the consensus “first job to go.” Computer vision had just become superhuman, and the core task of a radiologist — looking at scans — was the most obvious target. A decade later, AI has completely permeated radiology. Every department uses it. Every scan gets processed faster. And the number of radiologists has gone up.

Jensen Huang offered this as a throwaway during a Davos conversation with BlackRock’s Larry Fink, but it is the single most useful frame I’ve heard for thinking about AI and labor. The lens is simple: distinguish the task of a job from the purpose of it.

A radiologist’s task is to study scans. Their purpose is to diagnose disease. When AI compresses the task from minutes to seconds, the purpose doesn’t vanish — it gets more of the person’s attention. More time with patients, more time with clinicians, more scans processed per day. The hospital sees more patients, earns more revenue, and hires more radiologists.

Same story with nurses. The US is short five million nurses. Nurses currently spend half their time charting and transcribing. Companies like Abridge are eating that task. The nurses don’t disappear — the bottleneck moves. More patients get seen, hospitals do better, more nurses get hired.

If all you can see is the task, every knowledge job looks extinct. If you look at the purpose, you notice that the purpose usually gets bigger, not smaller.

The industrial view of AI

Most people think AI is the model. Huang insists AI is actually a five-layer cake. Energy sits at the bottom. Chips and compute sit on top of energy. Cloud services sit on top of the chips. Models sit on top of the cloud. And applications — healthcare, manufacturing, financial services, the places where economic value actually shows up — sit on top of the models.

The reason this matters: every layer needs to be built before the one above it works. Last year, the models finally got good enough to support a real application layer. That’s why 2025 was the largest VC year in history, and why most of that money went to “AI native” companies in healthcare, manufacturing, robotics, and financial services. The model layer is subsidizing the application layer.

And the infrastructure beneath the models is enormous. A few hundred billion dollars is already in. TSMC is building 20 new chip plants. Foxconn, Wistron and Quanta are building 30 new computer plants. Micron has committed $200 billion in the US. Trillions more to go. Huang calls it the single largest infrastructure buildout in human history. Not hyperbolically. Literally.

Why it isn’t a bubble

The word “bubble” gets used whenever a lot of capital moves at once. Huang’s test is simple: try to rent a GPU. Spot prices on Nvidia GPUs in every cloud are going up — not just the latest generation, but two-generations-old hardware. If the infrastructure were overbuilt relative to demand, spot prices would be collapsing. They aren’t.

The more interesting read: the bubble question is the wrong question. The right question is whether we’re investing enough to broaden the benefit. Right now AI usage is dominated by educated users in developed economies. That’s how every platform shift starts. The difference with AI is that it’s the easiest software to use in human history — a billion users in three years. If a country has electricity and roads, it can have AI. The open-model wave (DeepSeek, and everything that followed) means any country with local linguistic and cultural expertise can build AI that actually serves its own population.

For Europe specifically, Huang’s pitch was: your industrial base and your deep sciences are your moat. The US led the software era. AI is software you don’t need to write — you teach it instead of coding it. That collapses the American advantage. Fuse Europe’s manufacturing strength with AI and the next layer — physical AI, robotics — plays to European strengths.

Three things to steal from this conversation

  1. Audit your role by purpose, not task. If most of what you do is the purpose (diagnosis, judgment, client relationship), AI makes you faster. If most of what you do is the task (charting, retrieval, prediction), your seat gets compressed. Know which one you are.
  2. Pick your layer. Energy, chips, cloud, models, applications — each has different economics, a different moat, a different timeline. Don’t build at the model layer unless you have a real reason to.
  3. Infrastructure is the bet. The buildout is measured in trillions and in decades. Pension funds, sovereigns, and retail investors who sit it out will feel left out. The ones who fund the energy, chips, and factories will own the compounding.

The line that stuck: “You don’t write AI. You teach AI.” That sentence alone rewrites a lot of assumptions about who gets to build.

Source: Jensen Huang and Larry Fink at the World Economic Forum

Friday, 24 April 2026

Be Claude's PM, Not Its Proofreader

There's a strain of AI discourse that treats "vibe coding" as synonymous with letting a model write your code. It isn't. Eric, a researcher at Anthropic and co-author of Building Effective Agents, draws the line where Andrej Karpathy drew it: you're only vibe coding when you forget the code even exists. Cursor and Copilot don't qualify. Most of what senior engineers currently do with AI doesn't qualify. That's the whole problem.

The reason it's a problem is arithmetic. Task length that AI can complete end-to-end is doubling roughly every seven months. Today that's about an hour. Next year it's a workday. The year after, a workweek. If your workflow assumes you will personally review every line of code the model produces, you are building a career on the losing side of an exponential. Something will have to give, and it isn't the exponential.
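The arithmetic is easy to sketch. A minimal projection, using only the post's numbers (a one-hour horizon today, doubling every seven months):

```python
# Illustrative projection of the autonomous-task horizon, assuming a
# 1-hour horizon today and a 7-month doubling time (the post's figures).

def task_horizon_hours(months_from_now: float,
                       start_hours: float = 1.0,
                       doubling_months: float = 7.0) -> float:
    """Task length AI can finish end-to-end, in hours."""
    return start_hours * 2 ** (months_from_now / doubling_months)

for months in (0, 12, 24, 36):
    print(f"{months:>2} months out: ~{task_horizon_hours(months):.1f} hours")
```

Whatever the exact constants turn out to be, the shape is the point: any review process with fixed per-line human cost eventually loses to this curve.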

So the question is not whether to vibe code in prod. The question is how to do it without shipping garbage.

Eric's answer borrows from every manager who has ever existed. A CTO green-lights code they can't read. A PM accepts features they couldn't have built. A CEO signs off on financial models they couldn't reconstruct. These people are not incompetent; they've just found abstraction layers they can verify without reading the implementation. Acceptance tests. User flows. Spot-checks on load-bearing numbers. Engineers are the last white-collar profession that still prides itself on understanding the full stack down to the metal. That pride is about to become expensive.

The compiler analogy is the one to sit with. In the early days of compilers, developers read the generated assembly to make sure it looked right. At some point the systems got big enough that nobody bothered. The code didn't become less important; the abstraction just became trustworthy enough that reading underneath it stopped being a good use of time. Application code is heading to the same place.

Three rules make the transition survivable.

Rule one: vibe code the leaves, not the trunk. Every codebase has leaf nodes, features nothing else depends on, bells and whistles that aren't going to be extended or composed. Tech debt in a leaf node is contained. Tech debt in your core architecture compounds forever. Human review stays mandatory on the trunk. Leaves can be trusted to Claude. The one class of problem today's models genuinely can't validate — is this extensible, is this clean — doesn't matter when nothing depends on the code.

Rule two: be Claude's PM. Ask not what Claude can do for you; ask what you can do for Claude. When Eric ships features with Claude he spends fifteen to twenty minutes collecting context into a single prompt, often through a separate planning conversation where Claude explores the codebase, surfaces the relevant files, and agrees on a plan. Only then does he hand the artifact to a fresh session and let it cook. The quick back-and-forth "fix this bug" loop is how you get mediocre code. A junior engineer on day one would fail the same prompt. Treat the model the way you'd treat that new hire: give it the tour, the constraints, the examples, the "here's how we do things."

Rule three: design for verifiability before you write the code. Anthropic recently merged a 22,000-line change to their production reinforcement learning codebase, written heavily by Claude. This was not a prompt-and-pray operation. Days of human work went into requirements and guidance. The change concentrated in leaf nodes. The extensible pieces got full human review. The team designed stress tests for stability and built the system with human-verifiable inputs and outputs, checkpoints that prove correctness without needing to read every line. That's the template. If you can't describe what "correct" looks like from the outside, you can't vibe code the inside.
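One way to make "correct from the outside" concrete is property checks on inputs and outputs. This sketch is invented for illustration and is not from Anthropic's playbook; `dedupe_records` stands in for any model-written function you'd rather not read line by line:

```python
# Hypothetical example of outside-in verification: we assert properties
# of the function's behavior, never inspecting its implementation.
import random

def dedupe_records(records):          # imagine Claude wrote this body
    seen, out = set(), []
    for r in records:
        if r["id"] not in seen:
            seen.add(r["id"])
            out.append(r)
    return out

def check(records):
    out = dedupe_records(records)
    ids = [r["id"] for r in out]
    assert len(ids) == len(set(ids)), "output still has duplicates"
    assert set(ids) == {r["id"] for r in records}, "an id was dropped"
    # first occurrence wins: surviving ids keep their input order
    first_seen = list(dict.fromkeys(r["id"] for r in records))
    assert ids == first_seen, "order not preserved"

for trial in range(100):
    n = random.randint(0, 20)
    check([{"id": random.randint(0, 5), "v": trial} for _ in range(n)])
print("all property checks passed")
```

The checks survive any rewrite of the implementation, which is exactly what makes them a usable trust boundary.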

The payoff isn't just saved hours. It's a lower marginal cost of software. When a feature costs one day instead of two weeks, you start shipping features you would never have started. You attempt system rewrites you would have dismissed as "not worth it." The cost curve reshapes what's worth doing at all. And that is where the real leverage lives.

Two caveats worth holding.

First, vibe coding in prod is not for the fully non-technical. Being Claude's PM means knowing enough about the system to ask the right questions and catch the wrong answer. The press coverage of leaked API keys and exposed databases describes a real failure mode — people who had no business running production systems were running production systems. The answer is not to ban vibe coding. The answer is to know what you're doing.

Second, today's caveat about tech debt will keep shrinking. Claude 4 models, even in their first weeks inside Anthropic, earned trust that 3.7 didn't. More of the stack will move inside the "safe to vibe code" bubble every quarter. The leaves will spread.

The uncomfortable framing: in a year or two, if your process still requires you to personally read every line of code, you are going to become the bottleneck on your own team. The models will happily produce a week's worth of work in an afternoon. The question is whether you've built the muscle — context, leaf-node discipline, verifiable design — to absorb that output, or whether you're still proofreading assembly while the rest of the industry ships.

Source: Master Coding Agents Like a Pro (Anthropic's Ultimate Playbook), Eric, Anthropic

 


Thursday, 23 April 2026

AI Is Getting Cheaper — Fast. Here's the Data.

Every few months I get asked the same question: "Is AI actually getting cheaper, or is that just hype?" The answer is: yes, dramatically, and faster than almost any technology in modern history. Here is the data, plotted two ways.

Chart 1 — How many tokens you get for $1

In 2020, one US dollar bought you about 17,000 tokens (roughly 12,000 words) of the best AI model available (GPT-3). Today, one dollar buys you 800,000 tokens on GPT-5. That is ~48× more AI per dollar in six years.

Green bars get taller each year because you are getting more output for the same money. Growth charts feel intuitive in a way that falling-price charts do not — so this is usually the version I lead with.

Chart 2 — How much 1 million tokens costs

The flip side of the same coin. A million tokens of the best AI cost $60 in 2020. Today the same workload costs $1.25 — roughly 50× cheaper. That price decline is faster than computers, electricity, or solar ever managed.

For context, that rate of price decline is faster than:

  • Computers (Moore's Law doubling cost-performance every ~2 years → ~8× per 6 years)
  • Solar panels (~10× cheaper per decade)
  • Electricity, cars, steel, aluminum — pick any major industrial technology; AI is beating it.
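Those comparisons are just exponent arithmetic. A quick check of the six-year multipliers, using the rates listed above:

```python
# Convert each technology's stated rate into a 6-year multiplier.
moore = 2 ** (6 / 2)      # doubling every ~2 years -> 8x per 6 years
solar = 10 ** (6 / 10)    # 10x per decade          -> ~4x per 6 years
ai    = 60.00 / 1.25      # $60 -> $1.25 per 1M tokens -> 48x per 6 years

print(f"Moore's Law: {moore:.0f}x | solar: {solar:.1f}x | AI: {ai:.0f}x")
```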

The data

| Year | Best AI model | Price / 1M tokens | Tokens per $1 | vs. 2020 |
| --- | --- | --- | --- | --- |
| 2020 | GPT-3 | $60.00 | ~17,000 | — |
| 2023 | GPT-4 | $30.00 | ~33,000 | ~2× |
| 2024 | GPT-4o | $5.00 | 200,000 | 12× |
| 2025 | GPT-5 | $1.25 | 800,000 | 48× |
| 2026 (today) | GPT-5 / Claude Opus 4.7 | $1.25 – $15 | 67,000 – 800,000 | 4× – 48× |
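The derived columns follow directly from the prices. A quick recomputation, using the per-million prices listed above:

```python
# Recompute "tokens per $1" and "vs. 2020" from prices (USD per 1M tokens).
prices = {2020: 60.00, 2023: 30.00, 2024: 5.00, 2025: 1.25}

def tokens_per_dollar(price_per_million: float) -> float:
    return 1_000_000 / price_per_million

base = tokens_per_dollar(prices[2020])        # ~16,700 tokens in 2020
for year, price in prices.items():
    tpd = tokens_per_dollar(price)
    print(f"{year}: {tpd:>9,.0f} tokens/$1  ({tpd / base:.0f}x vs 2020)")
```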

Why this matters for builders

The "cost per token" line does not feel revolutionary until you realise what it unlocks. At $60/1M you think twice about letting a user ask a question. At $1.25/1M you stop thinking about cost entirely and start asking "what if I ran 100 AI calls per user action?" — which is exactly how modern AI agents work.

The product patterns of 2026 (autonomous agents, document-heavy pipelines, real-time reasoning loops) were simply uneconomic in 2023. They are routine now because the floor fell out from under the price. And it is still falling.

Wednesday, 22 April 2026

The Cloud Ate the Robot

Physical Intelligence is two years old. It has not built a robot. It has built a model that controls other people's robots, hosted in the cloud, sending action commands over an API, with no model code running on the robot itself. Co-founder Quan Vang, on Y Combinator's Lightcone podcast last week, mentioned casually that "almost all of the robot evaluations we run at Pi today - including the complicated demos, making coffee, folding laundry, mobile robots navigating around - the model is actually hosted in the cloud. A real cloud. A data center somewhere." This single architectural choice has more implications for the next decade of robotics than any of the cooler demo videos they've shown.

Why this was supposed to be impossible. For twenty years, the first question any robotics customer asked was: what compute unit goes on the robot? It mattered because real-time control loops demand millisecond-level latency, the compute hardware you pick gets obsoleted every 18 months, and it bloats your BOM. The classical answer was "everything on-device," which meant robots were powerful, expensive, heavy computers with arms attached. Pi's answer is a systems-engineering trick called real-time chunking. The robot executes actions in chunks - say 100ms at a time. At the 50ms mark, before the current chunk is done, it requests the next one from the cloud with a continuity constraint so the transition is smooth. Inference happens in parallel with execution. Network latency is buried. The robot doesn't need onboard compute. Vang goes further: he has never seen the robots his model controls, and intentionally avoids knowing how they work internally. The layers are decoupled.
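The overlap trick can be sketched in a few lines. This is a toy simulation, not Pi's code: the timings and the shape of the cloud call are invented, and a real controller would also condition each new chunk on the tail of the current one (the continuity constraint):

```python
# Toy sketch of real-time chunking: the robot plays 100 ms action
# chunks; at the 50 ms mark it requests the next chunk, so cloud
# inference (40 ms here) overlaps with execution and never stalls
# the control loop. All timings are hypothetical.
import threading, time

CHUNK_MS, PREFETCH_AT_MS, CLOUD_LATENCY_MS = 100, 50, 40

def cloud_inference(step, result):
    time.sleep(CLOUD_LATENCY_MS / 1000)         # network + model latency
    result["chunk"] = f"actions for t>{step}"   # next 100 ms of actions

def control_loop(n_chunks: int):
    chunk = "initial actions"
    for step in range(n_chunks):
        result = {}
        time.sleep(PREFETCH_AT_MS / 1000)       # execute first half of `chunk`
        t = threading.Thread(target=cloud_inference, args=(step, result))
        t.start()                               # request next chunk mid-execution
        time.sleep((CHUNK_MS - PREFETCH_AT_MS) / 1000)  # execute second half
        t.join()   # 40 ms < 50 ms budget: next chunk arrived before we need it
        chunk = result["chunk"]
    return chunk

print(control_loop(5))
```

As long as inference plus network round-trip fits inside the prefetch window, latency is invisible to the robot, which is the whole reason the compute can live in a data center.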

This is the unbundling of robotics. Here is what it used to take to build a robotics company: your own customer relationships, your own hardware platform, your own autonomy stack, your own safety certification, your own data collection infrastructure, your own everything. Vertical integration wasn't a choice - it was table stakes, because the intelligence layer didn't exist as a component you could buy. Pi has explicitly externalized the intelligence. They open-sourced the Pi 0 and Pi 0.5 model weights - the same weights they use internally. The result is that a new founder can now walk into an industry with:

  • Off-the-shelf robot hardware
  • Pi's model handling perception, planning, and control as a cloud API
  • A workflow they understand better than anyone else
  • Scrappy data collection for their specific deployment

This is the playbook Vang actually walks through in the interview, and it's worth copying verbatim. Starting a vertical robotics company now looks like:

  1. Understand an existing workflow deeply. Not conceptually - operationally. Where does labor bottleneck?
  2. Identify the single insert point where a robot saves the most cost or unblocks the most capacity.
  3. Use cheap hardware. The model is reactive; it compensates for hardware imprecision. You do not need a $100k precision arm.
  4. Set up data collection and evaluation in the real deployment - not in a lab demo.
  5. Get to mixed autonomy. Humans take over when the robot fails. This is okay. The point isn't perfect autonomy; it's economic break-even.
  6. Once you're break-even per robot, scale the fleet. That's when the flywheel spins.
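The break-even test in steps 5 and 6 is worth making concrete. All numbers in this sketch are hypothetical (the interview gives none); the point is only that the savings go positive well before autonomy is perfect:

```python
# Illustrative mixed-autonomy economics: a robot pays for itself once
# the labor it replaces exceeds human-takeover time plus robot cost.
def monthly_savings(tasks_per_month: int,
                    autonomy_rate: float,       # share robot finishes alone
                    human_min_per_task: float,  # minutes a human needs per task
                    takeover_min: float,        # minutes per human intervention
                    wage_per_hour: float,
                    robot_cost_per_month: float) -> float:
    labor_replaced = (tasks_per_month * autonomy_rate
                      * human_min_per_task / 60 * wage_per_hour)
    takeover_cost = (tasks_per_month * (1 - autonomy_rate)
                     * takeover_min / 60 * wage_per_hour)
    return labor_replaced - takeover_cost - robot_cost_per_month

# At 80% autonomy this toy scenario is already comfortably positive.
print(f"${monthly_savings(3000, 0.80, 4.0, 5.0, 18.0, 900):.0f}/month")
```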

Two YC companies are already running this exact playbook. Weve folds diverse laundry in a real laundromat (not a demo) with clothes it's never seen, while people walk by outside. Ultra packs Amazon-style soft pouches in an actual e-commerce warehouse, running the full workday - the same video starts bright outside and ends after sunset. Both built their autonomy on top of Pi's model. Weve reportedly got to a deployable laundry-folding system in two weeks.

What makes this the Cambrian explosion moment. Vang is careful academically but personally confident: he believes thousands of vertical robotics companies are about to exist, one for every workflow that currently has a labor shortage. The reason this is credible and not just vibes is that the startup recipe no longer requires a 20-year robotics PhD. It requires someone scrappy who can do system integration, understand a specific customer workflow, and collect data for that workflow. These are operator skills, not ML-researcher skills. Pi's role in this is not to win every vertical. It's to be the intelligence layer that lets a thousand other companies start. Their success is defined as their model performing useful work on somebody else's robot, in a warehouse they've never seen, for a customer they don't know.

The broader pattern. Every major technology wave gets unbundled at some point. Compute unbundled from hardware into the cloud. Payments unbundled from banks into Stripe. Distribution unbundled from publishers into the app stores. When the intelligence layer of robotics unbundles - and Pi has pretty much just made that happen - the sector moves from a capital-intensive, vertically-integrated, enterprise-only business into something that looks a lot more like the normal startup economy. If you've been waiting for the right moment to build something in the world of atoms, this is the setup Vang is handing you. The hard part is no longer the robotics. The hard part is the workflow, the customer, and the discipline to get to economic break-even before you scale.

Source: The GPT Moment for Robotics is Here - Lightcone, Y Combinator

Tuesday, 21 April 2026

Your Job Is Not Your Task

Jensen Huang told a story at Stanford last week that should be required listening for anyone planning their career or running a company through the AI transition. It's about radiologists, and it's the cleanest mental model I've heard for thinking about which jobs AI eliminates and which it doesn't.

Ten years ago, one of the most influential computer scientists of his generation - and one of the actual founders of modern AI - told the world that radiology was the worst career a young doctor could pick. AI was about to read scans better than humans within a decade. On the capability, he was completely right. AI now permeates every aspect of radiology. Almost every scan is assisted by AI. The volume of scans being read has gone through the roof.

The number of radiologists also went up. Not down. Up.

Huang's framing for why is the line worth tattooing on the inside of every founder's eyelid: the purpose of your job and the tasks that you do in your job are related, but they are not the same thing.

The radiologist's task is to read scans. That got automated. The radiologist's purpose is to diagnose disease, work with patients, and partner with doctors. That demand only grew - more patients can be admitted, more conditions caught, more revenue per department, so hospitals hire more radiologists. The flywheel only collapses if the people who confused the task with the purpose start steering young doctors away from the field. Which is exactly what happened. There is now a shortage of radiologists in the United States, caused largely by the warning that the field would die.

This same trap is being set right now in software, design, marketing, sales, and law.

Huang volunteered the second example on himself. "What I do for a living is typing and talking. Both have been automated to superhuman level by AI. And I'm busier than ever." His engineers tell the same story. NVIDIA's coders all use agentic AI. The good ones - the ones being promoted and poached - are the ones who are best at working with the agents. The bottleneck used to be writing the code. Now the bottleneck is having the next idea, because the agents have already finished what you asked them to do and they're "perpetually harassing you in text" asking what's next.

Then Huang says something that explains why the productivity gain doesn't compress headcount the way most pundits assume. Pundits assume NVIDIA needs to ship a fixed amount of code per year - say a billion lines - and if AI lets a thousand engineers do what ten thousand used to, then nine thousand are out. But that's not how it works. A billion lines of code was the most they could do with that many people in that much time. The cap was always human bandwidth, not ambition. Huang wants to write a trillion lines of code. He'd hire more people to write a trillion lines, not fewer to write a billion.

This is the practical version of the same point: task automation doesn't shrink the org if the org's purpose is bottlenecked by ambition rather than by hours. The companies that contract are the ones whose purpose really was just to do the task at fixed throughput. The companies that grow are the ones whose ambition was always being constrained by the throughput, and now isn't.

The single quote founders should put above their desk:
"It is unlikely that most people will lose a job to AI. It is most likely that most people will lose their job to somebody who uses AI."

Two practical reads of that:

1. If you're hiring, the test isn't "have they used AI?" It's "are they faster than the humans who don't?" Treat AI fluency the way you treated Excel fluency in 2002 or English fluency in 1995 - non-negotiable for anyone in a role where the agents are now reachable.

2. If you're working, separate your task from your purpose. Then ruthlessly delegate the task to the agents and reinvest the saved hours into the purpose. The radiologist who learned to use AI now reads more scans, catches more disease, and is the most valuable hire in the department. The radiologist who refused is being told the job is being restructured.

Congressman Ro Khanna's contribution at the same panel sat alongside this and is worth taking seriously: the productivity gains will not be evenly distributed unless someone makes them so. Past industrial revolutions ended with more jobs but spent twenty miserable years getting there. Workers' bargaining position during the adoption phase determines whether the gains end up only with capital. That's a policy question, but it's also a culture-of-the-company question for any founder reading this.

The radiologist parable explains why the jobs survive; it doesn't, on its own, close that distribution gap.

Source: U.S. Leadership in AI with Jensen Huang and Congressman Ro Khanna - Stanford GSB

Monday, 20 April 2026

When Latency Becomes Oxygen

Will Bodis runs Phoneley, a voice AI company that just raised a $16M Series A from Bessemer. The company handles millions of calls a month across hundreds of verticals, and 80% of the people on the other end don't know they're talking to a machine. By the end of this year, Bodis predicts that number will be close to 100%.

The interesting part isn't that voice AI works now. It's where Bodis says the bottleneck has moved.

For two years - back when "voice AI" wasn't even a phrase - latency was the thing everyone obsessed over. Phoneley was an early customer of Groq's fast-inference chips precisely because of it. Today Bodis calls latency "oxygen": you need it, but nobody talks about it anymore because everyone has enough. Companies that can't deliver low latency just aren't in the conversation.

The new game is statistical optimization of outcomes. That's a different sentence than "build a chatbot that talks to your customers." Phoneley's pitch is that they don't just answer your phone - they continuously surface what's working and what isn't. He gives one concrete example from earlier in the week: a customer changed one question in their voice AI's script and outcomes improved 5%. Phoneley told them which question to change, and could prove the lift statistically.
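"Prove the lift statistically" usually means something like a two-proportion z-test on call outcomes before and after the change. The counts below are invented for illustration; the post gives no actual Phoneley numbers:

```python
# Two-proportion z-test: did the script change really improve outcomes?
# Counts are hypothetical (42% -> 47% success over 10k calls each).
from math import sqrt, erfc

def two_proportion_z(success_a, n_a, success_b, n_b):
    p_a, p_b = success_a / n_a, success_b / n_b
    p = (success_a + success_b) / (n_a + n_b)       # pooled success rate
    se = sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))    # pooled standard error
    z = (p_b - p_a) / se
    p_value = erfc(abs(z) / sqrt(2))                # two-sided tail probability
    return p_b - p_a, z, p_value

lift, z, p = two_proportion_z(4200, 10_000, 4700, 10_000)
print(f"lift={lift:.1%}, z={z:.1f}, p={p:.2g}")
```

At these volumes a 5-point lift is overwhelming evidence, which is why call-scale products can make per-question optimization claims that a low-volume deployment never could.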

That's the position Bodis is staking out, and it's worth attention because it generalizes far beyond voice. In any AI-driven product, the layer above the model is usually where the durable business is built.

Bodis started with small businesses - the pest control company, the pizza shop. Not because the economics were great, but because they gave him constant, frequent, brutal product feedback he couldn't have gotten chasing one enterprise deal that took a year to close. Four or five months of that compressed iteration, and his first call center customer paid more than every small business combined. Then he moved upmarket.

Most founders flip this order - they chase the enterprise logo first because the deal size is bigger, and end up shipping based on their own assumptions because they can't get enough at-bats. Bodis's discipline: find the customer who will tell you what's broken every week, even if they can't pay much, and use them as the iteration substrate for the customer who eventually can.

The investor side of the same lesson

Bessemer didn't come from a deck or a banker. Caroline from Bessemer reached out after a LinkedIn post Bodis had written about doing 300-mile ultra-endurance bike races. The post was about commitment and being a founder. The conversation grew from there. A few months later, Bessemer preemptively offered the round - Bodis didn't shop it. He picked the investor the same way he picks employees: would I want to hire this person? If yes, take the offer.

Closing thought

There's a temptation in AI right now to think that the only game is to build the model. Bodis's career so far is an argument for the opposite: the model is becoming a commodity, and the durable position is the thing built on top of the model that knows how to measure, optimize, and improve a specific outcome. Voice AI is barely past the "books on the internet" stage. The companies that figure out the optimization layer for each vertical will own a lot more value than the ones competing on whose voice sounds most human.

Source: This Startup Built AI That 80% of Callers Think Is Human - Phoneley founder Will Bodis