Week 2: The Humanoid Building Your BMW, vs. The Terminator

May 11, 2026

When most people hear “humanoid robot,” the picture in their head is typically from a movie. Terminator. I, Robot. Ex Machina. The humanoid is uncanny, capable, and either a threat or a friend, depending on the studio. The pitch from the loudest humanoid companies in 2025 is not far off that picture: an artificial general-purpose worker, in your home within a decade, on the factory floor sooner than that.

When a business reader who pays attention to industrial technology hears “humanoid robot,” the picture shifts. It is Figure’s robot, the one in the press releases, working alongside humans on a BMW production line in Spartanburg, South Carolina. It is Apptronik’s Apollo at Mercedes. It is a real machine, in a real factory, building a car a reader can picture buying. The story has moved from science fiction into something that sounds like industrial reality.

When an enterprise leader hears “humanoid robot,” the picture shifts again, and it gets smaller. The BMW and Mercedes deployments are real, and disclosure on the BMW program has now opened up: Figure and BMW have jointly released a single set of headline numbers from an 11-month Spartanburg pilot, roughly 90,000 sheet-metal parts loaded across about 1,250 operational hours, contributing to production of more than 30,000 BMW X3 vehicles. That is real progress. It is also a single coordinated vendor-customer disclosure, with no named integrator, no support contract terms, no multi-site data, and no customer-initiated reporting outside the joint announcement.

The deployment that does have customer-initiated reporting, a named site, and multi-month throughput is Agility Robotics’ Digit, moving 100,000 plastic totes inside a fenced area at a GXO warehouse in Flowery Branch, Georgia. Those are the two publicly disclosed humanoid deployment stories in the world today with meaningful throughput numbers attached. Almost nobody outside the trade press is talking about either of them in detail.

These are three pictures of the same field. The general reader sees the Terminator. The business reader sees the Figure on the BMW line. The enterprise leader sees the Digit moving totes at GXO. They are all looking at the same industry, in the same year, and the gap between the three pictures is the most important thing about humanoid robots right now. My goal with this series is to begin to close that gap.

This is week two. Last week I said the post would be about what I read, what I noticed, and at least one disagreement called by name. The five sources that I dug into this week are:

Rodney Brooks’s blog, specifically his September 26 essay “Why Today’s Humanoids Won’t Learn Dexterity,” plus a Brian Heater interview where Brooks tells a story about a humanoid falling on its face at a cocktail party while two people across the room debated whether to push it over.
Agility Robotics’ November 2025 announcement about the GXO 100,000-totes milestone.
GXO’s 2025 annual report, where the word “humanoid” appears far less often than the analyst coverage of GXO would suggest it should.
Brett Adcock’s Time interview from October 2025.
A long-form interview with Adcock on the Shawn Ryan podcast, where he describes Figure’s separation from OpenAI in some detail.

A few things stood out.

The conversation.

The conversation, for the most part, is between vendors and the people who fund them. The most-cited content is capability demos, founder interviews, funding announcements, and analyst forecasts based heavily on the founder interviews. The vocabulary is “general-purpose,” “embodied AI,” “scaling laws for robotics,” “the foundation model moment for physical work.” The story arc is: capability is improving fast, demos are getting more impressive, mass adoption follows.

What the conversation is mostly not about: who deploys these things, what happens when one of them fails, what the integration looks like, what the support contract looks like, who insures the thing, what the data architecture looks like in production, what a fleet of them costs to run.

If you have watched any other enterprise technology wave from up close, you know the shape of this. The capability story is loud because capability sells the round. The deployment story is quiet because deployment is hard, slow, and unflattering to a vendor narrative built on near-term inevitability. I am not saying anything controversial by pointing this out. I am saying that an executive reading the humanoid discourse over the past several months has been reading the exciting half of a story whose messy half will eventually matter much more. The general reader is mostly reading only the exciting half.

The deployment evidence.

The Agility/GXO 100,000-totes story is the closest thing the field has to a customer-led deployment milestone right now. Named customer, named site, throughput number, multi-month operation. The report surfaced for two days in the trade press and disappeared.

Adcock saying “every home will have a humanoid within ten years” surfaced for two weeks.

I want to be careful about the gap I am highlighting here, because it is the most important thing in this post and I do not want to overstate it. The capability evidence in the field has improved continuously and dramatically. The deployment evidence has improved much more slowly. As of now, there are two humanoid programs with public, named-customer, multi-month operational data and meaningful throughput numbers: Digit at GXO and Figure at BMW. The GXO disclosure is customer-led and ongoing. The BMW disclosure is a single joint vendor-customer release of headline numbers from a defined pilot window. Both count. They do not count equally.

Apptronik has signaled a Mercedes deployment, with much less operational disclosure than either. Tesla Optimus has internal Tesla deployments, which are a different kind of evidence and get their own treatment below. Everything else is a video, and let’s face it, AI videos are pretty good now.

So the capability curve and the deployment curve are not the same curve, and most of what is being written conflates them. That conflation is doing work for the vendors and against the reader. A capability demo answers the question “could this machine, in principle, do this task?” A deployment answers the question “is this machine, in production, doing this task today, reliably, for someone who didn’t build it?” Those are different questions, and the field is treating answers to the first as evidence for the second.

The two curves diverge for reasons that anyone who has run an enterprise rollout will recognize. A controlled environment is not an operational environment. A scripted task is not a mission. Supervised operation is not autonomous operation. A six-week demo timeline is not a three-year production timeline. The list goes on, and every item on it eats months. The deployment curve looks slow because deployment is, in the literal physical sense, slow. The capability curve looks fast because demos are, in the literal physical sense, cheap.

The disagreement.

Brett Adcock said in Time that every home will have a humanoid within ten years, and he has said variants of this many times. Rodney Brooks has said the opposite, in detail and with technical specificity, citing forty years of trying to build robots that could do what humans do with their hands. His claim is that the path the current generation of humanoid companies is on does not get there.

I think Brooks is much closer to right, and I think instinct supports him. Not because he is more famous, or because skepticism is the safer bet, or because forecasts that far out are usually wrong (they are). Because the specific reasons he names are the reasons enterprise deployments fail.

The dexterity problem is the difference between a demo and a job. A robot that can pick a single object off a single table in a single lighting condition under a single instruction is not a robot that can work a shift. The dexterity gap is not “almost there”; it is, by Brooks’s argument, structurally not on the trajectory that the current investment thesis assumes.

The data problem is the difference between a robot that worked once and a robot that works tomorrow. Every humanoid in production produces continuous high-bandwidth multimodal data. The pipelines, the storage, the labeling, the model retraining, the governance: none of that exists at the level the vendor pitch requires, and the silence around it in the founder narratives is its own data point.

The safety problem is not an abstraction either. The Figure whistleblower lawsuit filed in November alleges that Figure’s robot can produce more than enough force to fracture a human skull, that the safety roadmap presented to investors was quietly scaled back after the round closed, and that the engineer who flagged it was fired. I have no view on the merits of the suit. I have a strong view that any enterprise looking at human-adjacent humanoid deployment reads that filing and slows down. The discourse has mostly not slowed down, which tells you what the discourse is currently for.

The household pitch.

The most exciting version of the humanoid story is the humanoid in the kitchen.

Brett Adcock has said many times that every home will have a humanoid within ten years. 1X has announced a consumer humanoid called Neo, pitched for in-home use at around $20,000 or a $499-per-month subscription, squarely aimed at early adopters. Elon Musk has put Optimus at the center of Tesla’s long-term valuation story, with Musk repeatedly suggesting consumer pricing in the low five figures, cheaper than a typical Tesla car, and a stated ambition that Optimus eventually becomes a larger business for Tesla than cars. Whatever you think of those claims, they are the part of the humanoid pitch that has captured the most public imagination, the most retail-investor attention, and the most political airtime.

The household pitch deserves to be reviewed, and then positioned accurately.

There is a long-run argument for consumer humanoids. Developed-country demographics are aging and eldercare labor is in structural shortage in every major economy. Component costs in robotics have fallen and will continue falling. The household, in some sense, represents the largest total addressable market for a general-purpose physical machine. None of that is crazy.

Placed accurately, the household pitch sits further from any plausible household deployment than the factory pitch sits from any plausible factory deployment, and that factory gap is already quite large. The factory case has two publicly disclosed multi-month deployments, one customer-led and one vendor-coordinated. The household case has zero. It has consumer announcements, preorder pages, demo videos, and pricing claims. It does not have a single named consumer with a humanoid working unsupervised in their home over a meaningful period, doing meaningful work, with any disclosure of how often the machine fails, how it is supported, who fixes it when it breaks, what happens when it falls on a child or a pet, or how the data it generates is governed.

The home, from a deployment perspective, is a much harder environment than the factory. A factory is engineered: the lighting is controlled and the surfaces are predictable. The humans on the floor have been trained on safety protocols, and there is a maintenance team on site. A home has none of that. A home has a toddler, a dog, a staircase with an unusual carpet, a kitchen island the homeowner moved last week, and an internet connection that drops twice a day. Getting from “this machine works in a Tesla plant” to “this machine works in a Tesla owner’s house” is an entirely different problem.

Tesla is the place this gets most interesting, because Tesla is the only humanoid program whose entire strategy is built around skipping the third-party deployment phase. Figure, Agility, and Apptronik need named industrial customers to validate their machines. Tesla does not. Tesla builds Optimus, deploys it in Tesla factories building Teslas, and points to that as both the proof and the product. The consumer story comes later, on a timeline Tesla controls, with no integrator ecosystem to build because Tesla owns both ends.

That is a coherent strategy, but also a strategy with a specific blind spot. A humanoid that works in a Tesla plant has been trained in, designed for, and supported by the most controlled, vertically integrated industrial environment in modern American manufacturing. What that tells us about whether the same machine works in someone else’s factory, let alone someone’s home, is not much. Tesla’s vertical integration is the source of its execution speed and the reason its deployment claims are the hardest to evaluate from the outside. The same property that lets Tesla iterate fastest is the property that makes the iteration least informative about anyone else’s environment.

So, the household pitch is a category. The consumer humanoid is a coherent long-term market. Tesla is doing something structurally different from the other vendors, but none of that changes the read of the evidence. The near-term deployment evidence is two industrial humanoid programs, one customer-led at GXO and one vendor-coordinated at BMW. The household humanoid in your kitchen, on the timeline being pitched, is a different conversation, on different evidence, that the discourse is currently treating as though it were the same conversation.

It is not the same conversation. It is a Grand Canyon-sized gap.

What I keep thinking about.

In the Brian Heater interview, Brooks said: “Very few executives at humanoid companies have ever deployed robots.” I wonder if they have ever deployed anything to the enterprise. I don’t say that to be catty, but it’s an honest question from someone who has made a career of doing just that.

The companies raising the largest rounds are run by people who have not lived through what it takes to keep an asset in production, in someone else’s facility, with their name on the support contract. They have not been on a 2 a.m. call about a unit that won’t restart. They have not been in the procurement review where the deployed cost comes in at multiples of the unit price. They have not been in the safety meeting where the customer’s operations leader says no, you don’t get to put that thing next to my people until I see your failure-mode documentation, and the response is a slide deck that doesn’t have one.

That does not mean these founders are wrong about everything. It means that the operating model knowledge that makes a deployment work at scale is mostly not in the room when these companies are pricing pilots. It means that the gap between capability and deployment is not just technical in nature, but a gap in lived experience inside the companies building the technology.

If you have been around a technical revolution or two, you will recognize this pattern. The vendor who has never run the system at production scale prices the pilot like a demo, scopes the support like a SaaS contract, and discovers eighteen months later that the customer’s operations team has built an entire side organization just to keep the thing running. The vendor’s term for this is “implementation challenges.” The customer’s term for it is “the pilot graveyard.”

What I think I can write about usefully.

This is the part of the field that the public conversation is missing, and it is the part I think I can write about usefully. Not because I know more about robotics than the people building them. I absolutely do not.

What I do know better than most is how enterprise technology gets deployed and adopted. I have watched the path from “capability demonstrated” to “asset producing reliable value in someone else’s operation” enough times to recognize when a story is telling me about the first half of that path and asking me to imagine the second.

The founder narratives, almost by construction, are not equipped to tell the deployment half of the story. The analysts are mostly downstream of the founders, and the trade press is downstream of both. The professionals who could tell that half of the story are mostly inside enterprises or large consulting firms, are mostly not writing, and when they do write, it is mostly in private channels.

So that is the part I am going to keep watching. Not the demos, the deployments. The integration contracts, the support models, the data architecture, the safety cases, the fleet management software. The total cost past the unit price, and the change management. The things that, in every previous enterprise technology wave, turned out to be where the value was either captured or destroyed.

What’s next.

Next week I want to spend time on what “humanoid” is doing as a category, because the word is being asked to cover half a dozen very different machines and the conflation is making the conversation convoluted. A bipedal logistics robot moving totes inside a fenced area is not the same machine, the same buyer, or the same timeline as a humanoid-form mobile manipulation robot working alongside humans on a factory line, which is not the same as a teleoperated training-mode unit, which is not the same as a defense-application humanoid (a category that deserves its own week and will get one), which is not the same as a Tesla-style vertically integrated consumer play, which is not the same as whatever the open-market home-robot pitch is trying to be. The word “humanoid” is being used to cover all of those, and the field is paying for it.

The list of things I want to read for that one is already getting long: vendor product pages from Figure, Agility, Apptronik, 1X, Boston Dynamics, Tesla, Unitree, Fourier; the IEEE Spectrum coverage of the category; the latest Brooks predictions essay.

One ask before I close.

If anyone reading this knows of a deployment story, in any humanoid program, with named customer, named site, and operational throughput numbers I have not already mentioned here, I would like to read it. Right now my list has two entries on it, one customer-led and one vendor-coordinated. That is part of the story.

Views are personal and do not represent any employer, past or present.

Notes from Adam Mattis
