The Ghost in the 'Wrong' Machine. The future of humanoid robots
We've built god-like minds for our robots. The problem is their bodies.
There is a scene that perfectly sums up the strange schizophrenia of our technological moment. It happened in early 2024. Tesla released a video of Optimus, its robotic messiah, folding a T-shirt. The camera, with that sterile laboratory aesthetic so popular in California, showed metal hands slowly picking up a black garment and folding it on a table. The gesture was slow, almost meditative. It was, in appearance, the ultimate domestication of the machine: the automaton turned housekeeper.
The video, of course, went viral. But then, the new theologians of our time, the frame analysts on social media, noticed something strange. An almost human hesitation, a tremor unbecoming of an algorithm. In the lower right corner of the shot, a human hand entered and left the frame, like a clumsy stage director. Suspicion turned to mockery when Elon Musk himself, chief prophet of this new religion, admitted sheepishly that, well, the robot was not acting autonomously. Not yet, anyway.
That poorly folded T-shirt is not an anecdote. It is the fold that reveals the truth: general-purpose humanoid robotics, as it is sold to us, is a spectacular magic trick. And to understand the trick, we need not look to engineers, but to a 17th-century French philosopher and his most scathing critic.
A Ghost with a PlayStation Controller
In 1641, René Descartes split us in half. He proposed that human beings were a strange amalgam of two substances: res extensa (the body, a machine of flesh and bone subject to the laws of physics) and res cogitans (the mind, an immaterial, thinking and free entity). The big problem with his theory, which tortured him until the end, was explaining how on earth the two communicated. How could a thought, an immaterial ghost, make an arm move?
Three hundred years later, in 1949, the philosopher Gilbert Ryle mocked this idea by coining one of the most brilliant terms in 20th-century philosophy: ‘the ghost in the machine’. For Ryle, Cartesian dualism was a ‘category mistake,’ a logical absurdity like visiting Oxford University and, after seeing the colleges, libraries, and laboratories, asking, ‘But where is the University?’ The mind, Ryle said, is not a spectral pilot driving a body; it is simply the sum of all the abilities and dispositions of that body.
The irony is so delicious that it almost seems to have been written by a screenwriter. Seventy-five years after Ryle’s rebuke, the vanguard of Silicon Valley has invested billions of dollars in proving that, in order to make a humanoid robot work in 2025, you do need a ghost in the machine.
The most blatant example occurred at Tesla’s ‘We, Robot’ event. There, Optimus robots not only folded T-shirts, but also served drinks, played games and posed with astonishing naturalness. It looked like the future, served on a silver platter. The reality, revealed by the company itself, is that much of that autonomy was a sham. It was teleoperation. In an adjoining room, off-camera, an army of very material ghosts, wearing virtual reality headsets and holding controllers, were pulling the strings. The robot was not an autonomous being; it was an extremely expensive puppet. The ghost in the machine exists, only now it charges by the hour and probably uses a PlayStation controller.
The Learning Casino and Real-World Revenge
Proponents of this technology argue that this is only a temporary phase. That the real leap forward will come from Reinforcement Learning (RL), and more specifically, Deep Reinforcement Learning (Deep RL). The idea is appealing: instead of programming every move, you create a computer simulation and let the AI ‘learn’ for itself through millions of trials and errors, receiving virtual rewards when it does something right. It’s like training a dog, but with infinite patience and a monumental electricity bill.
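The trial-and-error loop described above fits in a few lines. What follows is a toy sketch, not how any robot maker actually trains its machines: a tabular Q-learning agent (a classic RL method, used here for brevity; Deep RL swaps the table for a neural network) learns to walk down a six-cell corridor to a reward at the far end. The function names, the corridor and every parameter value are assumptions made for illustration.

```python
import random

def train_corridor(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                   epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy corridor: start in cell 0, reward in the last cell."""
    rng = random.Random(seed)
    # q[s][a] is the learned value of action a (0 = left, 1 = right) in cell s.
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap the length of each attempt
            # Explore at random sometimes (and on ties); otherwise act greedily.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else min(goal, s + 1)
            reward = 1.0 if s_next == goal else 0.0  # the "virtual treat"
            # The goal is terminal, so no future value is added there.
            future = 0.0 if s_next == goal else gamma * max(q[s_next])
            q[s][a] += alpha * (reward + future - q[s][a])
            s = s_next
            if s == goal:
                break
    return q

def greedy_policy(q):
    """Preferred action in every non-goal cell after training."""
    return ["right" if right > left else "left" for left, right in q[:-1]]

if __name__ == "__main__":
    print(greedy_policy(train_corridor()))
```

Nobody tells the agent that "right" is the answer; it stumbles on the reward and the preference spreads backwards through the table, one failed attempt at a time. Every one of those failures is free, which is exactly the luxury the real world does not offer.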
The problem is that this method has as much to do with reality as an online poker game has to do with surviving in the jungle. In the digital casino of simulation, the robot can afford to fail a million times to learn how to pick up an object. The cost of each failure is zero. In the real world, a single failure can mean a Ming dynasty vase smashed to pieces, a short circuit, or an amputated finger.
This unbridgeable gap is what engineers call the sim-to-real transfer problem. And this is where Moravec’s Paradox, that old unwritten law of robotics, comes back to laugh in our faces. We can get AI to compose symphonies or discover new proteins (tasks that seem to us to be the pinnacle of intelligence), but we fail miserably at teaching it to walk on a wrinkled carpet or open a jar of gherkins (tasks that a three-year-old child can master).
The reason is that the physical world is computational hell. Friction, gravity, elasticity, unpredictable light... every interaction with reality is a negotiation with a chaos of variables that no simulation can fully replicate.
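A deliberately tiny sketch shows how little it takes for that gap to open. Assume a robot must push a block so that it slides to a stop at a mark half a metre away, and assume a textbook friction model in which the sliding distance is v^2 / (2 * mu * g). The push is tuned perfectly against the simulator's friction coefficient and then replayed on a table whose coefficient is slightly different. Both coefficients, the target and the function names are hypothetical, chosen only for illustration.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def slide_distance(speed, friction):
    """Distance a block slides before kinetic friction stops it: v^2 / (2*mu*g)."""
    return speed ** 2 / (2.0 * friction * G)

def tune_push_speed(target, friction):
    """Invert the model: launch speed that stops the block at `target` metres."""
    return math.sqrt(2.0 * friction * G * target)

SIM_FRICTION = 0.30   # hypothetical coefficient assumed by the simulator
REAL_FRICTION = 0.36  # hypothetical coefficient of the actual table
target = 0.50         # metres

# "Training": pick the push that is perfect according to the simulator.
speed = tune_push_speed(target, SIM_FRICTION)

# Replay the same push in both worlds and compare against the target.
sim_error = slide_distance(speed, SIM_FRICTION) - target
real_error = slide_distance(speed, REAL_FRICTION) - target

if __name__ == "__main__":
    print(f"error in simulation: {sim_error * 100:.1f} cm")
    print(f"error on the table:  {real_error * 100:.1f} cm")
```

In this toy model a push that is flawless in simulation stops about eight centimetres short on the real table, and friction is only one of the variables. A real hand negotiates dozens of them at once, none of which it can measure exactly.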
Surprisingly, the “global reinforcement learning in robotics market size reached USD 1.82 billion”, and it is projected to reach an estimated USD 14.98 billion by 2033.
Investors, control and the single-hand problem
So, if the challenges are so fundamental, why do we see these spectacular demonstrations? Why are billions being invested in humanoids that are, at heart, little more than body double actors?
The answer lies with the audience. Those who sign the cheques are not usually experts in control engineering. A venture capitalist understands an exponential growth curve in software performance; they understand much less about the physical limitations of an actuator or the intractability of the contact problem in robotics. It is infinitely easier to sell a PowerPoint with the promise of ‘embodied general AI’ than to explain why a hinge remains an unsolved engineering problem.
What Tesla and other startups are selling is not a product; it is a narrative. A resurrection of the Cartesian dream: the promise that a software ‘soul’ (a giant language model, a neural network) can be downloaded into a body and, as if by magic, bring it to life and give it meaning. In fact, Tesla is now trapped in a single-hand problem!
The human hand has 27 degrees of freedom and is controlled by 20 muscles in the hand and 20 in the forearm. The forearm muscles develop most of the power, while the intrinsic muscles of the hand are essential for fine control and proprioception, which matter for tasks like playing the piano or disassembling a car. By comparison, the Tesla Optimus hand had 22 degrees of freedom. Replicating the human hand's versatility and dexterity in a robotic hand accounts for some 80% of the engineering difficulty.
According to Elon Musk, manufacturing the robotic hand at scale is 100 times harder than designing it. The problem is also a huge and hierarchical one, since some muscles cannot be moved independently.
But as Gilbert Ryle warned us, this is a category mistake. Intelligence is not a ghost that can be transplanted. It is the result of a body and a brain that have evolved together over millions of years in a constant dance with the brutal and wonderful physics of the real world.
I’m not saying humanoids aren’t going to happen, but there are a lot of challenges to be solved before the economics of humanoids can work out. The progress is amazing but making the value larger than the cost is really hard: folks are going to have to nail both “very low cost robots” and “high productivity speeds”.
The robot that folds our clothes will come, probably. But it will not be the result of miraculous software installed in a pretentious mannequin. It will be the culmination of the work of those forgotten ‘plumbers’ of engineering who struggle with the friction, balance and fragility of a world that cannot be simulated. In the meantime, we will continue to watch a high-tech puppet show, applauding the ghost and pretending not to see the strings. And there’s also an important question left: do people really want humanoids in their homes?
We’ll see.




