Tesla's robot is now working in the factory. Musk: its hands will reach 22 degrees of freedom this year!

Last updated: 2024-05-06

By Jin Lei, from Aofei Temple
Qubit | WeChat official account QbitAI

Tesla has released a new video of its Optimus robot, which can now work in the factory.

At normal speed, it sorts battery cells (Tesla's 4680 cells) like this:

Tesla also showed the same work at 20x speed, with the robot picking cell after cell at a small "workstation":

One highlight of this video is that Optimus does this work in the factory "completely autonomously", with no human intervention at any point.

From Optimus's own point of view, we can also see it pick up a crooked cell and place it again, correcting its own errors automatically:

Nvidia scientist Jim Fan had high praise for Optimus's hand:

Optimus has one of the most dexterous hands among five-fingered robots in the world.

Not only does its hand have tactile sensing, it also has 11 degrees of freedom (DoF), while its peers mostly have only 5-6.

And it's durable enough to withstand a lot of object interaction without constant maintenance.

Musk also showed up in the replies to Jim Fan's post and revealed even bigger news:

Later this year, the Optimus hand will have 22 degrees of freedom!

That said, the video of Optimus sorting batteries is just the appetizer.

This time, in a rare move, Tesla also released details of how the robot is trained.

Similar logic to Tesla cars

First, the neural network. The captions in the video show that Tesla has deployed an end-to-end neural network on Optimus, trained for the battery-sorting task.

As a result, the network takes its input only from Optimus's 2D cameras and the tactile and force sensors in its hands, and directly outputs joint control sequences.
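To make the input and output shapes concrete, here is a minimal sketch in Python. It is not Tesla's network: a single random linear map stands in for the end-to-end model, and every name and size in it (`Observation`, `LinearPolicy`, the 64-value image, the 4-step horizon) is an assumption made up for illustration. Only the overall shape follows the description above: raw camera, tactile, and force readings go in, and a sequence of joint targets comes out, with no hand-written stages in between.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    """Raw sensor readings for one time step (all sizes are hypothetical)."""
    image: List[float]    # flattened 2D camera pixels
    tactile: List[float]  # tactile readings from the hand
    force: List[float]    # force readings from the hand


class LinearPolicy:
    """Toy stand-in for an end-to-end network: one linear map, sensors to joints."""

    def __init__(self, obs_dim: int, num_joints: int, horizon: int, seed: int = 0):
        rng = random.Random(seed)
        self.obs_dim = obs_dim
        self.num_joints = num_joints
        self.horizon = horizon
        # One weight row per output value (horizon * num_joints outputs in total).
        self.weights = [
            [rng.uniform(-0.01, 0.01) for _ in range(obs_dim)]
            for _ in range(num_joints * horizon)
        ]

    def act(self, obs: Observation) -> List[List[float]]:
        """Map raw readings directly to a sequence of joint targets."""
        x = obs.image + obs.tactile + obs.force
        if len(x) != self.obs_dim:
            raise ValueError(f"expected {self.obs_dim} inputs, got {len(x)}")
        flat = [sum(w * v for w, v in zip(row, x)) for row in self.weights]
        # Reshape the flat output into `horizon` steps of `num_joints` targets each.
        return [
            flat[t * self.num_joints:(t + 1) * self.num_joints]
            for t in range(self.horizon)
        ]


if __name__ == "__main__":
    obs = Observation(image=[0.5] * 64, tactile=[0.1] * 5, force=[0.0] * 6)
    policy = LinearPolicy(obs_dim=75, num_joints=11, horizon=4)
    sequence = policy.act(obs)
    print(len(sequence), "steps of", len(sequence[0]), "joint targets")
```

The 11 joints here simply echo the hand's 11 degrees of freedom mentioned earlier. A real system would use a trained deep network and far larger inputs.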

Tesla engineer Milan Kovac further revealed that this neural network runs entirely on the robot’s embedded FSD computer and is powered by the onboard battery:

As we add more diverse data during training, a single neural network can perform multiple tasks.

As for training data, the video shows human operators wearing VR headsets and gloves and collecting it through teleoperation:

Jim Fan commented on this:

It is very important to set up software that streams first-person video in and precise control commands out while keeping latency extremely low.

This is because humans are very sensitive to even the smallest delay between their own movements and those of the robot.

And Optimus happens to have a fluid full-body controller that can mirror human poses in real time.

Tesla has also extended this approach to other tasks:
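The latency point can be illustrated with a small Python sketch that compares when an operator's motion was sent with when the robot mirrored it. This is an illustration only: the function name `latency_report`, the timestamps, and the 50 ms budget are all assumptions, not figures from Tesla or Jim Fan.

```python
from statistics import mean
from typing import Dict, List


def latency_report(sent_ms: List[float], mirrored_ms: List[float],
                   budget_ms: float) -> Dict[str, float]:
    """Summarize the delay between operator motions and the robot's matching motions.

    sent_ms[i] is when the operator's i-th motion was captured;
    mirrored_ms[i] is when the robot reproduced it. Both are in milliseconds.
    """
    if not sent_ms or len(sent_ms) != len(mirrored_ms):
        raise ValueError("need two non-empty timestamp lists of equal length")
    delays = [m - s for s, m in zip(sent_ms, mirrored_ms)]
    if any(d < 0 for d in delays):
        raise ValueError("a motion cannot be mirrored before it was sent")
    return {
        "mean_ms": mean(delays),
        "worst_ms": max(delays),
        "over_budget": sum(1 for d in delays if d > budget_ms),
    }


if __name__ == "__main__":
    # Made-up timestamps; the last motion arrives late.
    report = latency_report(
        sent_ms=[0, 10, 20, 30],
        mirrored_ms=[12, 25, 31, 95],
        budget_ms=50,  # hypothetical budget
    )
    print(report)
```

Tracking the worst case alongside the mean matters here, because a single late frame is exactly the kind of delay a human operator notices.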

This scale also shocked Jim Fan:

To collect data in parallel, one robot is not enough, and humans have to work in shifts every day.

An operation at this scale is probably unimaginable in an academic lab.

Not only that, the tasks Optimus performs in the video are also diverse, including sorting batteries, folding clothes, and organizing items.

Milan Kovac said Tesla has deployed several robots in one of its factories, and they are being tested and continuously improved at real workstations every day.

All in all, Optimus trains based solely on vision and human demonstrations, which is somewhat similar to the logic of Tesla cars.

At the end of the video, Tesla also revealed another improvement in Optimus's capabilities, namely that it can now go farther:

One More Thing

Jim Fan's lab has also announced new work in the past two days: getting a robot dog to walk on a yoga ball!

Its training method is completely different from that of Tesla's Optimus. Training takes place entirely in simulation, and the result is then transferred zero-shot to the real world, where it runs directly without fine-tuning.

The technology behind it is the team's newly launched DrEureka, which builds on Eureka, the technology behind the team's earlier demo of a five-fingered robot hand spinning a pen.

DrEureka is an LLM agent that can write code to train robots' skills in simulation, and then write more code to bridge the difficult gap between simulation and reality.

In short, it completely automates the process from new skill learning to real-world deployment.
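One common way to narrow the gap between simulation and reality is domain randomization: varying the simulator's physics from episode to episode so that the learned skill does not depend on any single setting. The Python sketch below illustrates that general idea only. It is not DrEureka's code, and the parameter names and ranges are hand-written assumptions; in a DrEureka-style pipeline, this kind of configuration is what the LLM agent would be asked to write.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical physics ranges, hand-picked for illustration only.
RANGES: Dict[str, Tuple[float, float]] = {
    "friction": (0.3, 1.2),
    "payload_kg": (0.0, 2.0),
    "motor_strength": (0.8, 1.2),
}


def sample_physics(ranges: Dict[str, Tuple[float, float]],
                   rng: random.Random) -> Dict[str, float]:
    """Draw one set of simulator physics parameters, uniformly within each range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}


def randomized_episodes(ranges: Dict[str, Tuple[float, float]],
                        num_episodes: int, seed: int = 0) -> List[Dict[str, float]]:
    """Produce a fresh physics configuration for every training episode."""
    rng = random.Random(seed)
    return [sample_physics(ranges, rng) for _ in range(num_episodes)]


if __name__ == "__main__":
    for i, params in enumerate(randomized_episodes(RANGES, num_episodes=3)):
        print(f"episode {i}: {params}")
```

Each episode's parameters would be passed to the simulator before the policy is rolled out. Choosing the ranges well is the hard part, since ranges that are too narrow do not cover reality and ranges that are too wide make the task unlearnable.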

Comparing the training methods of Tesla's Optimus and NVIDIA's robot dog, Jim Fan offered an incisive summary:

Teleoperation is a necessary but not sufficient condition to solve the problem of humanoid robots. Fundamentally, it doesn't scale.

And some netizens agreed with this:

So what do you think?

Reference links:
[1] https://twitter.com/Tesla_Optimus/status/1787027808436330505
[2] https://twitter.com/DrJimFan/status/1787154880110694614
[3] https://twitter.com/DrJimFan/status/1786429467537088741
[4] https://twitter.com/_milankovac_/status/1787028644399132777