Tesla To Make Humanoids, With Prototype Slated For Next Year, Says Elon Musk And Other Live Updates From AI Day

We are excited to bring you the Tesla Inc TSLA AI Day as it happens. The event takes place at the automaker’s Palo Alto, California headquarters; it was slated to begin at 8 p.m. ET but is running slightly behind schedule.

Details on announcements are murky but CEO Elon Musk said in July that this year’s AI Day would have the sole aim of convincing top talent in the artificial intelligence space to join the electric vehicle maker.

There are expectations that Tesla will disclose progress on both “hardware and software” for training and inference. Loup Ventures analyst Gene Munster expects AI Day to be an extension of the Autonomy Day event held in 2019, and the Musk-led company is expected to showcase its AI progress beyond the vehicle fleet.

Also on watch are any updates related to the highly anticipated Dojo training supercomputer, which is expected to replace Tesla’s existing supercomputer.

The event takes place as the National Highway Traffic Safety Administration, or NHTSA, investigates the company’s Autopilot feature following multiple crashes involving emergency vehicles.

Follow the space below for live updates.

11:06 p.m. That’s all, folks. Thank you for joining us on this ride. Analysts can now begin their work and will likely be busy for days.

11:03 p.m. Musk thanks people for coming and for the great questions! Event ends with a call to join the team at Tesla.com/AI.

11:02 p.m. With Hardware 4, next-generation cameras are coming! The limits of the current cameras have not yet been reached, says Musk in response to a question about cameras being worse than the human eye. In the future, people will say we cannot believe we had to drive these cars ourselves. All cars will be autonomous and electric, of course.

11:01 p.m. We should be worried about AI, says Musk. Tesla is working on narrow AI. When you start getting to superhuman intelligence, all bets are off. That will probably happen. Tesla is working on useful AI that people love and that is unequivocally good.

10:59 p.m. Musk again calls for people to join up! 

10:58 p.m. There are still some nets not using surround video, which Tesla is working on, says Musk. 

10:54 p.m. Question on unpredictable situations on the road — before we introduce something into the field, it is run in shadow mode. Effectively, the drivers are training the neural network, says Musk.

10:53 p.m. If you wear a T-shirt with a stop sign on it, the car will stop. Musk has experimented with that!

10:51 p.m. The current FSD computer can achieve full self-driving better than humans. Hardware 4 will probably be introduced with the Cybertruck, says Musk.

10:49 p.m. A question on the short- to medium-term economics of the Bot: repetitive, boring tasks are not highly compensated, so how will the economics work? “We will just have to see,” says Musk.

10:46 p.m. Question on the design of the Tesla Bot. This is just gonna be Bot Version 1, but it needs to be able to do things that people do — a generalized kind of humanoid bot. You could give it two fingers and a thumb; for now we will give it five fingers. It doesn’t need incredible grip strength, but it should be able to carry your bag — that kind of thing.

10:43 p.m. Auto labeling is critical to the self-driving problem, says Musk. The car’s predictive ability is “eerily good” — it can predict out-of-sight roads very well.

10:42 p.m. Tesla Bot question. We certainly hope this does not feature in a dystopian sci-fi movie. Trying to be as literal as possible. It could be your buddy too if you wanna have a beer. People will think of some “very creative uses,” says Musk.

10:40 p.m. Overwhelmingly, the data used for training is real-world video from vehicles, says Musk. Simulations are used for rare edge cases, which is useful for accident reduction.

10:38 p.m. On Tesla utilizing bots, Musk says it's gonna start with work that is boring, repetitive or dangerous — the kind of work people least want to do.

10:37 p.m. Simulator is very helpful in rare cases. The better Tesla cars become at avoiding accidents, the less the simulator comes into play, Musk says in response to a question.

10:32 p.m. I discourage the use of machine learning, says Musk. 99.99% of the time you don’t need it. You reach for machine learning only when you need it. That might change when you have a humanoid robot that can understand normal instructions.

10:30 p.m. Dojo will handle training, beyond the inference done in the car, says Ganesh. Dojo is a generalized neural network training computer, says Musk. CPUs and GPUs were not designed for training. “Let’s just ASIC the whole thing,” adds the Tesla CEO.

10:24 p.m. Question related to AI in manufacturing now. Parts of the Tesla production system are completely automated and some are completely manual; most are automated. If Tesla does not make a humanoid robot, someone else will — so Tesla wants to make sure it is safe. Volume manufacturing is critical for humanoid robot production so as to keep costs low, says Musk.

10:23 p.m. The prime directive for the system is “don’t crash,” which is the same for every country. Cars are good at not crashing — they won’t even hit a UFO that just dropped out of the sky, claims Musk.

10:20 p.m. Question on geographic diversity in FSD data. Data from 50 different countries is used, but for training Tesla focuses on the United States. Roads in Canada are different, says Musk. Extrapolation to the rest of the world will come later.

10:19 p.m. Some amount of audio is used in simulations, for example to gauge emergency vehicles; if people yell at the car, the car needs to understand that as well. These are required for full autonomy, says Musk in response to a question.

10:17 p.m. Do you have plans to expand simulations to other parts of the company, comes a question. Musk says Tesla wants to expand it into a universal simulation platform. Optimus is the code name for the Tesla Bot! Simulation will be extended to it in the future.

10:16 p.m. Dojo will be operational next year, says Musk. The primary application initially is training on vast amounts of video. Reducing training time is key.

10:15 p.m. On Dojo scalability and distribution, Bill says those problems are not fully solved yet; the difficulty is how to preserve locality. There is a clear path for Tesla’s applications, though. Ganesh says modularity takes care of internal applications.

10:14 p.m. Musk is inviting questions now. Focus shifts to the Tesla Bot. A question is asked about publishing and open sourcing. Musk remains silent, the crowd laughs. He says, well, it is fundamentally expensive to create the system, and somehow it has to be paid for. I am not sure how to pay for it if it’s open source, unless people want to work for free. If other car companies want to license it, that would be cool. It’s not limited to Tesla cars.

10:12 p.m. Musk is talking about labor and the economy. “What happens when there is no shortage of labor?” But not right now, because this robot doesn’t work yet. “In the future, work will be a choice,” says Musk. Is there an actual limit to the economy? Probably not. “Join our team and help build this!”

10:11 p.m. The Tesla Bot will be friendly and take care of dangerous, repetitive tasks. At a mechanical level, you can run from it and overpower it — just to be safe! If you can run faster than 5 miles per hour, you will be safe, jokes Musk. Is Tesla planning to replace its workers with these Tesla Bots? One cannot help but wonder!

10:10 p.m. Tesla is arguably the world’s largest robotics company. Our cars are semi-sentient beings, says Musk. Dojo and the neural nets — it makes sense to extend them to the humanoid form. We will have a prototype next year.

10:07 p.m. Ganesh says Tesla is recruiting “heavily” to continue its AI journey, and now hands over the stage to CEO Elon Musk.

10:06 p.m. Dojo will be the fastest AI training computer, with 4x the performance, 1.3x better performance per watt and a 5x smaller footprint compared to what exists today, says Ganesh.

10:05 p.m. The driver stack takes care of multi-host and multi-partition setups. Tesla also has profilers and debuggers in its software stack. Vertical integration is done, with modularity present up and down the stack.

10:04 p.m. The compiler is capable of handling loops and the like. The stack is built as an extension to PyTorch and generates code on the fly, which can be reused for subsequent executions.
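
Tesla’s compiler stack itself is not public, but the general pattern described here — capture the model once, generate code on the fly and reuse it on later runs — can be illustrated with stock PyTorch’s TorchScript tracing (a loose analogue, not Tesla’s stack):

```python
# Illustrative only: Tesla's Dojo compiler is proprietary. As a loose
# analogue of "capture the model once, generate code on the fly, reuse it
# for subsequent executions," here is the same pattern with stock
# PyTorch TorchScript tracing.
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(64, 10)

    def forward(self, x):
        return torch.relu(self.fc(x))

model = TinyNet().eval()
example = torch.randn(1, 64)

# Trace once: the Python model is captured into a TorchScript graph.
compiled = torch.jit.trace(model, example)

# Subsequent calls reuse the already-captured graph rather than
# re-running the Python definition each time.
for _ in range(3):
    out = compiled(torch.randn(1, 64))
print(out.shape)  # torch.Size([1, 10])
```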

10:03 p.m. Model parallelism normally cannot be extended past chip boundaries, but Tesla can extend it across training tiles and beyond.

10:02 p.m. Users only have to change their scripts minimally. The compiler uses multiple techniques for parallelism.

10:01 p.m. Now Ganesh is talking about software. The compute plane can be partitioned into Dojo Processing Units (DPUs). A DPU is made up of one or more D1 chips plus an interface processor and one or more hosts, and it can be scaled up or down.

10:00 p.m. Ganesh is touting modularity. Two trays can deliver 100 PFLOPS of compute, but Tesla did not stop there: it created an ExaPod, which is capable of 1.1 EFLOPS. More than a million training nodes went into it.
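
A quick back-of-the-envelope check of how those figures stack up, using the per-chip node count and per-tile compute quoted elsewhere in the presentation; the chips-per-tile and tiles-per-ExaPod counts are assumptions drawn from reported Dojo specs, not numbers given in this live blog:

```python
# Back-of-the-envelope check of the quoted Dojo figures.
# The 25-chips-per-tile and 120-tiles-per-ExaPod counts are assumptions
# drawn from widely reported Dojo specs, not numbers given in this article.
NODES_PER_D1_CHIP = 354      # training nodes per D1 chip (quoted below)
PFLOPS_PER_TILE = 9          # compute per training tile (quoted below)
CHIPS_PER_TILE = 25          # assumption
TILES_PER_EXAPOD = 120       # assumption

nodes = NODES_PER_D1_CHIP * CHIPS_PER_TILE * TILES_PER_EXAPOD
eflops = PFLOPS_PER_TILE * TILES_PER_EXAPOD / 1000

print(f"training nodes per ExaPod: {nodes:,}")  # 1,062,000 -> "more than a million"
print(f"ExaPod compute: ~{eflops:.2f} EFLOPS")  # 1.08 -> the quoted 1.1 EFLOPS
```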

9:58 p.m. 9 PFLOPS is the unit of scale for Tesla’s system. Ganesh holds one up for the audience, which bursts into applause. Tesla got its first functional training tile last week.

9:57 p.m. A new way of feeding power, vertically, through a custom voltage regulator module. The result: a fully integrated training tile.

9:55 p.m. Tesla came up with an integration process using what it calls a training tile, which strives to preserve bandwidth. A training tile delivers 9 PFLOPS. Engineers had to devise new methods to make this a reality.

9:54 p.m. D1 chips can be seamlessly connected, and Tesla can join 500,000 training nodes without any glue logic to form its compute plane.

9:52 p.m. Tesla’s D1 chip is a pure machine learning machine capable of 362 TFLOPS and is designed completely in-house.

9:51 p.m. The approach is modular: a compute array is made up of 354 training nodes capable of 362 TFLOPS.

9:50 p.m. Ganesh is describing the high-performance training node, which is capable of 1,024 GFLOPS with 512 GB/s of bandwidth in each cardinal direction.

9:49 p.m. Tesla wanted a top to bottom approach to scale up performance. Tesla’s smallest entity of scale is called a training node. Tesla wanted to address latency and bandwidth issues.

9:48 p.m. Dojo encompasses a large compute plane, extremely high bandwidth, and low latencies.

9:47 p.m. Tesla came up with a distributed compute architecture. Ganesh says it is easy to scale up compute, but not so easy to reduce latency.

9:46 p.m. Ganesh Venkataramanan, Senior Director for Autopilot Hardware, describes himself as the lead of Project Dojo. There is an insatiable demand for speed and capacity for neural network training, and that’s why Dojo came into being, says Ganesh.

9:45 p.m. Latency and frame rate are crucial. Milan is describing the cars’ dual-SoC design and the underlying infrastructure used to run the car. Tesla uses a massive data-center infrastructure and has tools for the cars on the road. They are now just shy of 10,000 GPUs.

9:44 p.m. Over 10 billion labels generated, as per Milan. He is talking about how Tesla generates its training data.

9:44 p.m. Milan Kovac, Director of Engineering for Autopilot, walks on stage to talk about computing power.

9:42 p.m. Tesla uses scenario reconstruction to recreate the synthetic worlds that help train Autopilot. Neural rendering can enhance the realism. The simulation data involves over 371 million images.

9:41 p.m. Most data is created algorithmically instead of artists doing the work. This allows the company to create scalable scenarios.

9:40 p.m. The things needed to produce such simulations are accurate sensor simulation, photorealistic rendering, and diverse actors and locations — for example, an animal on the road. Tesla has 2,000 miles of road in its simulations.

9:39 p.m. Such simulations help when data is difficult to obtain; this lets Autopilot train for situations that are possible but rare on roads in daily life: Ashok.

9:38 p.m. Ashok is showing an awe-inspiring computer-rendered clip, which looks exactly like a real-life road.

9:37 p.m. In essence, Tesla can remove humans from labeling.

9:36 p.m. Tesla’s fleet produced 10,000 clips, which were automatically labeled within a week, says Ashok.

9:34 p.m. The system as a whole can produce excellent kinematic labels, which is huge for Tesla. The company wants to produce a million such clips. This is all part of the effort to remove radar, which was done in three months.

9:32 p.m. Different clips can be stitched together to produce an effective map of an area, but this is used more for labeling. The system is good enough to even gauge barriers, walls, hedges and so on.

9:29 p.m. Auto labeling is now being discussed by Ashok. Tesla collects video clips through its own or customer vehicles, and AI then does the labeling.

9:27 p.m. In the beginning, labeling was done in image space, which involved drawing directly over images, but Tesla has now graduated to labeling directly in 3D vector space. This gave Tesla a massive increase in throughput, but even this is not enough.

9:26 p.m. Andrej talks about how Tesla generates training data. Tesla worked with a third party in the past to get data sets, but labeling has now been brought in-house. The data labeling team is now 1,000 strong and works closely with Tesla engineers.

9:25 p.m. Andrej is back on stage. He is talking about neural networks, parameter setting and the importance of “massive data sets.”

9:24 p.m. The approach Tesla is taking is somewhat akin to a classic Atari game but, admittedly, a game with multiple players.

9:22 p.m. Ashok uses a simpler parking illustration to make his point. It is tedious to design a globally optimized heuristic, and that’s where neural networks come into play.

9:21 p.m. Tesla wants to use learning-based methods to solve city driving in places like India where things are far more chaotic on the road.

9:19 p.m. Autopilot is not timid and that is what makes self-driving possible. The system reduces knee-jerk braking and takes approaches that increase comfort.

9:16 p.m. Lane changes are done using thousands of searches in just seconds. Tesla’s system plans for overall traffic flow and not just for the vehicle involved. Ashok illustrates with a video.
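
For a rough sense of what “thousands of searches” per maneuver means in practice, here is a toy sketch — not Tesla’s planner, with invented cost terms — of sampling and scoring many candidate lane changes:

```python
# Toy sketch of search-based maneuver selection -- illustrative only, not
# Tesla's planner. Candidate lane changes are sampled and scored for
# progress (finish quickly) and comfort (keep lateral acceleration low).
import random

def cost(duration_s: float, peak_lat_accel: float) -> float:
    # Lower is better: penalize slow maneuvers and uncomfortable ones.
    comfort_penalty = 10.0 * max(0.0, peak_lat_accel - 2.0)  # m/s^2 above a comfort limit
    return duration_s + comfort_penalty

candidates = [
    # (how long the lane change takes, peak lateral acceleration it implies)
    (random.uniform(2.0, 8.0), random.uniform(0.5, 4.0))
    for _ in range(2000)  # "thousands of searches" per decision
]

best = min(candidates, key=lambda c: cost(*c))
print(f"chosen maneuver: {best[0]:.1f}s, peak lateral accel {best[1]:.1f} m/s^2")
```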

9:15 p.m. Emphasis on safety and maximizing it, especially in city conditions, says Ashok. The key problems are that the action space is non-convex and high-dimensional — the car needs to plan 10-15 seconds ahead.

9:14 p.m. Andrej hands over to Ashok Elluswamy, Director of Autopilot Software at Tesla.

9:13 p.m. Evolution is in progress and improvements can still be made. Tesla is looking to improve latency and reduce expensive post-processing.

9:10 p.m. Tesla’s camera approach can gauge depth and velocity without the use of radar, Andrej illustrates with a video.

9:08 p.m. A recurrent neural network can keep track of road surfaces, which is a dynamic process. The network can keep track of traversals and, in a way, construct an HD map on the fly.

9:07 p.m. He says Tesla uses both time-based and space-based queues to keep the vehicle aware of road surfaces and markings while it waits.

9:03 p.m. Multi-camera networks struggle less with traffic on the road, especially when there are large vehicles: Andrej.

9:02 p.m. Andrej talks about transforming all of the images into a synthetic virtual camera using a special rectification transform to solve the problem of variations in camera calibration. This really transforms the vector space. The difference is night and day! It took some work, though, says Andrej.
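
Tesla’s exact rectification transform is not public, but the general idea — warping each slightly misaligned physical camera into one canonical “virtual” camera so the network always sees consistent geometry — can be sketched with a per-camera homography (all calibration values below are placeholders):

```python
# Illustrative sketch of rectifying one physical camera into a canonical
# "virtual" camera via a homography (valid for rotational misalignment).
# The intrinsics and rotation below are placeholders, not Tesla's
# calibration; the real pipeline is not public.
import numpy as np
import cv2

def rectify_to_virtual(img, K_cam, R_cam_to_virtual, K_virtual):
    """Warp an image from a slightly 'cock-eyed' camera into the frame of
    an ideal virtual camera shared by the whole fleet."""
    H = K_virtual @ R_cam_to_virtual @ np.linalg.inv(K_cam)
    h, w = img.shape[:2]
    return cv2.warpPerspective(img, H, (w, h))

# Placeholder calibration: a camera rotated ~2 degrees off the nominal pose.
K_cam = np.array([[900.0, 0, 640], [0, 900.0, 360], [0, 0, 1]])
K_virtual = np.array([[880.0, 0, 640], [0, 880.0, 360], [0, 0, 1]])
theta = np.deg2rad(2.0)
R = np.array([[np.cos(theta), 0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0, np.cos(theta)]])

frame = np.zeros((720, 1280, 3), dtype=np.uint8)  # stand-in camera frame
rectified = rectify_to_virtual(frame, K_cam, R, K_virtual)
print(rectified.shape)  # (720, 1280, 3)
```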

9:01 p.m. All of our cars are slightly cock-eyed in different ways: Andrej

9:00 p.m. Every single image piece broadcasts what it is a part of, and that helps the vehicle discover things like curbs on the road even when they are obscured by other vehicles.

8:58 p.m. Tesla wanted to take all the images from every camera and make multi-camera vector-space predictions. The idea was to get a bird’s-eye-view prediction. The problem was solved using a transformer.
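
The talk does not spell out the architecture, but the core idea — a grid of learned bird’s-eye-view queries cross-attending to flattened features from all eight cameras — can be sketched along these lines (dimensions, module choices and names are illustrative assumptions, not Tesla’s network):

```python
# Minimal sketch of multi-camera to bird's-eye-view fusion with
# cross-attention. Dimensions, module choices and names are illustrative
# assumptions, not Tesla's actual network.
import torch
import torch.nn as nn

class BEVCrossAttention(nn.Module):
    def __init__(self, dim=128, n_heads=4, bev_h=32, bev_w=32):
        super().__init__()
        # One learned query per cell of the bird's-eye-view grid.
        self.bev_queries = nn.Parameter(torch.randn(bev_h * bev_w, dim))
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.bev_h, self.bev_w = bev_h, bev_w

    def forward(self, cam_feats):
        # cam_feats: (B, n_cams, C, h, w) image features from all cameras.
        B, n, C, h, w = cam_feats.shape
        kv = cam_feats.permute(0, 1, 3, 4, 2).reshape(B, n * h * w, C)
        q = self.bev_queries.unsqueeze(0).expand(B, -1, -1)
        bev, _ = self.attn(q, kv, kv)  # each BEV cell attends to every camera
        return bev.reshape(B, self.bev_h, self.bev_w, C)

feats = torch.randn(2, 8, 128, 12, 20)        # 8 cameras, toy feature maps
print(BEVCrossAttention()(feats).shape)        # torch.Size([2, 32, 32, 128])
```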

8:56 p.m. When it began, Tesla was doing predictions based on its HydraNet, but for FSD this was not enough. The discovery was made while the company was working on Smart Summon, says Andrej. Things that work at the image level do not really work in vector space, he adds.

8:55 p.m. These HydraNets basically eliminate the need for a separate neural network backbone per task.

8:54 p.m. However, this evolved into multi-task learning “HydraNets” that do more than just detection.
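
The “HydraNet” idea — one shared backbone computed once, with many lightweight task-specific heads reading its features — can be sketched roughly like this (layer sizes and the task list are invented for illustration):

```python
# Rough sketch of a multi-task "HydraNet": one shared backbone computed
# once, many lightweight task heads reading the same features. Layer sizes
# and the task list are invented for illustration.
import torch
import torch.nn as nn

class HydraNetSketch(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        # Shared feature extractor (the "backbone"), run once per image.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # One small head per task, all sharing the backbone's output.
        self.heads = nn.ModuleDict({
            "traffic_lights": nn.Linear(feat_dim, 4),
            "lane_lines": nn.Linear(feat_dim, 8),
            "objects": nn.Linear(feat_dim, 16),
        })

    def forward(self, x):
        shared = self.backbone(x)  # computed once, reused by every head
        return {task: head(shared) for task, head in self.heads.items()}

out = HydraNetSketch()(torch.randn(1, 3, 224, 224))
print({k: v.shape for k, v in out.items()})
```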

8:52 p.m. Processing involves multi-scale feature pyramid fusion, says Andrej. This helps the vehicle decide what it is seeing on the road.

8:52 p.m. Andrej talks about how neural networks have evolved over the 4 years he has been working at the company; earlier cars could only drive in a single lane.

8:50 p.m. Andrej talks about the vision component, which is made up of 8 cameras and provides a 3D representation of the car’s surroundings. Andrej likens making a Tesla car to “effectively building a synthetic animal from the ground up.”

8:49 p.m. Musk gets straight to the point with hiring and invites Andrej Karpathy, director of Autopilot Vision and AI to the stage to kick off the session. 

8:48 p.m. Tesla is much more than a car company. Musk says it is arguably the leader in AI.

8:46 p.m. Thank you for the music…but let’s start already. Cannot wait! OK, Musk heard me quick, he is on stage now.

8:38 p.m. We have begun… perhaps… Looking now at a video of a Tesla vehicle’s interior with “Full Self-Driving” emblazoned on the screen.

8:31 p.m. We are listening to some rather upbeat music, but there’s no sign of Musk & Co. 

Photo: Courtesy of Tesla
