
Unveiling the Secrets of Dojo

Noah Johnson joined discussion · Sep 12, 2023 04:09
1. What is Dojo?
Dojo is a purpose-built supercomputer, designed in-house by Tesla, for training the Full Self-Driving (FSD) system installed in every Tesla vehicle.
Dojo can replace Tesla's current A100-based data center, which has 14,000 A100 chips and is the world's seventh-largest data center.
Elon Musk first mentioned Dojo at Tesla's Autonomy Day in 2019. Dojo aims to solve the problem of training on massive amounts of video data. Its data center building block, called an ExaPOD, has 3,000 D1 chips and a single-precision computing power of 1.1 EFlops.
Tesla is expected to ship 40,000 to 50,000 D1 chips in fiscal year 2023. The first ExaPOD went into operation in July 2023, and six more ExaPODs are expected to be deployed to the Palo Alto data center shortly (7.7 EFlops in total, counting the first). The target for Q4 2024 is 100 EFlops of Dojo computing power, which is equivalent to approximately 91 ExaPODs (100 / 1.1 ≈ 91).
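The ExaPOD arithmetic above can be checked with a short sketch. It uses only the figures quoted in this post (1.1 EFlops and 3,000 D1 chips per ExaPOD); the function names are made up for illustration, and the implied chip count is a derived number, not something Tesla has stated.

```python
import math

# Figures quoted in the post; treat them as the post's claims, not verified specs.
EXAPOD_EFLOPS = 1.1      # computing power per ExaPOD
D1_PER_EXAPOD = 3_000    # D1 chips per ExaPOD


def exapods_needed(target_eflops: float) -> int:
    """Smallest whole number of ExaPODs that reaches the target."""
    return math.ceil(target_eflops / EXAPOD_EFLOPS)


def total_eflops(exapods: int) -> float:
    """Combined computing power of a given number of ExaPODs."""
    return exapods * EXAPOD_EFLOPS


pods = exapods_needed(100)
print(pods)                        # ExaPODs needed for the 100 EFlops target
print(pods * D1_PER_EXAPOD)        # implied number of D1 chips
print(round(total_eflops(7), 1))   # seven ExaPODs, matching the 7.7 EFlops figure
```

On these numbers, the 100 EFlops target implies about 273,000 D1 chips, several times the 40,000 to 50,000 chips expected for fiscal 2023.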
2. Comparison between Dojo and existing computing clusters
D1 has been specially designed to support the visual neural network for FSD, and Tesla has developed a full-stack software package inclusive of low-level software and compilers. As a result:
A. Dojo's training efficiency is higher than that of the DGX A100, and a tile composed of 25 D1 chips provides an inference speed 30 times faster than 24 A100s. FSD training time can be reduced from one month to within a week.
B. Compared with Nvidia's A100, Dojo offers four times the performance at the same cost, 1.3 times better energy efficiency, a footprint one-fifth the size, 4x faster network training, and 3.2x faster automatic annotation.
C. The cost of Dojo is only about 1/6 of that of the A100. Reaching 100 EFlops would require approximately 300,000 A100 chips, costing around $7.5 billion. By contrast, deploying 91 ExaPODs would cost only around $1.25 billion, a saving of roughly $6.25 billion.
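As a rough check on the cost comparison in point C, the sketch below reworks the quoted totals. All inputs come from this post; the per-chip and per-ExaPOD prices are implied by division and are assumptions, not published prices.

```python
# Totals quoted in the post for reaching 100 EFlops.
A100_COUNT = 300_000
A100_TOTAL_USD = 7.5e9
DOJO_TOTAL_USD = 1.25e9
EXAPOD_COUNT = 91

# Derived figures (implied by the post's numbers, not official pricing).
implied_a100_price = A100_TOTAL_USD / A100_COUNT      # USD per A100
cost_ratio = A100_TOTAL_USD / DOJO_TOTAL_USD          # A100 cost as a multiple of Dojo cost
savings = A100_TOTAL_USD - DOJO_TOTAL_USD             # USD difference between the two totals
implied_exapod_price = DOJO_TOTAL_USD / EXAPOD_COUNT  # USD per ExaPOD

print(f"Implied A100 price:   ${implied_a100_price:,.0f}")
print(f"Cost ratio:           {cost_ratio:.1f}x")
print(f"Difference:           ${savings / 1e9:.2f} billion")
print(f"Implied ExaPOD price: ${implied_exapod_price / 1e6:.1f} million")
```

The 6x ratio matches the "1/6 of the cost" claim, and the difference between the two quoted totals comes to $6.25 billion.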
3. Why did Tesla develop Dojo in-house?
Tesla's autonomous driving technology relies mainly on the "camera + sensor" visual technology route, which requires large amounts of data to train the FSD system. The more data needed, the more computing power is required. If fully autonomous driving is to be achieved, even more computing power will be needed to support massive data training. However, Tesla has encountered difficulties with computing power:
A. Unable to purchase. After the AI boom, all tech companies are scrambling to buy Nvidia GPUs. Due to capacity limitations at TSMC, Nvidia's GPU supply is insufficient, making it difficult to meet Tesla's computing power needs.
B. Unable to afford. Due to high demand and supply constraints, the prices of Nvidia GPUs have skyrocketed, increasing procurement costs.
C. Not cost-effective. Compared to Nvidia's general-purpose chips, using a dedicated chip like D1 for FSD video data training would be more efficient and faster. Tesla's current needs are clear, so why spend money on unused functionality?
4. What impact does deploying Dojo have?
A. Cost reduction and increased efficiency, but there is currently no accurate data to verify this.
B. It should accelerate the development of autonomous driving technology, greatly shortening Tesla's timeline to full autonomy and significantly improving the paid adoption rate and ARPU of its software business (such as FSD subscriptions and licensing) and its ride-hailing business (such as Robotaxi).
C. Musk's other companies will benefit from Dojo (such as X, SpaceX, etc.).
D. In the future, Tesla may launch cloud rental services similar to Amazon's, opening Dojo to non-Tesla customers for a fee.
E. Other car manufacturers that license FSD will naturally need to deploy Dojo to improve training effectiveness. If Dojo's model training efficiency is later verified to be superior to that of other computing systems, including Nvidia's, Tesla may become the best provider of machine vision training systems on the market.
F. Other scenarios involving complex visual perception tasks can also use Dojo (such as robotics, aviation, security, etc.).
Disclaimer: Community is offered by Moomoo Technologies Inc. and is for educational purposes only.