Nine years on, Jensen Huang has once again hand-delivered a product to Elon Musk. The personal AI supercomputer, delayed for more than half a year, has finally arrived.
Today, the 11th flight of Starship successfully concluded. Unexpectedly, Jensen Huang also made a surprise appearance at the scene.
It turns out that Jensen Huang flew all the way to Starbase in Texas. Standing beside the towering Starship, he came to hand Elon Musk a brand-new "nuclear bomb".
It is the machine everyone has been waiting for since the beginning of the year: the NVIDIA DGX Spark personal AI supercomputer.
The scene instantly stirred the memories of veteran tech fans, taking them back to 2016.
At that time, Elon Musk was still a co-founder of OpenAI and hadn't fallen out with Sam Altman. Jensen Huang personally delivered the world's first DGX-1 supercomputer to the startup's office.
Jensen Huang joked at the time:
If this were the only product shipped, the project would cost as much as $2 billion.
That "big guy worth $2 billion" later kicked off the entire era of large models.
The following year, Google introduced a new neural network architecture called the Transformer.
Ilya Sutskever seized on this breakthrough, leading OpenAI to build the first GPT model, all on NVIDIA's supercomputer.
Nine years have passed. Elon Musk has become a regular on the list of the world's richest people, and Jensen Huang is in charge of a company that once had the highest market value globally.
The DGX delivered by NVIDIA this time is no longer a behemoth but a "performance monster" that can be placed on your desk. Once again, it is announcing in the coolest way that an AI supercomputing era for everyone has begun here.
Spoiler alert: APPSO's DGX Spark is also on the way. We'll bring you more detailed experiences later. Stay tuned.
To be honest, it was really not easy for this DGX Spark to be successfully delivered to Elon Musk.
After a stunning debut under the name "Project Digits" at CES in January this year, the product missed its originally scheduled release windows in May and over the summer, and remained unshipped. After waiting for most of the year, many people grew worried, and some developers even suspected it might never ship at all.
Although NVIDIA has remained tight-lipped, industry speculation points to the machine's core: the Grace Blackwell GB10 chip. This chip is like a "Combiner". The Blackwell GPU part (the same architecture as the familiar GeForce RTX 5090 graphics card) was ready long ago, but the Grace CPU part, jointly developed with MediaTek, lagged behind in production and held back the entire project.
Few expected an "everything is ready except the CPU" story to happen to NVIDIA, of all companies.
So, with competitors like the M3 Ultra Mac Studio attracting attention for their high memory bandwidth, is this long-awaited DGX Spark, now $1,000 more expensive than the initially quoted price, still worth waiting for?
The answer is: Absolutely! Because its approach is unique and hits the pain points directly.
After waiting for most of the year, what exactly makes the DGX Spark so "appealing"? Let APPSO take you through it.
The soul of the entire machine is the Grace Blackwell GB10 superchip.
It integrates a 20-core Arm-architecture Grace CPU and a powerful Blackwell GPU into a single superchip.
It delivers up to 1 petaflop of AI computing performance, bringing data-center-class power to your desktop.
The DGX Spark also has another killer feature: the CPU and GPU are connected through NVIDIA NVLink™-C2C technology and share a large 128 GB unified memory pool.
The bandwidth of this interconnect is five times that of conventional fifth-generation PCIe, allowing high-speed, low-latency data transfer between the CPU and GPU.
Although its memory bandwidth (273 GB/s) is far lower than that of the Mac Studio M3 Ultra (819 GB/s) on paper, NVIDIA's approach is to "achieve miracles with brute force".
In AI tasks, especially when running large models, the capacity to fit an entire model into memory at once has far greater strategic value than the bandwidth figure alone. This means you can run an ultra-large language model with up to 200 billion parameters directly on your desk without complex model partitioning. There is no other comparable experience.
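The capacity argument can be checked with back-of-the-envelope arithmetic. The sketch below is a rough illustration, not NVIDIA's sizing method: the function names are hypothetical, it counts weights only (no KV cache, activations, or OS overhead), and it assumes the 200-billion-parameter figure refers to 4-bit weights, which the article does not state explicitly. The last function applies a common rule of thumb for dense models, where each generated token reads every weight once, to show what the bandwidth gap costs.

```python
def weight_footprint_gb(params_billion, bits_per_param):
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def fits(params_billion, bits_per_param, memory_gb=128):
    """True if the weights alone fit in the given memory pool."""
    return weight_footprint_gb(params_billion, bits_per_param) <= memory_gb


def decode_ceiling_tokens_per_s(params_billion, bits_per_param, bandwidth_gb_s):
    """Rough upper bound on tokens/s for a dense model, assuming every
    weight is read from memory once per generated token."""
    return bandwidth_gb_s / weight_footprint_gb(params_billion, bits_per_param)


for bits in (16, 8, 4):
    gb = weight_footprint_gb(200, bits)
    print(f"{bits:>2}-bit weights: {gb:.0f} GB, fits in 128 GB: {fits(200, bits)}")

# Bandwidth figures quoted in the article: 273 GB/s vs 819 GB/s.
for bandwidth in (273, 819):
    ceiling = decode_ceiling_tokens_per_s(200, 4, bandwidth)
    print(f"{bandwidth} GB/s: at most about {ceiling:.2f} tokens/s")
```

Under these assumptions, a 200-billion-parameter model only fits in 128 GB at 4 bits per weight, and the lower bandwidth caps generation speed at a few tokens per second. Capacity decides whether the model runs at all; bandwidth decides how fast.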
The Blackwell GPU is equipped with fifth-generation Tensor Cores and supports ultra-low-precision formats such as FP4 and FP8, for a fivefold performance improvement over the previous generation's FP8.
This is like switching on a "turbo boost" mode for AI computing: inference speed soars while energy efficiency stays remarkably high.
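To give a feel for what a 4-bit float can represent, here is a minimal sketch. It assumes the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit), which the article does not specify, and uses a simple per-block scale as an illustrative simplification. The function names are hypothetical, and this is not a description of how Blackwell's Tensor Cores handle FP4 internally.

```python
# Magnitudes representable by a 4-bit E2M1 float.
# Assumption: this is the FP4 layout in question; the article does not say.
FP4_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_fp4(values):
    """Scale a block so its largest magnitude maps to 6.0, then round each
    value to the nearest representable FP4 number.
    Returns (scale, quantized_values)."""
    peak = max(abs(v) for v in values)
    scale = peak / FP4_MAGNITUDES[-1] if peak else 1.0
    quantized = []
    for v in values:
        target = abs(v) / scale
        nearest = min(FP4_MAGNITUDES, key=lambda m: abs(m - target))
        quantized.append(nearest if v >= 0 else -nearest)
    return scale, quantized


def dequantize_fp4(scale, quantized):
    """Recover approximate original values from FP4 codes and the block scale."""
    return [q * scale for q in quantized]


weights = [1.6, -12.0, 6.0, 0.8]
scale, codes = quantize_fp4(weights)
print(scale, codes)                  # 2.0 [1.0, -6.0, 3.0, 0.5]
print(dequantize_fp4(scale, codes))  # [2.0, -12.0, 6.0, 1.0]
```

Each value is stored in 4 bits instead of 16, at the cost of rounding error (1.6 comes back as 2.0 here). That trade of precision for memory and throughput is what makes low-precision formats attractive for inference.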
Need more power? The DGX Spark also has a built-in NVIDIA ConnectX®-7 200 Gb/s network interface. You can connect two units to form a mini-cluster with 256 GB of combined memory.
NVIDIA says such a combination can handle a giant model with up to 400 billion parameters, a scale previously beyond the reach of individual developers.
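The two-unit claim follows the same arithmetic. The sketch below is a rough estimate with hypothetical function names. It counts weights only and assumes 4-bit weights for the 400-billion-parameter figure, which the article does not state.

```python
import math


def nodes_needed(params_billion, bits_per_param, memory_per_node_gb=128):
    """Smallest number of units whose combined memory holds the weights alone."""
    weights_gb = params_billion * bits_per_param / 8
    return math.ceil(weights_gb / memory_per_node_gb)


def link_gb_per_s(link_gbit_per_s=200):
    """Convert a network link speed from gigabits to gigabytes per second."""
    return link_gbit_per_s / 8


print(nodes_needed(200, 4))  # 1 unit for 200B parameters at 4 bits
print(nodes_needed(400, 4))  # 2 units for 400B parameters at 4 bits
print(link_gb_per_s())       # 25.0 GB/s across the 200 Gb/s link
```

Note that the 25 GB/s link between units is much slower than the 273 GB/s local memory quoted earlier, so how the model is split between the two machines matters for real-world speed.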
Beyond the hardware, don't forget NVIDIA's strongest moat: the software ecosystem. The DGX Spark comes pre-installed with a complete NVIDIA AI software stack, including the CUDA libraries, TensorRT, and various NVIDIA NIM™ microservices. Everything is integrated and tuned for you on the customized DGX OS (based on Ubuntu).
This means you don't have to spend time on compatibility issues and can start using it right out of the box. For developers, the time saved is immeasurable.
This "handover of the century" at the Starbase is just the beginning.
Starting October 15th, the DGX Spark will officially go on sale on NVIDIA's website and through global partners at a price of $3,999. Almost all the PC giants, including Acer, Asus, Dell, and Lenovo, have immediately followed suit.
To be honest, this price is $1,000 more than the initially promised $3,000 and is similar to that of the top-spec Mac Studio M3 Ultra. However, their positioning is completely different: the DGX Spark comes pre-installed with DGX OS (based on Ubuntu) and cannot run Windows or macOS. It is a pure "battle machine" for AI developers and hardcore enthusiasts.
Its appeal is that for $4,000 you get the ability to tame a 200-billion-parameter large model locally, along with the full support of the entire CUDA ecosystem. For professionals who need to process sensitive data locally, pursue extreme performance, or want full control over their AI workflow, this price is actually quite competitive.
If you want to know more about this "lopsided student", with its sharply defined strengths and weaknesses, feel free to tell us in the comment section.
This article is from the WeChat official account "APPSO". Author: Discovering Tomorrow's Products. Republished by 36Kr with permission.