“The age of A.I. has begun,” Nvidia CEO Jensen Huang declared at this year’s GPU Technology Conference (GTC). At the event, Nvidia showcased its innovations to further A.I., noting how the technology could help solve the world’s problems 10 times better and faster.
While Nvidia is best known for its graphics cards, and more recently for real-time ray tracing, the company is also driving the behind-the-scenes innovation that brings artificial intelligence into our daily lives, from warehouse robots that pack our shipping orders to self-driving cars and natural language bots that deliver news, search, and information with little latency.
“We love working on extremely hard computing problems that have great impact on the world,” Huang said, noting that the company now has 110 SDKs that target the more than 1 billion CUDA-compatible GPUs that have been shipped. The company says more than 6,500 startups are building applications on Nvidia, joining the 2 million total Nvidia developers. “This is right in our wheelhouse. We’re all in to advance and democratize this new form of computing for the age of A.I. Nvidia is dedicated to advancing accelerated computing.”
Another apology for the RTX 3080/3090 launch
Huang led with another quick apology for the difficult launch of the Nvidia RTX 3080 and 3090 graphics cards.
Nvidia Omniverse is a training ground for robots
For gamers, ray tracing renders vivid details in video game scenes by simulating how light behaves. Nvidia is using the same principles to build Nvidia Omniverse, which the company claimed is “a place where robots can learn how to be robots, just like they would in the real world.”
Available now in open beta, Nvidia Omniverse is an open platform for collaboration and simulation where robots can learn from realistic simulations of the real world. Using Omniverse, autonomous vehicles can quickly learn to drive and interact with scenarios that real human drivers may encounter, without the risk of endangering bystanders if the experiment goes sideways. Omniverse also allows for testing on a much wider scale, since an autonomous vehicle or robot doesn’t have to be physically deployed to test it.
To show how Nvidia Omniverse can affect us all, Nvidia highlighted how Omniverse can work in drug discovery, an area of research that is even more vital given the global pandemic. Though developing a drug typically takes more than a decade and requires more than a half-billion dollars in research and development funding, 90% of those efforts fail, Huang said. To make matters worse, the cost of discovering new drugs doubles every nine years.
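To put the doubling claim in perspective, here is a minimal Python sketch of what "doubling every nine years" implies if the growth is assumed to be smooth and exponential. The function name and the use of the half-billion-dollar figure as a starting value are illustrative assumptions, not numbers Nvidia presented as a projection.

```python
def projected_cost(base_cost: float, years: float, doubling_period: float = 9.0) -> float:
    """Cost after `years`, assuming it doubles every `doubling_period` years."""
    return base_cost * 2 ** (years / doubling_period)


def implied_annual_growth(doubling_period: float = 9.0) -> float:
    """Yearly growth rate that produces one doubling per `doubling_period` years."""
    return 2 ** (1 / doubling_period) - 1


# Assumption for illustration only: start from the half-billion-dollar figure.
base = 0.5e9
for years in (0, 9, 18, 27):
    print(f"after {years:2d} years: ${projected_cost(base, years) / 1e9:.1f} billion")

print(f"implied annual growth: {implied_annual_growth():.1%}")
```

Under these assumptions, a nine-year doubling period works out to roughly 8% growth per year.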
Nvidia’s Omniverse can help scientists identify the proteins that can cause disease, as well as speed up testing of potential medications by using A.I. and data analytics. All this is applied to Nvidia’s new Clara Discovery platform. And in the U.K., Nvidia introduced its new Cambridge-1 data center, which the company says is the fastest in the region and one of the top 30 in the world, with 400 petaflops of A.I. performance.
The company also introduced its new DGX SuperPOD architecture, which allows other researchers to build their own scalable supercomputers that link between 20 and 140 DGX systems.
Nvidia RTX A6000: Ray tracing for professionals
Expanding on the recently announced GeForce RTX 3070, RTX 3080, and RTX 3090 graphics cards, Nvidia announced a new generation of Ampere-based GPUs for professionals. The new graphics cards aren’t branded under Nvidia’s Quadro umbrella, but the RTX A6000 and Nvidia A40 GPUs are targeted at the same creative and data scientist audiences who purchase the Quadro GPUs.
“The GPUs provide the speed and performance to enable engineers to develop innovative products, designers to create state-of-the-art buildings, and scientists to discover breakthroughs from anywhere in the world,” the company stated in a blog post, noting that the new A6000 and A40 feature new RT cores, Tensor cores, and CUDA cores that are “significantly faster than the previous generations.”
The company did not provide specific details about the hardware. However, Nvidia claimed that the second-generation RT cores deliver 2x the throughput of the prior generation while also providing concurrent ray tracing, shading, and compute capabilities, and that the third-generation Tensor cores provide up to 5x the throughput of the previous generation.
The cards ship with 48GB of GPU memory that is expandable to 96GB with NVLink when two GPUs are connected. This compares to just 24GB of memory on the RTX 3090. Whereas the RTX 3090 is marketed as a GPU that is capable of rendering games in 8K at 60 frames per second (fps), the expanded memory on the professional RTX A6000 and A40 helps process Blackmagic RAW 8K and 12K footage for video editing. Like the consumer Ampere cards, the A6000 and A40 GPUs are based on PCIe Gen 4, which delivers twice the bandwidth of the prior generation.
The A40 will be available in servers from Cisco, Dell, Fujitsu, Hewlett Packard Enterprise, and Lenovo. The A6000 GPUs will be coming to channel partners, and both GPUs will be available early next year. Pricing details were not immediately available, and it’s unclear if the professional cards will see the same limited supply and major shortages that Nvidia experienced with the launch of its consumer cards.
The rise of the A.I. bots
Nvidia also highlighted how its work on GPUs is helping to speed up A.I. development and adoption. Facebook’s A.I. researchers have developed a chatbot with knowledge and empathy that half of the social network’s users actually preferred. California Institute of Technology researchers trained a drone using reinforcement learning to control the flight system to fly smoothly through turbulence and changes in terrain.
Nvidia’s A.I. is based on three pillars: single- to multi-GPU nodes on any framework or model, the use of inference, and the application of pretrained models, Huang said.
Nvidia also announced that it has partnered with Microsoft to bring Nvidia A.I. to Azure to help make Office smarter.
“Today, we’re announcing that Microsoft is adopting Nvidia A.I. on Azure, to power smart experiences in Microsoft Office,” Huang said during the keynote. “The world’s most popular productivity application used by hundreds of millions will now be A.I.-assisted. The first features will include smart grammar correction, Q&A, text prediction. Because of the volume of users and the instant response needed for a good experience, Office will be connected to Nvidia GPUs on Azure. With Nvidia GPUs, responses take less than 200 milliseconds. Our throughput lets Microsoft scale to millions of simultaneous users.”
American Express is also using A.I. to combat fraud, while Twitter is leveraging artificial intelligence to help it understand and contextualize the vast number of videos uploaded to the platform.
With conversational A.I., voice queries processed on Nvidia’s GPU platform return results with half the latency of CPU-processed queries, and text-to-speech engines sound more realistic and human-like. Nvidia also announced an open beta of Jarvis for developers to try A.I. with conversational skills.
A.I. for the work-from-home future
A.I. can also be built into applications like videoconferencing and chat tools that help workers collaborate remotely. With Nvidia Maxine, Huang said that A.I. can do magic for video calls.
Maxine can identify the important features of a face, send only the changes in those features over the internet, and then reanimate the face at the receiver. This saves bandwidth, making for a better video experience in areas with poor internet connectivity. Huang claimed that bandwidth is reduced by a factor of 10.
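The idea of sending only what changed can be illustrated with a short Python sketch. This is a toy example under stated assumptions, not Nvidia's implementation: the keypoint names, coordinates, and the helper functions `keypoint_delta` and `apply_delta` are all hypothetical, and the set of tracked keypoints is assumed to stay fixed between frames.

```python
from typing import Dict, Tuple

Point = Tuple[int, int]


def keypoint_delta(prev: Dict[str, Point], curr: Dict[str, Point]) -> Dict[str, Point]:
    """Sender side: keep only the keypoints that moved since the previous frame."""
    return {name: pos for name, pos in curr.items() if prev.get(name) != pos}


def apply_delta(prev: Dict[str, Point], delta: Dict[str, Point]) -> Dict[str, Point]:
    """Receiver side: rebuild the current keypoints from the last state plus the delta."""
    updated = dict(prev)
    updated.update(delta)
    return updated


# Hypothetical facial keypoints for two consecutive frames.
prev = {"left_eye": (120, 80), "right_eye": (180, 80), "nose": (150, 120), "mouth": (150, 160)}
curr = {"left_eye": (120, 80), "right_eye": (180, 80), "nose": (151, 121), "mouth": (150, 164)}

delta = keypoint_delta(prev, curr)      # only the nose and mouth moved
restored = apply_delta(prev, delta)     # receiver reconstructs the full set

print(f"sent {len(delta)} of {len(curr)} keypoints")
print("receiver state matches sender:", restored == curr)
```

A real system would also need a model on the receiving end to turn those keypoints back into a face; the sketch only shows why transmitting changes can take far less data than transmitting every frame in full.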
A.I. can make calls better even in areas with high bandwidth, however. In the most extreme example, A.I. can be used to reorient your face so that you’re making eye contact with every person on a call, even when your face is tilted slightly away from the camera. A.I. can also reduce background noise, relight your face, replace the background, and improve video quality in poor lighting. Combined with Jarvis A.I. speech, Maxine can also deliver closed caption text.
“We have an opportunity to revolutionize videoconferencing of today, and invent the virtual presence of tomorrow,” Huang said. “And video A.I. inference applications are coming from every industry.”
Bringing a data center to an ARM chip
Highlighting its investment in ARM chips, Nvidia announced the new BlueField DPUs, which put data center infrastructure on a chip and are supported by DOCA, Nvidia’s data-center-infrastructure-on-a-chip architecture.
The new BlueField-2 DPUs offload critical networking, storage, and security tasks from the CPU to help prevent cyberattacks.
“A single BlueField-2 DPU can deliver the same data center services that could consume up to 125 CPU cores,” Nvidia claimed in a prepared statement. “This frees up valuable CPU cores to run a wide range of other enterprise applications.” The company said that running data center infrastructure previously consumed at least 30% of CPU capacity, which is freed up now that the work is offloaded to the BlueField DPU.
A second DPU, the BlueField-2X, also comes with Nvidia’s Ampere-based GPU technology. Ampere brings A.I. to the BlueField-2X to provide real-time security analytics and identify malicious activity.
Personalized recommendation engines
A.I. can be used to deliver personalized recommendations of digital and physical goods on platforms, serving relevant digital ads, news, and movies. Nvidia claimed that even a 1% improvement in recommendation accuracy can amount to billions more in sales and higher customer retention.
To help companies improve their recommendation engines, Nvidia introduced Merlin, which is powered by the Nvidia Rapids platform. Whereas CPU-based solutions can take days to learn, Merlin is said to be superfast and super-scalable, with cycle times going from a day to just three hours. Merlin is now in open beta, Huang said.
Rapids is used by Adobe for intelligent marketing, while Capital One is using the platform for fraud analytics and to power the company’s Eno chatbot.
A.I. for all the IoT
Nvidia’s EGX platform brings A.I. to edge devices to make A.I. more responsive for Internet of Things, or IoT, applications. EGX is available on Nvidia’s NGC, and hospitals like Northwestern Memorial Hospital use it to offload to computers some tasks that are routinely performed by nurses. Patients, for example, can use natural language queries to ask a bot what procedure they are having.
“The EGX A.I. computer integrates a Mellanox BlueField-2 DPU and an Ampere GPU into a single PCI Express card, turning any standard OEM server into a secure accelerated A.I. data center,” Huang said.
The platform can be leveraged in health care, manufacturing, logistics, delivery, retail, and transportation.
Advancing ARM
“Today, we’re announcing a major initiative to advance the ARM platform,” Huang said of the company’s announced acquisition of ARM, noting that it is making investments in three dimensions.
“First, we’re complementing ARM partners with GPU, networking, storage, and security technologies to create complete accelerated platforms. Second, we’re working with our partners to create platforms for HPC, cloud, edge, and PC. This requires chips, systems, and system software. And third, we are porting the Nvidia A.I. and Nvidia RTX engines to ARM.”
Currently, these engines are available only on the x86 platform. However, Nvidia’s investment in ARM will make the platform leading-edge in accelerated and A.I. computing, Huang said, as he looks to position ARM as a competitor to Intel in the server space.