
Nvidia just released an open-source LLM to rival GPT-4

Nvidia CEO Jensen Huang.
Nvidia

Nvidia, which builds some of the most sought-after GPUs in the AI industry, has released an open-source large language model that reportedly performs on par with leading models from OpenAI, Anthropic, Meta, and Google.

The company introduced its new NVLM 1.0 family in a recently released white paper. The family is led by the 72-billion-parameter NVLM-D-72B model. “We introduce NVLM 1.0, a family of frontier-class multimodal large language models that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models,” the researchers wrote.

The new model family is reportedly already capable of “production-grade multimodality,” with strong performance across a variety of vision and language tasks, as well as better text-based responses than the base LLM on which the NVLM family is built. “To achieve this, we craft and integrate a high-quality text-only dataset into multimodal training, alongside a substantial amount of multimodal math and reasoning data, leading to enhanced math and coding capabilities across modalities,” the researchers explained.

The result is an LLM that can explain why a meme is funny as readily as it can solve complex math problems step by step. Nvidia also says its multimodal training raised the model’s text-only accuracy by an average of 4.3 points across common industry benchmarks.

Screenshot from the NVLM white paper showing the model explaining why a meme is funny.
Nvidia

Nvidia appears serious about ensuring that this model meets the Open Source Initiative’s newest definition of “open source.” It has not only made the model weights publicly available, but has also promised to release the model’s source code in the near future. This is a marked departure from rivals like OpenAI and Google, which closely guard their LLMs’ weights and source code. In doing so, Nvidia has positioned the NVLM family not necessarily to compete directly against GPT-4o and Gemini 1.5 Pro, but rather to serve as a foundation for third-party developers to build their own chatbots and AI applications.

Andrew Tarantola
Andrew has spent more than a decade reporting on emerging technologies ranging from robotics and machine learning to space…
ChatGPT Advanced Voice mode: release date, compatibility, and more
Nothing Phone 2a and ChatGPT voice mode.

Advanced Voice Mode is a new feature for ChatGPT that lets users hold real-time, humanlike conversations with the AI chatbot without the need for a text-based prompt window or back-and-forth audio. It was released in late July to select Plus subscribers after first being demoed at OpenAI's Spring Update event.

According to the company, the feature “offers more natural, real-time conversations, allows you to interrupt at any time, and senses and responds to your emotions.” It can even take breath breaks and simulate human laughter during conversation. The best part is that access is coming soon, if you don't have it already.

An accurate ChatGPT watermarking tool may exist, but OpenAI won’t release it
chatGPT on a phone on an encyclopedia

ChatGPT plagiarists beware: OpenAI has developed a tool that can detect GPT-4's writing output with reportedly 99.99% accuracy. However, the company has spent more than a year waffling over whether to release it to the public.

The company is reportedly taking a “deliberate approach” due to “the complexities involved and its likely impact on the broader ecosystem beyond OpenAI,” per TechCrunch. "The text watermarking method we’re developing is technically promising, but has important risks we’re weighing while we research alternatives, including susceptibility to circumvention by bad actors and the potential to disproportionately impact groups like non-English speakers,” an OpenAI spokesperson said.

GPT-4: everything you need to know about ChatGPT’s standard AI model
A laptop opened to the ChatGPT website.

People were in awe when ChatGPT came out, impressed by its natural language abilities as an AI chatbot originally powered by the GPT-3.5 large language model. But when the highly anticipated GPT-4 large language model came out, it blew the lid off what we thought was possible with AI, with some calling it the early glimpses of AGI (artificial general intelligence).
What is GPT-4?
GPT-4 is the newest language model created by OpenAI that can generate text similar to human speech. It advances the technology used by ChatGPT, which was previously based on GPT-3.5 but has since been updated. GPT stands for Generative Pre-trained Transformer, a deep learning technology that uses artificial neural networks to write like a human.

According to OpenAI, this next-generation language model is more advanced than ChatGPT in three key areas: creativity, visual input, and longer context. In terms of creativity, OpenAI says GPT-4 is much better at both creating and collaborating with users on creative projects. Examples of these include music, screenplays, technical writing, and even "learning a user's writing style."
