Breaking Down Barriers to AI Innovation with Reid Hoffman & Kevin Scott

We could soon see generative AI systems capable of passing Ph.D. exams thanks to more “durable” memory and more robust reasoning operations, Microsoft CTO Kevin Scott revealed when he took to the stage with Reid Hoffman during a Berggruen Salon in Los Angeles earlier this week.

“It’s sort of weird right now that you have these interactions with agents and the memory is entirely episodic,” he lamented. “You have a transaction, you do a thing. It’s useful or not for whatever task you were doing, and then it forgets all about it.” The AI system isn’t learning from or even remembering previous interactions with the user, he continued. “There’s no way for you to refer back to a thing you were trying to get [the AI] to solve in the past.”

However, Scott is optimistic that “we’re seeing technically all of the things fall in place to have really durable memories with the systems.” With more persistent memory, future AI systems will be able to respond more naturally and more accurately over the span of multiple conversations rather than being limited to the current session.

OpenAI announced in February that it was beginning to test a new persistent memory system, rolling it out to select free and Plus subscription users. Enabling the feature allows the AI to recall user tone, voice, and format preferences between conversations as well as make suggestions in new projects based on details the user mentioned in previous chats.

Scott was also optimistic about reducing the “fragility” found in the reasoning of many AI systems today. “It can’t solve very complicated math problems,” he explained. “It has to bail out to other systems to do very complicated things.”

“Reasoning, I think, gets a lot better,” he continued. He compared GPT-4 and the current generation of models to high schoolers passing their AP exams. The next generation of AIs, however, “could be the thing that could pass your qualified exam.”

To date, generative AI systems have outperformed their flesh-and-blood counterparts on a variety of exam and task formats. Last November, for example, GPT-4 passed the Multistate Professional Responsibility Exam (MPRE), the legal ethics exam required for bar admission in most U.S. jurisdictions, with 76% correct. That’s six points higher than the national average for humans.

Scott was quick to point out, however, that training generative AIs to pass Ph.D. exams “probably sounds like a bigger deal than it actually is… the real test will be what we choose to do with it.”

Scott was especially excited to see the barriers to entry falling away so quickly. He noted that when he got into machine learning two decades ago, his work required graduate-level knowledge, stacks upon stacks of “very daunting, complicated, technical papers to figure out how to do what I wanted to do,” and around six months of coding. That same task today, he said, “a high school student could do in a Saturday morning.”

These lowered barriers to entry will likely accelerate the democratization of AI, Scott concluded. Finding solutions to the myriad social, environmental, and technological crises facing humanity is not, and cannot be, the sole responsibility of “just the people at tech companies in Silicon Valley or just people who graduated with Ph.D.s from top-five universities,” he said. “We have 8 billion people in the world who also have some idea about what it is that they want to do with powerful tools, if they just have access to them.”
