OpenAI's chief scientist says AI is getting close to being as good as a human research intern

OpenAI's chief scientist, Jakub Pachocki, says AI is advancing toward systems that could work like research interns. AP Photo/Michael Dwyer, File

An OpenAI exec said the company is getting closer to building AI systems that can work as well as research interns.
Chief Scientist Jakub Pachocki said he "definitely" sees the company "on track" to hit that goal.
OpenAI is aiming for an "AI research intern" by 2026 and a fully autonomous AI researcher by 2028.

OpenAI is getting closer to one of its milestone goals: systems that can function at the same level as research interns.

On an episode of the "Unsupervised Learning" podcast on Thursday, OpenAI's chief scientist, Jakub Pachocki, said recent breakthroughs in coding, advances in math research capabilities, and progress in physics suggest AI is on track to handle increasingly complex, multi-step technical work with less human oversight.

"I definitely see this as a signal that something here is on track," he said.

He said the key measure is how long a model can work mostly autonomously.

"The way I would distinguish a research intern from a full automated researcher is the span of time that we would have it work mostly autonomously," Pachocki said.

At a company livestream in October, Pachocki outlined OpenAI's internal goal of building an "AI research intern" by September 2026, followed by a fully autonomous AI researcher by March 2028.

In an X post following the livestream, OpenAI CEO Sam Altman said OpenAI "may totally fail" at the goal, but that it was important to be transparent given the potential impact.

'Explosive growth of coding tools'

Pachocki said the company is already seeing fast progress in the kinds of tasks that matter for that goal, pointing to coding agents like Codex, which he said are now handling much of the company's programming work.

He also pointed to math benchmarks as a "north star" for improving model reasoning, since they are easy to verify.

"We've seen this explosive growth of coding tools," Pachocki said. "For most people, the act of programming has changed quite a bit."

He added that the near-term challenge is moving toward systems that can tackle specific technical tasks with more autonomy, use more compute, and work for longer stretches of time.

"For more specific technical ideas, like I have this particular idea how to improve the models, how to run this evaluation differently, I think we have the pieces that we mostly just need to put together," he said.

Still, Pachocki was clear that AI isn't ready to operate independently at the level of a full researcher.

"I don't expect we'll have systems where you just tell them, 'go improve your model capability, go solve alignment,' and they will do it, not this year," he said.
