OpenAI’s Superalignment Team Focuses on AI Governance Amid Leadership Shake-Up

Amid the fallout from Sam Altman’s abrupt departure from OpenAI, the company’s Superalignment team remains steadfast in its mission to tackle the challenge of controlling AI systems that surpass human intelligence. While the leadership turmoil unfolds, the team, led by Ilya Sutskever, is actively working on ways to steer and regulate superintelligent AI systems.

This week, members of the Superalignment team, including Collin Burns, Pavel Izmailov, and Leopold Aschenbrenner, presented their latest work at NeurIPS, the annual machine learning conference in New Orleans. Their primary goal is to ensure that AI systems behave as intended, especially as they venture into the realm of superintelligence.

The Superalignment initiative, launched in July, is part of OpenAI’s broader effort to govern AI systems with intelligence surpassing that of humans. Collin Burns acknowledged that aligning models smarter than humans remains an open problem and a significant challenge for the research community.

Figure: An illustration of the Superalignment team’s analogy for aligning superintelligent systems.

Despite the recent leadership changes, Ilya Sutskever continues to lead the Superalignment team, raising questions given his involvement in Altman’s ouster. The Superalignment concept has sparked debates within the AI research community, with some questioning its timing and others considering it a distraction from more immediate regulatory concerns.

While Altman drew comparisons between OpenAI and the Manhattan Project, emphasizing the need to protect against catastrophic risks, skepticism remains about the imminent development of superintelligent AI systems with world-ending capabilities. Critics argue that focusing on such concerns diverts attention from pressing present-day issues like algorithmic bias and the toxicity of AI-generated content.

The Superalignment team is actively developing governance and control frameworks for potential superintelligent AI systems. Their approach uses a less capable AI model to supervise a more advanced one, as a stand-in for the eventual situation in which humans must oversee AI systems smarter than themselves.
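The core idea can be illustrated with a toy experiment. The sketch below is purely illustrative and is not OpenAI's actual code or models: a "weak supervisor" produces noisy labels (standing in for a weaker model, or for humans, overseeing a stronger system), and a "strong student" is trained only on those labels, never on ground truth. Because the supervisor's errors are unsystematic, the student can end up tracking the true rule better than the labels it was taught from.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary task: the true label is a linear rule over two features.
X = rng.normal(size=(4000, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
X_train, y_train = X[:2000], y[:2000]
X_test, y_test = X[2000:], y[2000:]

# "Weak supervisor": its labels are right only ~75% of the time,
# standing in for a weaker model (or humans) overseeing a stronger one.
flip = rng.random(2000) < 0.25
weak_labels = np.where(flip, 1.0 - y_train, y_train)

# "Strong student": a logistic-regression learner trained ONLY on the
# supervisor's noisy labels, never on ground truth.
w = np.zeros(2)
for _ in range(500):                        # plain gradient descent
    p = 1.0 / (1.0 + np.exp(-X_train @ w))  # sigmoid predictions
    w -= 0.1 * X_train.T @ (p - weak_labels) / len(X_train)

# Weak-to-strong generalization: unsystematic supervisor errors average
# out, so the student recovers the true rule better than its labels do.
weak_acc = (weak_labels == y_train).mean()
strong_acc = ((X_test @ w > 0) == (y_test > 0.5)).mean()
print(f"supervisor label accuracy: {weak_acc:.2f}")
print(f"student accuracy on held-out truth: {strong_acc:.2f}")
```

In this toy setup the student's held-out accuracy exceeds the accuracy of the labels it was trained on; the team's research question is whether anything like this holds when the student is a far more capable model and the task is open-ended.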

In a surprising move, OpenAI announced a $10 million grant program to support technical research on superintelligent alignment. The funding, including a contribution from former Google CEO Eric Schmidt, is aimed at encouraging research from academic labs, nonprofits, individual researchers, and graduate students. The move has prompted speculation about Schmidt’s commercial interests in AI.

Despite these concerns, the Superalignment team says its research, along with the work supported by the grants, will be shared publicly, in keeping with OpenAI’s stated mission of making AI models safe for the benefit of humanity. The team remains committed to what it describes as one of the most important technical challenges of our time: aligning superhuman AI systems so that they remain safe and beneficial.