Evaluating the risk of Uncontrolled Artificial General Intelligence
AI Safety Clock
The IMD AI Safety Clock is a tool designed to evaluate the risks of Uncontrolled Artificial General Intelligence (UAGI) – autonomous AI systems that operate without human oversight and could potentially cause significant harm.
Our mission is to evaluate and communicate these risks to the public, policymakers, and business leaders, helping ensure the safe development and use of AI technologies.
Our latest analysis suggests we are 26 minutes from midnight—a symbolic representation of how close we are to a critical tipping point where Uncontrolled Artificial General Intelligence (UAGI) could pose significant threats. This adjustment reflects rapid advancements in agentic AI, intensifying competition in AI hardware, and the growing role of AI in military and geopolitical contexts. Coupled with breakthroughs in AI reasoning models and anticipated shifts in U.S. AI policy prioritizing deregulation, these developments underscore heightened risks. Although no major harm has yet occurred, the accelerating pace of innovation and challenges in governance demand vigilance from all stakeholders to address these escalating concerns.
As AI rapidly advances, the risks increase
The IMD AI Safety Clock takes a systematic approach to evaluating AI’s progress, based on real-time technological and regulatory changes, focusing on:
How advanced AI is in reasoning and problem-solving.
The ability of AI to function independently without human input.
AI’s capability to interact with the physical world, including infrastructure, social networks, and even weapons.
The closer we get to midnight, the higher the risk of AI becomes
“As AI advancements push us closer to midnight, effective regulations have the potential to slow down or even reverse the clock’s progress,” explains Michael Wade, TONOMUS Professor of Strategy and Digital and Director of the TONOMUS Global Center for Digital and AI Transformation.
Our methodology
We’ve built a proprietary dashboard that tracks real-time information from over 1,000 websites, 3,470 news feeds, and expert reports. This advanced tool, combined with manual desk research, provides comprehensive and up-to-date insights into AI developments across technology and regulation.
Our methodology blends quantitative metrics with qualitative insights and expert opinions, delivering a multifaceted view of AI risks. By leveraging automated data collection and continuous expert analysis, we ensure a balanced, in-depth understanding of the evolving AI landscape.
The primary goal of the IMD AI Safety Clock is to evaluate the risks posed by Uncontrolled Artificial General Intelligence (UAGI), which refers to autonomous AI systems that operate outside human control and have the potential to cause significant harm to humanity. This is the first release of an ongoing project, and we expect to regularly update our methodology and data. The IMD AI Safety Clock aims to communicate these risks clearly to the public, policymakers, and business leaders, guiding informed decisions to ensure the safe development and use of AI technologies and helping to mitigate potential threats.
There are myriad definitions of AGI, each with its nuances and scope. We define uncontrolled AGI (UAGI) as a system that not only excels in cognitive tasks but also demonstrates a high level of autonomy and adaptability. It can perform complex, unsupervised operations across a wide range of environments. This advanced AGI possesses physical execution capabilities, meaning it can interact with the physical world, significantly enhancing its generality.
We aspire to quantify AGI based on three key attributes: sophistication, autonomy, and execution capability. Sophistication refers to performance, where the system demonstrates high-level reasoning and problem-solving abilities across diverse domains. Autonomy reflects the system’s ability to operate independently, adapting to new situations without human intervention. Finally, execution refers to the ability to perform physical tasks, suggesting that a system’s generality expands as it can effectively interact with the physical world. These dimensions help gauge UAGI’s progress and potential.
The vast majority of AGI predictions are speculative opinions based solely on subjective assessments. For example, Elon Musk estimates a 20-30% chance of an existential AI catastrophe, Geoff Hinton assigns a 10% chance of human extinction within 30 years without strict regulation, Dario Amodei and Yoshua Bengio estimate 10-15% and 10%, respectively, for a civilizational catastrophe, while Ray Kurzweil predicts human-level intelligence by 2029. While many experts provide timelines and probabilities – and we systematically analyze a curated collection of these types of expert opinions – our project emphasizes a more rigorous approach. However, we do not claim to predict uncontrolled AGI. As we say, “Attempting to predict UAGI is like placing a bet on the lottery – only we’re not even sure if a ticket exists.” Our aim is to better understand the landscape rather than make precise forecasts.
The IMD AI Safety Clock assesses risk by tracking technological advancements and regulatory measures in three key areas: AI sophistication, autonomy, and execution capabilities. On the technological front, we explore topics such as developments in large language models, AI chips, agentic AI, AI weaponization, robotics, and critical infrastructure. In terms of regulation, we evaluate global and national regulations, policy guidelines, and frameworks, alongside the corporate policies of major AI developers, including safety frameworks, governance practices, and industry self-regulation. This assessment is supported by a proprietary dashboard that tracks developments across over 1,000 websites and 3,470 news feeds, providing real-time insights.
- Addressing uncertainty in forecasts: While quantitative metrics are central to the clock’s analysis, we acknowledge the inherent uncertainties in forecasting AI risks. The IMD AI Safety Clock incorporates qualitative insights and expert opinions to provide a balanced view. The goal is to communicate potential risks responsibly, not to present definitive predictions, recognizing the challenges associated with such forecasts.
- Focus on balanced communication: The IMD AI Safety Clock aims to inform and stimulate discussions rather than dictate policy. Recognizing the limitations of probabilistic forecasts, the project pairs quantitative measures with continuous qualitative monitoring and expert reviews to ensure a nuanced understanding of AI risks.
- Clarifying the role of quantification: Quantitative measures in the IMD AI Safety Clock serve as one of several tools used to communicate risk. These numbers highlight trends and trigger important discussions, not as definitive predictions. This balanced approach provides context rather than absolute certainty, addressing concerns about over-reliance on speculative probabilities.
Our methodology incorporates both data-driven insights and subjective assessments from independent experts. Extensive research shows that combining assessments from individuals with diverse, relevant expertise improves accuracy, especially when each assessor approaches the task independently. Additionally, because identifying a single assessment method that consistently outperforms others may be challenging or impossible, it is less risky in practice to combine assessments than to rely on an individual assessment method. Moreover, by integrating insights from both human experts and data, we can leverage the strengths of each while minimizing their individual limitations. By emphasizing their complementary advantages, we aim for a more balanced evaluation.
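The statistical intuition behind combining independent assessments can be illustrated with a small simulation. This is a minimal sketch and not the clock's actual methodology: the function name `simulate`, the noise model (each assessor reports the true value plus independent Gaussian noise), and all parameter values are assumptions chosen for illustration only.

```python
import random
import statistics

def simulate(true_value=0.10, noise_sd=0.05, n_assessors=7, trials=2000, seed=0):
    """Compare the error of a single assessor with the error of the pooled mean.

    Assumption (illustrative only): every assessor reports the true value
    plus independent Gaussian noise with the same standard deviation.
    """
    rng = random.Random(seed)
    single_errors, pooled_errors = [], []
    for _ in range(trials):
        # Independent, noisy estimates of the same underlying quantity.
        estimates = [rng.gauss(true_value, noise_sd) for _ in range(n_assessors)]
        single_errors.append(abs(estimates[0] - true_value))
        pooled_errors.append(abs(statistics.mean(estimates) - true_value))
    # Mean absolute error for one assessor and for the pooled estimate.
    return statistics.mean(single_errors), statistics.mean(pooled_errors)

single, pooled = simulate()
print(f"mean abs error, single assessor: {single:.4f}")
print(f"mean abs error, pooled mean:     {pooled:.4f}")
```

Under these assumptions the pooled estimate has a smaller average error than any single assessor. The benefit shrinks if assessors share the same biases, which is why independence between assessors matters.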
We examine both AI sophistication and autonomy because together they significantly increase the risk from advanced AI systems. AI sophistication reflects the system’s ability to analyze complex data and solve problems, while autonomy refers to its ability to act independently of human control. When these two factors combine, the risks increase – an advanced AI can devise complex strategies and act on them without human oversight.
For example, a highly sophisticated AI could identify vulnerabilities in financial markets, and if it has autonomy, it could exploit them without human knowledge or intervention, potentially causing widespread disruption. This combination makes it critical to assess both aspects when evaluating AI risks.
AI execution is critical because it focuses on the system’s ability to carry out real-world actions based on its decisions. While sophistication reflects how intelligent an AI is, and autonomy its ability to act independently, execution determines how effectively it can implement those decisions. Even a highly sophisticated and autonomous AI poses a limited risk if it cannot execute its plans and connect with the real world. However, when an AI excels in execution, it can cause tangible harm – whether by controlling infrastructure, manipulating markets, or disrupting systems – making it a key factor in assessing overall AI risk.
For example, consider an AI that autonomously controls essential infrastructure systems such as energy grids or transportation networks. If that AI possesses both the sophistication to devise complex strategies and the execution ability to implement them, it could cause large-scale disruptions or even severe damage, such as mass casualties.
LLMs dominate the conversation because they are the most capable general-purpose AI systems accessible today. While there are more specialized AI systems like AlphaFold for protein folding, Siri and Alexa for voice assistance, or Deep Blue and AlphaGo for game strategies, these are designed for narrow, highly specific tasks. In contrast, LLMs can perform a broad range of functions, making them more versatile and applicable to everyday uses, which is why they receive so much attention despite the existence of other AI technologies. Given the rapid pace of AI development, we continuously monitor advancements across the AI technological spectrum, and future versions of the AI clock may dive deeper into additional specialized models as well.
Our initial findings indicate that we are 29 minutes to midnight – a symbolic representation of how close humanity is to a critical point where Uncontrolled Artificial General Intelligence (UAGI) could inflict significant harm. This position suggests that while we have not yet seen significant harm, the rapid advancements in AI technologies, combined with the challenges that regulation faces in keeping pace, place us in a precarious position. Ongoing and future developments should be carefully monitored, as they have the potential to bring us closer to this critical point, requiring vigilance from all stakeholders.
Can you reflect more on what this means?
While much of the public discourse around AI emphasizes its potential to outperform humans across a range of cognitive tasks, significant advances in AI to date have not emerged from general problem-solving approaches. A notable step in this direction is the recently introduced OpenAI o1, a new large language model trained to perform general-purpose complex reasoning. However, most recent breakthroughs focus on specialized tasks requiring human guidance, such as chatbots that use natural language processing to create humanlike conversational dialogue, facial recognition, or object manipulation. While ongoing and future developments should be carefully monitored, current progress remains far from resembling human intelligence. We are, for instance, in the early – but rapidly advancing – stages of agentic AI, which refers to systems capable of autonomous planning, decision-making, and taking actions without human intervention. We also observe increasing deployment of AI in the physical world, such as through robots and autonomous weapons. Meanwhile, the integration of advanced AI into critical infrastructure is still limited, though we see notable developments indicating growing influence. Overall, our research reveals a complex picture of AI’s profound impact, highlighting both opportunities and challenges in deployment and governance.
Multi-modal AI and agentic AI are among the most significant frontier technologies in artificial intelligence. Multi-modal AI, as demonstrated by models like GPT-4o, Gemini Ultra, or Pixtral 12B, processes and integrates multiple types of input (such as text, images, and audio) to solve more complex tasks. Agentic AI, meanwhile, refers to systems capable of autonomous planning, action, and decision-making, and is advancing rapidly. While these advancements alone are remarkable, as they enhance the generality of AI systems, even more attention is needed when they combine with other technologies.
Take, for example, the fusion of AI and robotics. Multi-modal AI models are now enabling physical robots to integrate language understanding with sensor inputs like cameras, microphones, and tactile systems. For instance, a robot could receive a voice command, visually identify an object, and adjust its grip based on tactile feedback. This fusion of sensory input and natural language processing greatly enhances robots’ autonomy and adaptability, allowing them to operate more effectively in real-world environments. Another example is recent research on a series of AI models called Robot Utility Models (RUMs), which exemplify the fusion of robotics and large language models (LLMs) by enabling robots to complete tasks in unfamiliar environments without additional training, significantly improving their flexibility and real-world applicability. Overall, we see a highly active ecosystem at the intersection of robotics and AI, particularly with Generative AI (GenAI) and LLMs. In addition to the many universities, startups, and SMEs driving innovation, incumbents like NVIDIA are developing foundation models for humanoid robots and partnering with robotics companies to create advanced, human-centric robotic systems. It’s also interesting to note that China is accelerating the commercialization of humanoid robots, including deployment in sensitive environments such as power grids and nuclear power plants. These examples highlight the complex landscape of AI, its versatility in combining with other technologies, and the challenges in managing the emerging risks associated with AI’s integration into physical environments.
Although advanced AI is not yet widespread in this field, notable developments indicate its growing influence. Atomic Canyon, for example, is collaborating with Oak Ridge National Laboratory to train an AI model on nuclear data, initially designed to enhance text searches but potentially expanding into critical areas of the nuclear supply chain. Meanwhile, China is fast-tracking humanoid robot commercialization for tasks like power grid and nuclear plant maintenance. At the same time, AI and humanoid robots are rapidly evolving, with generative AI being increasingly integrated into robots to enhance their autonomy, decision-making, and operational capabilities. Such advancements, particularly when viewed in combination, require careful oversight and responsible management for safe deployment. What starts as a simple maintenance tool could become a “Trojan horse,” quietly integrating AI into critical decision-making processes.
The rapid development of AI technologies and the geopolitics of semiconductors are becoming deeply intertwined, with major implications for global power dynamics. A country’s ability to produce advanced AI weapons is increasingly tied to its semiconductor capabilities, as seen with AI-powered weapons deployed in conflict zones like Gaza and Russia’s use of AI anti-drone systems in Ukraine. India and Pakistan’s push for semiconductor production signals how AI-enabled military technology is set to shape modern warfare, especially in regions with a history of conflict. Taiwan, which produces 60% of the world’s semiconductors, is critical to both AI advancements and global supply chains. Any conflict involving Taiwan, particularly with China, could severely disrupt this production and shift the balance of power in AI-driven defense technologies. These developments underscore the need for international efforts to monitor and regulate accordingly.
The IMD AI Safety Clock addresses a global concern – the threat posed by Uncontrolled Artificial General Intelligence (UAGI). To maximise impact and raise awareness, it is crucial to reach a broad audience beyond academic and business circles. By engaging mainstream media, the project aims to drive public discourse, influence policy, and engage business leaders, ensuring the message resonates with a diverse audience.
The IMD AI Safety Clock is designed to raise awareness, not alarm. It provides a sophisticated assessment of the risks of Uncontrolled Artificial General Intelligence (UAGI) to inform and engage the public, policymakers, and business leaders in constructive discussions about AI safety. By clearly communicating these risks, the clock serves as a tool to make complex issues more understandable and actionable, guiding necessary measures to ensure the responsible development of AI.
Regulation is central to the IMD AI Safety Clock’s assessment. It evaluates global and national regulatory measures, policy guidelines, and frameworks alongside corporate policies and initiatives aimed at AI safety to identify areas where regulations successfully mitigate risks and where further efforts are required. By considering both technological progress and regulatory responses, the clock provides a balanced perspective, emphasizing the critical role of robust regulations in managing AI risks.
While we have seen promising regulatory efforts like the EU AI Act, California’s SB 1047, and the recent Framework Convention on Artificial Intelligence by the Council of Europe, regulation is only as effective as its implementation. These frameworks are important steps, but there is still a need for greater clarity and practical enforcement.
For example, while keeping a human ‘in the loop’ is a widely recommended strategy for mitigating risks from autonomous AI systems, and is legally required in some regulations, such as Article 14 of the EU AI Act, this approach is not foolproof. Human oversight can be compromised by human error, bias, or even over-reliance on the AI itself, which may exacerbate rather than reduce risks. The idea also raises practical questions such as: “Which human?” and “What kind of role does this human play in the process, and in which part of the process?” History has shown that simply having human involvement does not always prevent catastrophe. More detailed regulatory work is needed to address these challenges and ensure that regulation truly mitigates the risks from AI.
No, regulatory efforts alone are not enough. While initiatives like the EU AI Act, California’s SB 1047, and the Council of Europe’s Framework Convention on AI are crucial, all stakeholders, especially companies developing foundation models like OpenAI, Meta, and Alphabet, play an equally critical role in mitigating AI risks. That is why we closely track and analyze the safety policies and frameworks these companies put in place.
For instance, Anthropic’s Responsible Scaling Policy (RSP) directly addresses catastrophic risks by setting safety levels and operational standards, to translate broad safety concepts into practical guidelines. Similarly, other AI companies, such as OpenAI with its Preparedness Framework and Google DeepMind with its Frontier Safety Framework, have implemented comparable safety initiatives. While these efforts help establish industry standards, there remains a need for greater clarity and consistency across the sector. Ultimately, both legal frameworks and corporate practices must align to create a comprehensive safety net.
While it is true that guidelines and frameworks are not compulsory for firms developing AI, they hold significant importance because they often shape future regulatory standards. As seen in California’s SB 1047, the National Institute of Standards and Technology (NIST) guidelines are referenced directly in legislation, turning risk management best practices into legal requirements to prevent unintended consequences of AI technologies. By tracking these guidelines in our database, we anticipate their potential impact. However, we assess them differently from formal regulations, as they are precursors that can evolve into enforceable rules. Thus, they remain essential for forward-looking risk management.
This reflects historical patterns where regulation often struggles to keep pace with emerging technologies. For instance, the EU AI Act uses the cumulative number of floating point operations (FLOPs) used in training as a risk measure, presuming that general-purpose AI (GPAI) models pose ‘systemic risk’ when their training compute exceeds 10²⁵ FLOPs. However, techniques like quantization, which reduces the precision of computations to integers, allow developers to reduce compute needs without significantly sacrificing performance. More recent regulatory efforts, like California’s SB 1047, aim to address this by tracking both FLOPs and integer operations, attempting to stay aligned with technological advances. Nevertheless, the race between innovation and governance remains constant: AI capabilities evolve rapidly, while policymaking is slow, taking years to debate, enact, and function effectively. Beyond regulation, this also involves international treaties, requiring a global solution for which all stakeholders must act quickly.
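To make the compute-threshold idea concrete, the sketch below checks a rough training-compute estimate against the 10²⁵ threshold mentioned above. It is an illustration only, not how any regulator performs the assessment: the rule of thumb of about 6 operations per parameter per training token is a commonly used approximation that does not come from this article, and the function names and the two example models are hypothetical.

```python
EU_AI_ACT_THRESHOLD = 1e25  # training-compute threshold cited in the text

def estimate_training_flop(n_parameters: float, n_tokens: float) -> float:
    """Rough training-compute estimate.

    Assumption (not from the article): about 6 operations per parameter
    per training token, a widely used rule of thumb for dense models.
    """
    return 6.0 * n_parameters * n_tokens

def exceeds_threshold(n_parameters: float, n_tokens: float,
                      threshold: float = EU_AI_ACT_THRESHOLD) -> bool:
    """Return True when the estimated training compute is above the threshold."""
    return estimate_training_flop(n_parameters, n_tokens) > threshold

# Hypothetical models for illustration; these are not real systems.
examples = [
    ("model-a", 7e9, 2e12),     # 7 billion parameters, 2 trillion tokens
    ("model-b", 4e11, 1.5e13),  # 400 billion parameters, 15 trillion tokens
]
for name, params, tokens in examples:
    flop = estimate_training_flop(params, tokens)
    print(f"{name}: ~{flop:.1e} operations, above threshold: "
          f"{exceeds_threshold(params, tokens)}")
```

A count of this kind says nothing about the numeric precision of the operations, which is why quantized training or inference can change what a fixed threshold actually captures.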
The IMD AI Safety Clock draws inspiration from the Doomsday Clock to symbolize the urgency of the risks to humanity. The Doomsday Clock has been effective in raising awareness about nuclear risks, and similarly, the IMD AI Safety Clock aims to communicate the potential dangers of AI.
The next steps in our research involve expanding the scope of topics we explore and further deepening our analysis through expert collaboration. While we already draw on expert insights from diverse individuals, we plan to further enhance our understanding by consulting additional specialists in AI and governance, as well as engaging with other research institutions for additional expertise. Given that few, if any, true experts exist in the AGI risk space, an interdisciplinary, collaborative approach is essential. We are open to feedback from anyone willing to contribute, and in many ways, we’ll be learning about these challenges together. This is just the first version of the AI Safety Clock, and we plan to update it regularly or as developments necessitate.
All views expressed herein are those of the author and have been specifically developed and published in accordance with the principles of academic freedom. Accordingly, these views are not necessarily held or endorsed by TONOMUS or its affiliates.
Digital Transformation and AI programs
IMD’s rich portfolio of digital transformation and AI programs equips you with the right skills, useful frameworks, exclusive research and case studies to understand data analytics, harness the power of AI and rethink your strategy through the lens of digital.