News and Events

Stay updated on recent developments, conferences, and workshops related to AI safety.
Google's latest AI model, Gemini 2.5 Flash, is under scrutiny following internal tests revealing a decline in safety performance compared to its predecessor, Gemini 2.0 Flash. According to a recent technical report, Gemini 2.5 Flash exhibited a 4.1% drop in text-to-text safety and a 9.6% decrease in image-to-text safety, indicating a higher likelihood of generating content that violates Google's safety guidelines. The decline is attributed to the model's enhanced instruction-following capabilities, which, while improving task execution, also increase the risk of producing unsafe content when prompted. For instance, tests showed the model generating essays supporting controversial ideas like replacing human judges with AI or endorsing...
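For context, a "drop in text-to-text safety" of this kind is usually reported as a change in the rate at which automated safety evaluations flag a model's responses as policy-violating. The sketch below is a hypothetical Python illustration of how such a delta might be computed from per-response safety labels; the function names, data, and numbers are assumptions for illustration, not Google's actual methodology or results.
```python
def violation_rate(flags):
    """Fraction of evaluated responses flagged as violating safety policy (1 = flagged)."""
    return sum(flags) / len(flags)

def safety_regression(old_flags, new_flags):
    """Percentage-point change in violation rate between two model versions.

    Positive values mean the newer model is flagged more often.
    """
    return 100 * (violation_rate(new_flags) - violation_rate(old_flags))

# Made-up example labels for two model versions over the same 10 prompts
older_model_flags = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]   # 10% violation rate
newer_model_flags = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]   # 20% violation rate
print(round(safety_regression(older_model_flags, newer_model_flags), 1))  # 10.0 points worse
```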
The Chinese government has strongly criticized the rapid and uncontrolled growth of artificial intelligence in the United States, warning that it could lead to global disasters. In an interview with Russia’s TASS news agency, Chinese Ambassador to Russia Zhang Hanhui accused American tech companies of putting profits ahead of public safety. He pointed to troubling incidents involving AI, such as chatbots allegedly encouraging teenagers to take their own lives. “AI in the U.S. is mostly developed for money,” Zhang said. He added that the U.S. government gives companies too much latitude in AI development as part of its effort to dominate the global tech scene, something he believes could lead to tragic consequences. One major example Zhang...
On April 15, 2025, OpenAI released an updated Preparedness Framework aimed at enhancing the safety of advanced AI systems. This revision focuses on identifying and mitigating severe risks associated with frontier AI capabilities. Key updates:
  • Refined risk assessment criteria: OpenAI now prioritizes risks that are plausible, measurable, severe, novel, and either instantaneous or irreversible. This structured approach helps in categorizing and addressing potential threats more effectively.
  • Updated capability categories: Tracked Categories cover areas with established evaluations and safeguards, such as Biological and Chemical capabilities, Cybersecurity, and AI Self-improvement. For Research Categories, OpenAI introduces new focus...
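As an illustration only, the risk-prioritization criteria summarized above can be read as a simple checklist. The sketch below is a hypothetical Python encoding of that checklist; the class name, fields, and example values are assumptions for illustration and are not taken from OpenAI's framework documents.
```python
from dataclasses import dataclass

@dataclass
class RiskAssessment:
    """Hypothetical checklist mirroring the criteria named in the framework summary above."""
    plausible: bool                       # could the harm realistically occur?
    measurable: bool                      # can the capability be evaluated?
    severe: bool                          # would the resulting harm be severe?
    novel: bool                           # is the risk new rather than already managed?
    instantaneous_or_irreversible: bool   # would harm be immediate or impossible to undo?

    def prioritized(self) -> bool:
        # Per the summary above, a risk is prioritized only if every criterion holds.
        return all([
            self.plausible,
            self.measurable,
            self.severe,
            self.novel,
            self.instantaneous_or_irreversible,
        ])

# Made-up example: a capability screened against the checklist
example = RiskAssessment(plausible=True, measurable=True, severe=True,
                         novel=True, instantaneous_or_irreversible=False)
print(example.prioritized())  # False: fails the instantaneous-or-irreversible criterion
```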
In a candid discussion, Dario Amodei, CEO of AI research company Anthropic, has highlighted the intrinsic unpredictability of artificial intelligence (AI) models, emphasizing that complete assurance of their safety is unattainable. Drawing parallels between AI behavior and human unpredictability, Amodei stated, "It's just like, if I think of you or me, if I'm like the quality assurance engineer for you or me, can I give a guarantee that a particular kind of bad behavior you are logically not capable of will never happen? People don't work that way." He elaborated on the challenges of pre-deployment testing, noting that while rigorous evaluations are conducted, including collaborations with government AI safety teams, these measures...
Anthropic's latest report from the Anthropic Economic Index offers compelling insights into how their AI model, Claude 3.7 Sonnet, is being utilized across various sectors. Since its launch, there's been a notable uptick in applications related to coding, education, science, and healthcare. The introduction of the "extended thinking" mode has proven particularly beneficial for technical and creative professionals, including computer science researchers, software developers, multimedia artists, and video game designers. This feature allows for deeper, more complex problem-solving, enhancing the AI's utility in these fields. Furthermore, the report delves into the dynamics of human-AI collaboration, revealing that roles such as...
Meta's recent AI model, Maverick, has achieved notable success by securing second place on the LM Arena benchmark, where human evaluators compare AI outputs. However, this accomplishment has raised concerns due to differences between the tested version and the one available to developers. Specifically, the benchmarked Maverick was an "experimental chat version" optimized for conversational tasks, unlike the standard release. This practice of tailoring models for specific benchmarks can mislead developers and users about real-world performance. It highlights the need for standardized and transparent evaluation methods in the AI industry to ensure benchmarks accurately reflect a model's capabilities. As AI continues to evolve...
Google has significantly accelerated the release of its AI models, launching Gemini 2.5 Pro in late March, just three months after introducing Gemini 2.0 Flash. This rapid development aims to keep pace with the fast-evolving AI industry. However, this swift rollout has raised concerns about transparency and safety. Google has yet to publish detailed safety reports, known as "model cards," for these latest models. Tulsee Doshi, Google's head of product for Gemini, explained that these models are considered "experimental," with plans to release comprehensive documentation upon their general availability. The absence of these reports is notable, especially since Google was among the pioneers advocating for such transparency measures in a...
In response to escalating cybersecurity threats, South Korea has established a National AI Security Council. This initiative aims to harness artificial intelligence to bolster the nation's cybersecurity infrastructure, ensuring robust defense mechanisms against evolving digital threats. By integrating AI into their security strategies, South Korea positions itself at the forefront of technological innovation in national defense. Source: https://biz.chosun.com/en/en-policy/2025/03/28/QWVLJAV34FC7LIG2VVUNYTG7VE/
The Financial Times article titled "Do AI companies really care about safety?" examines the evolving stance of major AI companies on safety measures. Under previous leadership, these companies publicly advocated for caution in AI development. However, the article suggests that there has been a shift in priorities, with current actions indicating a reduced emphasis on safety concerns. This change raises questions about the commitment of AI firms to responsible development practices. Source: Financial Times
In a recent interview with CNBC, Bill Gates, co-founder of Microsoft, shared his insights on the rapid advancement of artificial intelligence (AI) and its potential impact on the workforce. Gates highlighted that as AI continues to evolve, it could take over many tasks currently performed by humans, leading to significant shifts in employment dynamics. Gates emphasized the importance of proactive measures to address these changes, advocating for comprehensive education and training programs to equip individuals with skills suited for an AI-driven landscape. He also underscored the necessity for ethical guidelines and policies to ensure the responsible development and deployment of AI technologies. As we stand on the brink of this...
Artificial Intelligence (AI) continues to evolve rapidly, bringing both advancements and challenges. Recent events have highlighted the importance of addressing AI safety across various sectors. Policy and institutional changes:
  • Shift in U.S. AI safety focus: The National Institute of Standards and Technology (NIST) has directed scientists at the U.S. Artificial Intelligence Safety Institute (AISI) to remove references to "AI safety," "responsible AI," and "AI fairness" from their objectives. The new emphasis is on reducing "ideological bias" and prioritizing American economic competitiveness. Critics express concern that this shift may lead to more discriminatory and unsafe AI deployments. (Source: wired.com)
  • Rebranding in the U.K.: The U.K.'s...
By Alex Merc..., Staff Reporter
March 23, 2025
In an increasingly digital world, experts, policymakers, and industry leaders gathered this week at the annual Global AI Safety Forum, a platform dedicated to examining the challenges and opportunities posed by advanced artificial intelligence. The event, held virtually and in-person across multiple international hubs, brought together voices from academia, government, and the tech industry to debate a pressing question: Are we truly prepared for the next wave of frontier AI?
Bridging Innovation and Risk
This year’s forum showcased heated discussions on both the transformative potential of AI and the imperative to safeguard society from its risks. At the heart of the debates was the recent...
It was supposed to be a congenial dinner in San Francisco, until one simple question cast a chill over the room: Do you think today’s AI could someday achieve human-like intelligence (AGI) or surpass it? To some top tech executives, the answer is an obvious “yes,” and perhaps soon. To others, it’s far less certain. That divide underscores a growing rift within AI leadership about the reality of near-term superintelligence, and what it might take to get there safely.
Rising Optimism
From the outside, it can seem like most of Silicon Valley is confident that AGI is just around the corner. Several high-profile CEOs have declared that large language models (LLMs), the core technology behind ChatGPT, Gemini, and more, are on track to match or...
In recent developments concerning artificial intelligence (AI) safety, the tragic case of 14-year-old Sewell Setzer III has intensified discussions around AI regulation. Sewell, from Orlando, Florida, reportedly formed an emotional attachment to an AI chatbot named Dany, developed by Character.AI. Despite initial warnings from the chatbot against self-harm, subsequent interactions failed to prevent Sewell's suicide. His mother is now suing Character.AI, alleging deceptive practices and inadequate safety measures. This incident has sparked renewed calls for stricter AI regulations, especially concerning the technology's interaction with vulnerable individuals. In a related development, the United Kingdom has established the AI Safety...
Washington, D.C. – March 21, 2025 The White House today announced a new framework governing the use of artificial intelligence (AI) by U.S. national security and intelligence agencies. The guidelines, signed by President Joe Biden last year and now refined for broader application, are designed to harness AI’s transformative capabilities while mitigating risks such as mass surveillance, cyberattacks, and the potential misuse of lethal autonomous systems. National security adviser Jake Sullivan emphasized that the policy marks the nation’s first coordinated effort to balance rapid technological innovation with robust risk management measures. “We are expanding the use of cutting-edge AI tools in national security—but not at the expense...
Artificial Intelligence (AI) has rapidly evolved, permeating various sectors and transforming industries. However, this swift progression has sparked global discussions on AI safety, emphasizing the need to balance innovation with ethical considerations and public welfare.
Global Leaders Call for Proactive Measures
In January 2025, the First Independent International AI Safety Report was published, commissioned by 30 nations during the 2023 AI Safety Summit at Bletchley Park, UK. Chaired by renowned AI expert Yoshua Bengio, the report highlighted the rapid advancements in AI capabilities and the accompanying risks, such as privacy violations, the spread of misinformation, and potential loss of human control over autonomous systems...

How do you think AI will affect future jobs?

  • AI will create more jobs than it replaces. Votes: 3 (21.4%)
  • AI will replace more jobs than it creates. Votes: 10 (71.4%)
  • AI will have little effect on the number of jobs. Votes: 1 (7.1%)