12 Days of OpenAI

12 Days of OpenAI: Day 12 – OpenAI announced early evaluations of OpenAI o3 and o3-mini

– o3-mini is expected to be publicly available by the end of January 2025, with o3 to follow shortly after. For now, both models are limited to public safety testing by approved researchers.
– The models achieve exceptional results on benchmarks such as SWE-bench Verified, Codeforces, AIME 2024, and GPQA Diamond, and set new records on ARC-AGI.
– A new benchmark, developed through a partnership between arcprize[.]org and OpenAI, will launch in 2025 to broaden model evaluation frameworks.
– Early-access API applications are open to safety researchers until January 10, 2025, under strict terms that prohibit distillation, IP extraction, and using outputs to train other models.
– Deliberative alignment trains o-series models to reason explicitly over safety specifications, which improves response safety and policy adherence without requiring human-labeled chains of thought. On safety benchmarks, o1 surpasses GPT-4o and other leading models, resisting jailbreaks while reducing overrefusals.
