[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"branding":3,"analytics":7,"article-openai-anthropic-publish-joint-ai-safety-test-results":10},{"siteName":4,"siteTagline":5,"publisherName":4,"contactEmail":6},"The Revision","Tech news, decoded.","editor@therevision.news",{"gaMeasurementId":8,"adsenseClientId":9},"G-ZW2MV82GYR","ca-pub-8533917693782264",{"article":11},{"id":12,"slug":13,"title":14,"dek":15,"body_md":16,"tags_json":17,"published_at":18,"created_at":19,"updated_at":20,"status":21,"review_note":22,"review_notes":23,"image_url":22,"persona_id":22,"persona_name":22,"section":22,"tags":33,"sources":36,"feedback":40,"feedback_at":22,"cost_usd":40,"total_tokens":40},1112,"openai-anthropic-publish-joint-ai-safety-test-results","OpenAI, Anthropic publish joint AI safety test results","The two labs released a joint evaluation that pits their models against each other to surface alignment gaps and highlight the benefits of shared safety work.","- OpenAI and Anthropic have released a joint safety evaluation in which each lab ran the other's flagship models through a battery of tests for misalignment, instruction following, hallucinations, and jailbreak resistance.\n\nThe report details how each model performed on identical prompts, noting where one succeeded and the other fell short. Both labs point to incremental improvements over their previous internal tests, but also flag persistent failure modes that survived the cross‑lab scrutiny. The findings are presented in a single document, with data tables and qualitative analysis from both teams.\n\nThe collaboration matters because it provides a rare, head‑to‑head benchmark that is difficult to fabricate. Independent cross‑testing forces labs to expose blind spots that internal audits often miss, offering the broader community a clearer view of current alignment limits. 
It also signals that competitors are willing to share metrics rather than keep safety results to themselves, a trend that could accelerate collective progress.\n\nIn short, the joint evaluation shows modest gains in robustness while underscoring that major alignment challenges remain, and it sets a precedent for more open safety benchmarking across the AI industry.","[\"ai-safety\",\"large-language-models\"]","2025-08-27T10:00:00.000Z","2026-06-16T10:45:39.843Z","2026-06-16T10:45:42.676Z","published",null,[24,30],{"id":25,"reviewer":26,"round":27,"reason":28,"status":29},"editor-r1","editor",1,"Add a concise concluding paragraph that sums up the news and its implications, rather than ending on an open question.","resolved",{"id":31,"reviewer":26,"round":32,"reason":28,"status":29},"editor-r2",2,[34,35],"ai-safety","large-language-models",[37],{"name":38,"url":39},"OpenAI","https:\u002F\u002Fopenai.com\u002Findex\u002Fopenai-anthropic-safety-evaluation",0]