#OpenAI

Key Findings from Anthropic × OpenAI Joint Safety Evaluation
An analysis of the joint AI safety evaluation conducted by OpenAI and Anthropic. Claude 4 performs strongly on instruction hierarchy, while o3 and o4-mini excel at jailbreak resistance. The hallucination evaluation contrasts Claude's cautious approach, which favors declining to answer when uncertain, with the more proactive stance of OpenAI's models, which attempt an answer more often.