Anthropic—Updated Responsible Scaling Policy with enhanced capability thresholds and safety case methodologies
Anthropic published a major update to its Responsible Scaling Policy (RSP), adding new capability thresholds, refined evaluation processes inspired by safety case methodologies, and measures for internal governance and external input. The update established specific thresholds for ASL-3 (CBRN assistance) and ASL-4 (autonomous AI research).
Scoring Impact
| Topic | Direction | Relevance | Contribution |
|---|---|---|---|
| AI Safety | +toward | primary | +1.00 |
| Overall incident score | +0.590 | | |
Score = avg(topic contributions) × significance (medium ×1) × confidence (0.59)
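The arithmetic behind the score line can be reproduced with a minimal sketch. The function name `incident_score` and the `SIGNIFICANCE_MULTIPLIER` table are hypothetical and used only for illustration. The page states only that "medium" maps to ×1, so no other significance levels are assumed here.

```python
from statistics import mean

# Hypothetical lookup: the page only gives "medium x1"; other levels are unknown.
SIGNIFICANCE_MULTIPLIER = {"medium": 1.0}


def incident_score(topic_contributions, significance, confidence):
    """Average the topic contributions, then scale by significance and confidence."""
    return mean(topic_contributions) * SIGNIFICANCE_MULTIPLIER[significance] * confidence


# Values taken from the table above: one topic (+1.00), medium significance, confidence 0.59.
score = incident_score([1.00], "medium", 0.59)
print(f"{score:+.3f}")  # prints "+0.590"
```

With a single topic, the average is simply that topic's contribution, so the overall score equals the confidence value scaled by the significance multiplier.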
Evidence (1 signal)
Anthropic published RSP v2.0 with enhanced capability thresholds on October 15, 2024
On October 15, 2024, Anthropic published a major update to its Responsible Scaling Policy (version 2.0), adding new capability thresholds for ASL-3 (CBRN weapons assistance) and ASL-4 (autonomous AI research), refined evaluation processes inspired by safety case methodologies, and enhanced internal governance. The update clarified ambiguous definitions and extended the interval between evaluations to 6 months to avoid rushed assessments. Jared Kaplan was announced as the new Responsible Scaling Officer.