Anthropic—Updated Responsible Scaling Policy with enhanced capability thresholds and safety case methodologies

Oct 15, 2024

Anthropic published a major update to its Responsible Scaling Policy (RSP), adding new capability thresholds, evaluation processes refined along the lines of safety case methodologies, and measures for internal governance and external input. The update established specific capability thresholds for ASL-3 (CBRN weapons assistance) and ASL-4 (autonomous AI research).

Scoring Impact

Topic	Direction	Relevance	Contribution
AI Safety	+toward	primary	+1.00
Overall incident score =			+0.590

Score = avg(topic contributions) × significance (medium ×1) × confidence (0.59)
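As a worked check of the formula above, the following minimal sketch recomputes the overall incident score from the values shown in this entry. The function name `incident_score` and the `SIGNIFICANCE_MULTIPLIER` table are hypothetical names introduced for illustration; only the "medium ×1" multiplier is stated in the entry, so no other significance levels are assumed.

```python
from statistics import mean

# Hypothetical lookup table: the entry only states that "medium" maps to x1.
SIGNIFICANCE_MULTIPLIER = {"medium": 1.0}


def incident_score(contributions, significance, confidence):
    """Return avg(topic contributions) x significance multiplier x confidence."""
    return mean(contributions) * SIGNIFICANCE_MULTIPLIER[significance] * confidence


# Values taken from this entry: one topic (AI Safety, +1.00),
# medium significance, confidence 0.59.
score = incident_score([1.00], "medium", 0.59)
print(f"{score:+.3f}")  # +0.590
```

With a single topic the average is just that topic's contribution, so the score reduces to 1.00 × 1 × 0.59 = +0.590, matching the table.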

Evidence (1 signal)

Confirms Policy Change Oct 15, 2024 verified

Anthropic published RSP v2.0 with enhanced capability thresholds on October 15, 2024

On October 15, 2024, Anthropic published a major update to its Responsible Scaling Policy (version 2.0), adding new capability thresholds for ASL-3 (CBRN weapons assistance) and ASL-4 (autonomous AI research), evaluation processes refined along the lines of safety case methodologies, and enhanced internal governance. The update clarified previously ambiguous definitions and extended the interval between comprehensive evaluations to 6 months to avoid rushed assessments. Jared Kaplan was announced as the new Responsible Scaling Officer.
