OpenAI spent 6+ months on GPT-4 safety alignment, achieving an 82% reduction in disallowed content and 40% more factual responses
OpenAI spent more than six months working across the organization to make GPT-4 safer and more aligned before its public release. On OpenAI's internal evaluations, GPT-4 is 82% less likely than GPT-3.5 to respond to requests for disallowed content and 40% more likely to produce factual responses. The company also published its safety research for external review and led on risk assessment, conducting the only human-participant bio-risk trials.
Scoring Impact
| Topic | Direction | Relevance | Contribution |
|---|---|---|---|
| AI Safety | +toward | primary | +1.00 |
| Corporate Transparency | +toward | secondary | +0.50 |
| Overall incident score | | | +0.643 |
Score = avg(topic contributions) × significance (high ×1.5) × confidence (0.57)
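The scoring formula above can be sketched as a small function. This is an illustrative reconstruction, not the tracker's actual implementation; the function and variable names are assumptions, and the contribution values, significance multiplier, and confidence are taken from the table and formula above (the listed confidence 0.57 appears to be rounded, since exactly 0.57 yields 0.641 rather than the reported +0.643).

```python
def incident_score(contributions, significance, confidence):
    """Average the topic contributions, then scale by the
    significance multiplier and the confidence factor."""
    avg = sum(contributions) / len(contributions)
    return avg * significance * confidence

# AI Safety (+1.00, primary) and Corporate Transparency (+0.50, secondary)
contributions = [1.00, 0.50]

# significance: high (x1.5); confidence: 0.57 as listed in the formula
score = incident_score(contributions, significance=1.5, confidence=0.57)
print(round(score, 3))  # 0.641; the reported +0.643 implies confidence ~0.5716
```

With unrounded confidence of roughly 0.5716, the same computation reproduces the +0.643 shown in the table.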
Evidence (1 signal)
Confirms product_decision (Mar 14, 2023, documented):
OpenAI announced GPT-4 with an 82% reduction in disallowed content after 6 months of alignment work.