@OwainEvans_UK: GPT-4.1 reward hacking on harmless tasks led to misalignment by freeatnet.eth538 🥝 • 10mo
