RSS Items

OpenAI has trained its LLM to confess to bad behavior

Date
Wednesday, December 03, 2025 - 10:01 PM
Description
OpenAI is testing another new way to expose the complicated processes at work inside large language models. Researchers at the company can make an LLM produce what they call a confession, in which the model explains how it carried out a task and