Jul 14, 2024
Sorry, but this essay is just wrong. It’s well understood that LLMs *do* know what is true. The issue is that they don’t know that they should respond with the truth (they are not trained to do so by default). Secondary training stages go a long way toward biasing them towards ‘true’ responses, however.