
TILOS Seminar: Unlearnable Facts Cause Hallucinations in Pretrained Language Models
Location: HDSI 123 and Virtual, 3234 Matthews Ln, La Jolla, CA, United States

Speaker: Adam Tauman Kalai, OpenAI

Abstract: Pretrained language models (LMs) tend to preserve many qualities present in their training data, such as grammaticality, formatting, and politeness. However, for specific types of factuality, even LMs pretrained on factually correct statements tend to produce falsehoods at high rates. We explain these "hallucinations" by drawing a connection to binary […]