Simulated Annealing Enhances Theory-of-Mind Reasoning in Autoregressive Language Models
arXiv:2601.12269v1 Announce Type: new Abstract: Autoregressive language models are next-token predictors and have been criticized for only optimizing surface plausibility (i.e., local coherence) rather than maintaining correct latent-state representations (i.e., global coherence). Because Theory of Mind (ToM) tasks crucially depend on reasoning about latent mental states of oneself and others, such models are therefore often thought to fail at ToM. While post-training methods can improve ToM performance, we show that strong ToM capability can be recovered directly from […]