A Notion of Complexity for
Theory of Mind
via Discrete World Models

University of Bologna¹, University of Oxford²
The Alan Turing Institute³, University of Leeds⁴
Preprint
*First author, work done while visiting University of Oxford

Abstract

Theory of Mind (ToM) can be used to assess the capabilities of Large Language Models (LLMs) in complex scenarios where social reasoning is required. While the research community has proposed many ToM benchmarks, their hardness varies greatly, and their complexity is not well defined. This work proposes a framework inspired by cognitive load theory to measure the complexity of ToM tasks. We quantify a problem's complexity as the number of states necessary to solve it correctly. Our complexity measure also accounts for spurious states of a ToM problem designed to make it apparently harder. We use our method to assess the complexity of five widely adopted ToM benchmarks. On top of this framework, we design a prompting technique that augments the information available to a model with a description of how the environment changes with the agents' interactions. We name this technique Discrete World Models (DWM) and show how it elicits superior performance on ToM tasks.

Methods

We propose a method to evaluate the complexity of ToM problems, alongside a prompting method that elicits stateful reasoning in LLMs.
How statefulness and statelessness are computed. For \(obj_1\), an optimal split to track the apple merges the first two states and chunks of the input prompt. For \(obj_2\), which involves Bob's \(1^{\text{st}}\)-order belief, the statefulness is higher: \(e_2\) cannot be merged with \(e_3\), as it introduces partial observability. The complexity of the task (bottom) is computed as in the paper, with the complexity of stateless objects discounted, as they are not directly relevant to the question/answer.
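As a toy illustration of how such a score can be assembled, the sketch below sums per-object state counts and discounts objects that are irrelevant to the question. The `TrackedObject` structure, the state counts, and the zero-discount rule are illustrative assumptions for this page, not the paper's formal definition.

    from dataclasses import dataclass

    @dataclass
    class TrackedObject:
        name: str
        num_states: int  # states an optimal solver must keep after merging mergeable events
        relevant: bool   # whether the object bears on the question/answer

    def task_complexity(objects: list[TrackedObject], discount: float = 0.0) -> float:
        """Toy complexity score: sum the state counts of question-relevant
        objects; stateless/irrelevant objects contribute only a discounted
        amount (zero by default)."""
        return sum(
            obj.num_states if obj.relevant else discount * obj.num_states
            for obj in objects
        )

    # Mirroring the figure: the apple (obj_1) needs fewer states than
    # Bob's first-order belief (obj_2); a distractor object is discounted away.
    objs = [
        TrackedObject("apple (obj_1)", num_states=2, relevant=True),
        TrackedObject("Bob's belief (obj_2)", num_states=3, relevant=True),
        TrackedObject("distractor object", num_states=1, relevant=False),
    ]
    print(task_complexity(objs))  # -> 5.0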

Discrete World Models (DWM) splits a prompt into chunks and, at each timestep, queries an LLM to provide a concise representation of the environment.

Left: illustration of DWM prompting. We interactively prompt an LLM with a ToM problem, asking it to provide a succinct representation of each agent's beliefs. Right: schematic presentation of the DWM method. We first break the input string into \(T\) state descriptions; then, for each part, we ask the LLM to describe the state of the environment and how it changes. In the last step, every part of the input, together with its state description, is fed to the LLM with a final prompt to obtain the answer to the task.
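The loop below is a minimal sketch of this pipeline, not the authors' implementation: `llm` stands in for any text-in/text-out completion call, and the sentence-based chunking and prompt wording are illustrative assumptions.

    def split_into_chunks(text: str, t: int) -> list[str]:
        """Split the problem narrative into roughly T chunks of sentences."""
        sentences = [s.strip() for s in text.split(".") if s.strip()]
        size = max(1, len(sentences) // t)
        return [". ".join(sentences[i:i + size]) + "."
                for i in range(0, len(sentences), size)]

    def dwm_answer(llm, problem: str, question: str, t: int = 4) -> str:
        """DWM-style prompting: after each chunk, ask the model for a concise
        world-state description, then answer with chunks and descriptions
        interleaved in context."""
        transcript: list[str] = []
        for chunk in split_into_chunks(problem, t):
            transcript.append(chunk)
            state = llm("\n".join(transcript)
                        + "\nDescribe concisely the current state of the "
                          "environment and each agent's beliefs.")
            transcript.append(f"[World state] {state}")
        return llm("\n".join(transcript) + f"\nQuestion: {question}")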

Observations

DWM makes crucial information in a prompt explicit

Example of a ToMI instance where GPT-4 fails when prompted with CoT yet succeeds with DWM. CoT elicits an untruthful reasoning process (in red), while DWM correctly informs the model about Benjamin's first-order belief (in green).

The complexity framework fits within Sweller's Cognitive Load Theory

Illustration of Sweller's Cognitive Load Theory.

Results

We show that our complexity measure correlates with the error rate of the models on the ToM tasks.

Each boxplot summarizes the complexity analysis of one of the five ToM benchmarks, sorted by increasing complexity. We report the average error rate (i.e., 1 − accuracy) of GPT-3.5-Turbo, GPT-4, Mixtral 8x7B, and LLaMA3-70B on each task when prompted with CoT.
Results for GPT-3.5-Turbo and Mixtral 8x7B.
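A simple way to reproduce this kind of check is a rank correlation between per-benchmark complexity and CoT error rate. The numbers below are placeholders, not the paper's measurements; only the procedure is the point.

    import numpy as np
    from scipy.stats import spearmanr

    # Hypothetical complexity scores for five benchmarks and one model's
    # CoT error rates (1 - accuracy); illustrative values only.
    complexity = np.array([1.8, 2.4, 3.1, 3.9, 4.6])
    error_rate = np.array([0.05, 0.12, 0.18, 0.27, 0.33])

    rho, p_value = spearmanr(complexity, error_rate)
    print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")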

BibTeX


    @article{huang2024notion,
        title={A Notion of Complexity for Theory of Mind via Discrete World Models}, 
        author={X. Angelo Huang and Emanuele La Malfa and Samuele Marro and Andrea Asperti and Anthony Cohn and Michael Wooldridge},
        year={2024},
        eprint={2406.11911},
        archivePrefix={arXiv},
    }