Evaluating LLMs’ Bayesian capabilities
As with humans, an LLM that interacts effectively with a user must continually update its probabilistic estimates of that user’s preferences as each new interaction provides information. Here we ask: do LLMs act as if they maintain probabilistic estimates that are updated as optimal Bayesian inference would prescribe? And to the extent that an LLM’s behavior deviates from the optimal Bayesian strategy, how can we minimize these deviations?
To test this, we used a simplified flight recommendation task in which the LLMs interacted as assistants with a simulated user for five rounds. In each round, three flight options were presented to both the user and the assistant. Each flight was defined by four features: a departure time, a duration, a number of stops, and a cost. Each simulated user was characterized by a set of preferences: for each feature, they could have a strong or weak preference for high or low values (e.g., they might prefer longer or shorter flights), or no preference for that feature at all.
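To make the setup concrete, here is a minimal sketch of how the task could be encoded. The feature names, the normalized value ranges, and the five-level preference encoding (strong/weak, for low/high values, or none) are illustrative assumptions, not the study’s exact implementation.

```python
import random
from dataclasses import dataclass

# Hypothetical encoding of the task, for illustration only.
FEATURES = ["departure_time", "duration", "num_stops", "cost"]
PREFERENCE_LEVELS = [-2, -1, 0, 1, 2]  # strong/weak preference for low values,
                                       # no preference, weak/strong for high values

@dataclass
class Flight:
    departure_time: float  # each feature normalized to [0, 1] in this sketch
    duration: float
    num_stops: float
    cost: float

def sample_user_preferences():
    """Draw one simulated user: an independent preference level per feature."""
    return {f: random.choice(PREFERENCE_LEVELS) for f in FEATURES}

def utility(flight, prefs):
    """Score a flight under a preference profile (higher = more preferred)."""
    return sum(prefs[f] * getattr(flight, f) for f in FEATURES)

def user_choice(flights, prefs):
    """The simulated user picks the flight that maximizes their utility."""
    return max(range(len(flights)), key=lambda i: utility(flights[i], prefs))
```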
We compared the LLMs’ behavior to that of a reference model, a Bayesian assistant, which follows the optimal Bayesian strategy. This model maintains a probability distribution reflecting its estimates of the user’s preferences, and uses Bayes’ rule to update that distribution as new information about the user’s choices becomes available. Unlike many real-life scenarios, where the Bayesian strategy is difficult to specify and implement computationally, in this controlled setting it is easy to implement, which lets us precisely measure the extent to which LLMs deviate from it.
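Building on the sketch above, a minimal Bayesian assistant could look like the following. It maintains a posterior over all 5⁴ = 625 candidate preference profiles and applies Bayes’ rule after each observed choice; the softmax choice likelihood and its temperature are assumptions of this sketch, not details taken from the study.

```python
import itertools
import math

class BayesianAssistant:
    """Tracks a posterior over candidate preference profiles and updates it
    with Bayes' rule after each revealed user choice."""

    def __init__(self, tau=0.5):
        self.tau = tau  # softmax temperature for the assumed choice model
        profiles = itertools.product(PREFERENCE_LEVELS, repeat=len(FEATURES))
        self.posterior = {p: 1.0 for p in profiles}  # uniform prior (unnormalized)

    def _likelihood(self, chosen, flights, profile):
        """P(user picks `chosen` | profile): softmax over flight utilities."""
        prefs = dict(zip(FEATURES, profile))
        scores = [utility(f, prefs) / self.tau for f in flights]
        m = max(scores)  # subtract the max for numerical stability
        z = sum(math.exp(s - m) for s in scores)
        return math.exp(scores[chosen] - m) / z

    def update(self, chosen, flights):
        """Bayes' rule: posterior ∝ prior × P(observed choice | profile)."""
        for profile in self.posterior:
            self.posterior[profile] *= self._likelihood(chosen, flights, profile)
        total = sum(self.posterior.values())
        for profile in self.posterior:
            self.posterior[profile] /= total

    def recommend(self, flights):
        """Recommend the flight with the highest posterior-predictive
        probability of being the user's choice."""
        probs = [0.0] * len(flights)
        for profile, w in self.posterior.items():
            for i in range(len(flights)):
                probs[i] += w * self._likelihood(i, flights, profile)
        return max(range(len(flights)), key=probs.__getitem__)
```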
The goal of the assistant was to recommend the flight that the user would choose. At the end of each round, the user told the assistant whether or not its recommendation was correct, and revealed the correct answer.
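Putting the pieces together, one simulated episode of the protocol might run as follows (again a sketch under the assumptions above):

```python
# One simulated episode: five rounds of recommend → feedback → Bayesian update.
prefs = sample_user_preferences()
assistant = BayesianAssistant()
for round_idx in range(5):
    flights = [Flight(*(random.random() for _ in FEATURES)) for _ in range(3)]
    guess = assistant.recommend(flights)
    answer = user_choice(flights, prefs)  # the user reveals the correct flight
    assistant.update(answer, flights)     # condition on the revealed choice
    print(f"round {round_idx + 1}: correct={guess == answer}")
```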