David Leslie's research page

http://rsif.royalsocietypublishing.org/content/10/82/20130069.short
Context-dependent decision-making: a simple Bayesian model. Kevin Lloyd and David S. Leslie. Journal of the Royal Society Interface, 10 (2013), 20130069.

Article on publisher website.

A model of individual learning in which the learner allocates observations to clusters in an online manner. Combines Dirichlet process clustering with online inference using single-sample particle filters and Thompson sampling for action selection. The model exhibits plausibly realistic behaviour in serial reversal learning tasks, as well as spontaneous recovery, over-training reversal effects and partial reinforcement extinction effects.

Asynchronous stochastic approximation with differential inclusions. Steven Perkins and David S. Leslie. Stochastic Systems, 2 (2012), 409-446.

Article on publisher website.

Uses the differential inclusions framework of stochastic approximation to consider the asynchronous updates problem, along with two-timescales techniques. The technique is demonstrated with an actor-critic method of learning in Markov decision processes.

Respondent driven sampling and community structure in a population of injecting drug users, Bristol, UK. H.L. Mills, C. Colijn, P. Vickerman, D. Leslie, V. Hope and M. Hickman. Drug and Alcohol Dependence, 126 (2012), 324-332.

Article on publisher website.

An investigation of statistical properties of estimators when data are gathered using respondent-driven sampling.

Optimistic Bayesian sampling in contextual-bandit problems. Benedict C. May, Nathan Korda, Anthony Lee and David S. Leslie. Journal of Machine Learning Research, 13 (2012), 2069-2106.

Article on publisher website.

Proves consistency of Thompson Sampling in contextual-bandit problems with consistent regression, and introduces a modification (optimistic Bayesian sampling, OBS) which is also provably consistent but outperforms Thompson sampling in empirical experiments.
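
As a rough illustration of the two sampling rules named above, the following sketch runs them on a plain Bernoulli bandit with Beta posteriors. It is a simplification under stated assumptions, not the paper's method: it has no contexts and no regression, the function names `choose_arm` and `run_bandit` are hypothetical, and the optimistic variant is assumed to score each arm by the larger of its posterior draw and its posterior mean.

```python
import random


def choose_arm(successes, failures, rng, optimistic=False):
    """Pick an arm by sampling from each arm's Beta posterior.

    With optimistic=True, each draw is replaced by max(draw, posterior mean),
    which is the assumed form of the optimistic variant in this sketch.
    """
    scores = []
    for s, f in zip(successes, failures):
        a, b = 1 + s, 1 + f              # Beta(1, 1) prior plus observed counts
        draw = rng.betavariate(a, b)     # Thompson sample of the success rate
        if optimistic:
            draw = max(draw, a / (a + b))
        scores.append(draw)
    # Play the arm with the highest sampled (or optimistic) score.
    return max(range(len(scores)), key=scores.__getitem__)


def run_bandit(true_probs, steps, seed=0, optimistic=False):
    """Simulate the bandit and return how often each arm was played."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes, failures, pulls = [0] * k, [0] * k, [0] * k
    for _ in range(steps):
        arm = choose_arm(successes, failures, rng, optimistic)
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print("Thompson sampling:", run_bandit([0.2, 0.8], 2000))
    print("Optimistic variant:", run_bandit([0.2, 0.8], 2000, optimistic=True))
```

In both runs the play counts should concentrate on the arm with the higher success probability as the posteriors sharpen.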

Information theory and observational limitations in decision making. David Wolpert and David S. Leslie. The B.E. Journal of Theoretical Economics, 12(1) (2012), Article 5.

Article on publisher website.

Models the information available to a decision-maker as having passed through an information channel, and examines the effect of this on observable decision-making. doi:10.1515/1935-1704.1749

A unifying framework for iterative approximate best response algorithms for distributed constraint optimisation problems. Archie C. Chapman, Alex C. Rogers, Nicholas R. Jennings and David S. Leslie. The Knowledge Engineering Review, 26 (2011), 411-444.

Article on publisher website. Local copy (copyright Cambridge University Press, 2011)

Casts DCOPs as potential games, and therefore considers a large number of DCOP algorithms under one framework to allow useful comparison.

Dynamic opponent modelling in fictitious play. Michalis Smyrnakis and David S. Leslie. The Computer Journal, 53 (2010), 1344-1359.

Download from publisher's website.

Uses particle filters to track and predict opponent strategy in fictitious play. doi:10.1093/comjnl/bxq006

Posterior weighted reinforcement learning with state uncertainty. Tobias Larsen, David S. Leslie, E. J. Collins and Rafal Bogacz. Neural Computation, 22 (2010), 1149-1179.

Download from publisher's website.

Reinforcement learning of state values when ambiguity exists over which state a reward relates to.

Nonparametric estimation of the distribution function in contingent valuation models. David S. Leslie, Robert Kohn, and Denzil G. Fiebig. Bayesian Analysis 4 (2009), 573-598.

Download preprint, published version

Places a Dirichlet process prior on the latent variable distribution of a binary regression model, so that the data determine the noise structure.

Generalised linear mixed model analysis via sequential Monte Carlo sampling. Yanan Fan, David S. Leslie and Matt P. Wand. Electronic Journal of Statistics 2 (2008), 916-938.

Download preprint, published version

Uses a sequential Monte Carlo sampler to analyse generalised linear mixed models.

On similarities between inference in game theory and machine learning. Iead Rezek, David S. Leslie, Steven Reece, Stephen J. Roberts, Alex C. Rogers, Rajdeep K. Dash and Nicholas R. Jennings. Journal of Artificial Intelligence Research 33 (2008), 259-283.

Download preprint. Official version on external website.

Introduces Bayesian decision making to fictitious play, giving "moderated fictitious play", and uses derivative action fictitious play to inform a variational learning procedure. Furthermore, discusses these two areas (learning in games and variational learning) in a common language.

A general approach to heteroscedastic linear regression. David S. Leslie, Robert Kohn and David J. Nott. Statistics and Computing 17 (2007), 131-146.

Download preprint. The original publication is available at www.springerlink.com.

Uses a Dirichlet process prior on the noise in a heteroscedastic linear regression, resulting in a very general regression model.

Generalised weakened fictitious play. David S. Leslie and E. J. Collins. Games and Economic Behavior 56 (2006), 285-298.

Download preprint. Official version on external website.

Studies a large class of learning algorithms based on fictitious play, using a unified convergence proof based on stochastic approximation.
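
For readers unfamiliar with the base algorithm, here is a minimal sketch of classical fictitious play in a two-player game. It shows only the textbook procedure (each player best-responds to the opponent's empirical action frequencies), not the generalised weakened version or the convergence proof from the paper. The function names, the initial pseudo-count of one per action, and the first-index tie-breaking rule are assumptions made for the example.

```python
def best_response(payoff_rows, opponent_counts):
    """Return the action with the highest payoff against the opponent's
    empirical action counts (ties go to the lowest index)."""
    expected = [
        sum(p * c for p, c in zip(row, opponent_counts))
        for row in payoff_rows
    ]
    return max(range(len(expected)), key=expected.__getitem__)


def fictitious_play(payoff_a, payoff_b, steps):
    """Run classical fictitious play and return both empirical strategies.

    payoff_a[i][j]: payoff to player A when A plays i and B plays j.
    payoff_b[j][i]: payoff to player B when B plays j and A plays i.
    """
    counts_a = [1] * len(payoff_a)  # pseudo-counts of A's past actions
    counts_b = [1] * len(payoff_b)  # pseudo-counts of B's past actions
    for _ in range(steps):
        # Both players respond simultaneously to the history so far.
        a = best_response(payoff_a, counts_b)
        b = best_response(payoff_b, counts_a)
        counts_a[a] += 1
        counts_b[b] += 1
    total_a, total_b = sum(counts_a), sum(counts_b)
    return [c / total_a for c in counts_a], [c / total_b for c in counts_b]


if __name__ == "__main__":
    # Matching pennies: A wants to match, B wants to mismatch.
    pennies_a = [[1, -1], [-1, 1]]
    pennies_b = [[-1, 1], [1, -1]]
    freq_a, freq_b = fictitious_play(pennies_a, pennies_b, 10000)
    print(freq_a, freq_b)
```

In matching pennies the realised actions cycle, but the empirical frequencies drift towards the mixed equilibrium in which each player plays each action half the time.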

Individual Q-learning in normal form games. David S. Leslie and E. J. Collins. SIAM Journal on Control and Optimization 44 (2005), 495-514.

Download the published version.

This paper studies a simple temporal difference learning algorithm in repeated normal form games, using the results on multiple-timescales stochastic approximation from the 2003 Annals of Applied Probability paper listed below.

Convergent multiple-timescales reinforcement learning algorithms in normal form games. David S. Leslie and E. J. Collins. Annals of Applied Probability 13 (2003), 1231-1251.

Download the published version.

This paper proves a result on multiple-timescales stochastic approximation, and uses it to investigate multi-agent actor-critic reinforcement learning.

Reinforcement learning in games. Ph.D. Thesis. Supervisor: E. J. Collins.

Download PDF.

I was supported by CASE research studentship 00317214 from the UK Engineering and Physical Sciences Research Council in cooperation with BAE SYSTEMS.

As well as containing the foundations of the previous three papers, the thesis considers the stochastic approximation of an automata-based learning procedure, and some contraction properties of smooth best response operators.

Population-level reinforcement learning resulting in smooth best response dynamics. David S. Leslie and E. J. Collins. Technical report 02:13, Department of Statistics, University of Bristol.

This paper studies an evolutionary procedure that is closely related to the smooth best response dynamics.