My research
My more theoretical work falls broadly within nonparametric regression. There are several Bristol statisticians interested in this research theme.
My main areas of interest are:
- Controlling local extreme values
- Regression on a graph
- Total variation penalties
- Image analysis
- Multiresolution
I also do more applied statistical work on dietary patterns and epidemiology.
My areas of interest here are:
- Principal components analysis
- Cluster analysis
- Dietary patterns and fat/lean mass
- Multilevel modelling
Publications
Northstone, K., Smith, A.D.A.C., Newby, P.K. & Emmett, P.M. Longitudinal comparisons of dietary patterns derived by cluster analysis in 7- to 13-year-old children. British Journal of Nutrition, in press. doi:10.1017/S0007114512004072
Smith, A.D.A.C., Emmett, P.M., Newby, P.K. & Northstone, K. Dietary patterns obtained through principal components analysis: the effect of input variable quantification. British Journal of Nutrition, in press. doi:10.1017/S0007114512003868
Smith, A.D.A.C., Emmett, P.M., Newby, P.K. & Northstone, K. (2011). A comparison of dietary patterns derived by cluster and principal components analysis in a UK cohort of children. European Journal of Clinical Nutrition, 65, 1102-1109.
Kovac, A. & Smith, A.D.A.C. (2011). Nonparametric regression on a graph. Journal of Computational and Graphical Statistics, 20, 432-447.
Smith, A.D.A.C. & Wand, M.P. (2008). Streamlined variance calculations for semiparametric mixed models. Statistics in Medicine, 27, 435-448.
Invited talks
L1-penalised regression on a graph.
Statistics Seminar, Université catholique de Louvain.
Abstract:
The aim of nonparametric regression is to fit simple, or smooth, estimates to noisy data. One approach is penalised regression, which directly controls the trade-off between two competing qualities of the fit: smoothness and fidelity to the data. This talk focuses on estimates from total variation denoising, which penalises L1 measures of roughness. In simple examples, estimates may be computed by the taut string algorithm and smoothing parameters selected by the multiresolution criterion.
One way to generalise the above is penalised regression on observations taken at the vertices of a graph, or network. Fidelity to the data is measured at the vertices, and roughness over the edges of the graph. A number of nonparametric regression problems contain some sort of graphical structure, and examples of scatterplot smoothing, image denoising, and mapping UK house prices will be given. There are computational challenges in implementing this penalised regression; we will discuss a new algorithm for regression on a graph.
A second way to generalise total variation denoising is to penalise the total variation of higher derivatives of the estimate, resulting in a smoother fit. Quadratic programming is a versatile tool for calculating the estimate and automatically selecting smoothing parameters.
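For concreteness, here is a minimal sketch of the graph version of the penalised criterion described above, in my own notation (not reproduced from the talk): writing $y_v$ for the observation at vertex $v$, $f_v$ for the fitted value, and $E$ for the edge set, regression on a graph by total variation denoising solves
$$\hat{f} = \arg\min_{f} \; \sum_{v \in V} (y_v - f_v)^2 \; + \; \lambda \sum_{(u,v) \in E} |f_u - f_v|,$$
where the first sum measures fidelity to the data at the vertices, the second measures L1 roughness over the edges, and the smoothing parameter $\lambda > 0$ controls the trade-off between the two.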
Penalised regression on a graph.
Statistics Seminar, University of Birmingham.
Abstract:
There are a number of statistical models that contain some sort of graphical structure.
For example: scatterplot smoothing, image analysis, disease risk mapping, and spatial and longitudinal models.
We will discuss penalised regression on observations made on a graph.
Regression on a graph requires the fitting of a function that somehow explains the observations made at the vertices.
The edges of the graph describe which observations are 'close together' or neighbours.
The fitted function, and shape of the graph, may be completely arbitrary, so it is appropriate to use nonparametric regression.
Our approach may be seen as a generalisation of total variation denoising, in which the smoothness or simplicity of the function is controlled. We can penalise total variation of higher derivatives to obtain smooth fits, and extend it to two dimensions to tackle, for example, image analysis.
There are computational challenges in implementing this penalised regression; we discuss quadratic programming and see the results of a new algorithm for regression on a graph. There will be some examples including image analysis, mapping UK house prices, and automatic smoothing parameter selection.
Closing Remarks. MINGLE 2012.
Describing childhood diet with cluster analysis.
RSS 2012.
Abstract:
Objectives
Diet is notoriously complicated to record and quantify. However, it is a vital component in the development of many chronic diseases, including cancer, cardiovascular disease and diabetes. Many studies of diet have focussed on the intake of individual nutrients, but it is becoming increasingly recognised that people eat foods rather than nutrients. Moreover, people eat foods in combination, and therefore large correlations exist between these individual foods and nutrients. Analysing patterns of diet enables us to examine diet as a whole, taking into account these correlations and similarities in foods eaten together.
Method/Models
This talk will give an introduction to dietary pattern analysis, using data collected from children in the Avon Longitudinal Study of Parents and Children (www.bristol.ac.uk/alspac) as an example. Dietary data has been collected via food frequency questionnaires and diet diaries, using over 90 food groups, so data reduction techniques such as principal components and cluster analysis are appropriate methods for describing underlying dietary patterns in the data. We will focus on cluster analysis, in particular k-means clustering - the most popular method of cluster analysis in the dietary patterns literature.
Results and Conclusions
Cluster analysis of data from food frequency questionnaires yielded 3 clusters (describing a Processed diet, a 'Healthy' or Plant-based diet and a Traditional British diet) and cluster analysis of data from diet diaries yielded 4 clusters (the extra cluster describing a Packed lunch diet).
There are a number of potential pitfalls in applying k-means cluster analysis: care must be taken when standardising the input variables, and the standard algorithm is not always reliable. The talk will explain these problems and offer appropriate solutions.
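To make the pitfalls concrete, here is a minimal sketch in Python with scikit-learn, using synthetic stand-in data rather than the actual study data (variable names are hypothetical). Standardising the inputs and using many random restarts address the two problems mentioned above.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Synthetic stand-in for dietary data: rows are children,
# columns are frequencies for ~90 food groups (hypothetical).
rng = np.random.default_rng(0)
foods = rng.poisson(lam=2.0, size=(500, 90)).astype(float)

# Standardise the input variables: otherwise food groups recorded
# on larger scales dominate the Euclidean distances used by k-means.
X = StandardScaler().fit_transform(foods)

# The standard k-means algorithm can converge to poor local optima,
# so run it from many random starts and keep the best solution.
km = KMeans(n_clusters=3, n_init=50, random_state=0)
clusters = km.fit_predict(X)  # cluster label for each child
```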
Total variation on a graph.
Recent Developments in Statistical Multiscale Methods.
Abstract:
We consider the problem of nonparametric regression, where the aim is to approximate noisy observations by simple functions. In particular we generalise total variation denoising from the common situation of regression in one dimension to the problem of regression on graphs. Here the observations lie on the vertices of a graph and, instead of covariates, there is a graphical structure which determines which observations are close to each other. Our new generalised version of TV denoising penalises the distance between the data and the fit on the vertices as well as the total variation along the edges of the graph. We show that the solution can be viewed as the generalisation of the taut string to graphs.
Generalisations of penalised regression with the Manhattan norm.
Statistics seminar, University of Bristol.
Abstract:
The simplest application of nonparametric regression is to find patterns in data, without prior information about the shape of those patterns. One of the most straightforward, but computationally challenging, techniques is penalised regression, in which the pattern detected is seen as a trade-off between simplicity and fidelity to the data. It can be advantageous to use penalty terms based on the L1 or Manhattan norm, especially if there may be discontinuities in the pattern, for example in image denoising.
There are a number of nonparametric regression problems that contain some sort of graphical structure. For example: scatterplot smoothing, image denoising, and spatial and longitudinal models. We will discuss penalised regression on observations made on a graph. Fidelity to the data is measured at the vertices, and roughness at the edges of the graph. This approach may be seen as a generalisation of total variation denoising.
There are computational challenges in implementing this penalised regression; we will see estimates calculated by a new algorithm for regression on a graph, and discuss potential extensions to this algorithm and the relative merits of quadratic programming. There will be some examples including image denoising, mapping UK house prices, and automatic smoothing parameter selection.
Dietary patterns in the ALSPAC cohort: Cluster analysis.
EUCCONET.
Abstract:
Principal components analysis (PCA) is an established method for extracting dietary patterns from food frequency questionnaires (FFQs), food recall data (e.g. 24-hour recall questionnaires) and dietary diaries. PCA condenses the many variables present in dietary data into a small number of components that describe diet. An alternative method for extracting dietary patterns is cluster analysis, which groups individuals with similar diets into non-overlapping clusters. Both PCA and cluster analysis have been used to extract and describe dietary patterns in many studies.
Several studies have used PCA to extract dietary patterns from FFQs administered to women and children in the Avon Longitudinal Study of Parents and Children (ALSPAC). However, cluster analysis has only recently been used to extract dietary patterns from the same data. This talk will present results from cluster analysis in the ALSPAC cohort, and make comparisons with results from PCA in the same children.
In FFQ data from 8279 children aged 7 years, 3 clusters were found (Processed, Plant-based and Traditional), describing dietary patterns that broadly correspond to the 3 principal components extracted, using PCA, from the same data. These clusters demonstrate associations with sociodemographic variables that are generally similar to the associations between principal components and sociodemographic variables.
This talk will also briefly present results from cluster analysis in dietary diary data, from 7473 children aged 10 years, and discuss associations with obesity-related outcomes such as fat mass. The relative merits of cluster analysis and PCA will also be considered in these applications.
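As an illustration of the PCA side of this comparison, here is a minimal sketch in Python with scikit-learn (synthetic stand-in data and hypothetical names, not the analysis code used for ALSPAC):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Synthetic stand-in for FFQ data: rows are children, columns food groups.
rng = np.random.default_rng(1)
foods = rng.poisson(lam=2.0, size=(500, 90)).astype(float)

# Standardise, then condense the many dietary variables into a
# small number of components that describe diet.
X = StandardScaler().fit_transform(foods)
pca = PCA(n_components=3)
scores = pca.fit_transform(X)   # each child's score on each component

# Food groups with large loadings characterise each dietary pattern.
loadings = pca.components_      # shape: (3, 90)
```

Unlike the non-overlapping cluster labels, each child receives a continuous score on every component, which is one of the differences weighed up when considering the relative merits of the two methods.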
Quadratic Programming and Penalised Regression.
17th EYSM.
Abstract:
Regression is an appropriate way to make inferences about the relationship between two or more variables.
If this relationship is more complicated than a simple functional form, then nonparametric regression is necessary.
Even the most straightforward case of nonparametric regression between two continuous variables is not an easy problem, and remains far from being 'solved'.
There are many alternative methods for nonparametric regression in this case.
It is a good idea to try several different methods of nonparametric regression when approaching new data, but this requires the researcher to have knowledge of, and access to, many different methods.
Penalised regression is a broad category of nonparametric regression methods.
Restricting the methods used, when approaching new data, to penalised regression offers a good compromise: penalised regression offers estimates with a range of different qualities while requiring less extensive knowledge of nonparametric regression methods.
Penalised regression applies a penalty term, which measures the roughness of the estimate, to the likelihood in order to achieve an estimate that is smoother than the noisy data.
The nature of the roughness penalty, and whether it is measured in the L1 or L2 norm, affects the quality of the estimate.
Quadratic programming is a versatile tool for calculating estimates based on penalised regression.
It can be used to produce estimates based on L1 roughness penalties (nonparametric lasso, total variation denoising), L2 roughness penalties (spline smoothing, ridge regression) and both roughness penalties (nonparametric elastic net).
In particular, quadratic programming can calculate estimates when the roughness penalty is a higher-order total variation, for instance the total variation of the first derivative.
It can also calculate estimates based on two roughness penalties, combining the qualities of both.
As with other methods of nonparametric regression, penalised regression requires the selection of one or more smoothing parameters.
A multiresolution criterion may be used to judge the suitability of smoothing parameters.
Multiresolution may be included in the quadratic program that is used to calculate the estimate in order to automatically select smoothing parameters.
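A minimal sketch of this kind of computation in Python, using the cvxpy modelling library as a stand-in for a hand-written quadratic programming solver (synthetic data; the smoothing parameter is fixed by hand here rather than selected by the multiresolution criterion):

```python
import numpy as np
import cvxpy as cp

# Synthetic noisy observations of a smooth signal.
rng = np.random.default_rng(2)
x = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(200)

f = cp.Variable(200)   # the fitted values
lam = 5.0              # smoothing parameter (fixed by hand here)

# L1 roughness penalty: total variation of f itself would give
# piecewise-constant fits; penalising the total variation of the
# first derivative (second differences) gives smoother,
# piecewise-linear fits - a higher-order total variation penalty.
tv_first_deriv = cp.sum(cp.abs(cp.diff(f, 2)))

# Penalised least squares: fidelity to the data plus roughness penalty.
problem = cp.Problem(cp.Minimize(cp.sum_squares(y - f) + lam * tv_first_deriv))
problem.solve()
fit = f.value
```

For reference, the multiresolution criterion judges a smoothing parameter by whether the residuals look like noise at every scale; roughly, in the form I know from the Davies and Kovac literature, it requires $\max_I |\sum_{i \in I} (y_i - f_i)| / \sqrt{|I|}$ to stay below a noise-level threshold over a family of intervals $I$.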
Penalised regression with the Manhattan norm.
Bristol-Bath postdoc talks, University of Bath.
Abstract:
This talk will look at two aspects of penalised nonparametric regression: regression on a graph and total variation denoising. Both use the Manhattan norm as a roughness penalty.
There are a number of statistical models that contain some sort of graphical structure. For example: scatterplot smoothing, image analysis, disease risk mapping, and spatial and longitudinal models. We will discuss penalised regression on observations made on a graph.
Regression on a graph requires the fitting of a function that somehow explains the observations made at the vertices. The edges of the graph describe which observations are 'close together' or neighbours. The fitted function, and shape of the graph, may be completely arbitrary, so it is appropriate to use nonparametric regression. We will specifically discuss penalised regression, measuring the distance from data at the vertices, and roughness at the edges of the graph.
Our approach may be seen as a generalisation of total variation denoising, in which the smoothness or simplicity of the function is controlled. We can penalise total variation of higher derivatives to obtain smooth fits, and extend it to two dimensions to tackle, for example, image analysis. There are computational challenges in implementing this penalised regression; we discuss quadratic programming and see the results of a new algorithm for regression on a graph. There will be some examples including image analysis, mapping UK house prices, and automatic smoothing parameter selection.
Nonparametric regression with the Manhattan norm.
Bristol-Oxford postdoc seminar, University of Oxford.
Abstract:
This talk will look at two aspects of penalised nonparametric regression: regression on a graph and total variation denoising. Both use the Manhattan norm as a roughness penalty.
There are a number of statistical models that contain some sort of graphical structure. For example: scatterplot smoothing, image analysis, disease risk mapping, and spatial and longitudinal models. We will discuss penalised regression on observations made on a graph.
Regression on a graph requires the fitting of a function that somehow explains the observations made at the vertices. The edges of the graph describe which observations are 'close together' or neighbours. The fitted function, and shape of the graph, may be completely arbitrary, so it is appropriate to use nonparametric regression. We will specifically discuss penalised regression, measuring the distance from data at the vertices, and roughness at the edges of the graph.
Our approach may be seen as a generalisation of total variation denoising, in which the smoothness or simplicity of the function is controlled. We can penalise total variation of higher derivatives to obtain smooth fits, and extend it to two dimensions to tackle, for example, image analysis. There are computational challenges in implementing this penalised regression; we discuss quadratic programming and see the results of a new algorithm for regression on a graph. There will be some examples including image analysis, mapping UK house prices, and automatic smoothing parameter selection.
Denoising UK House Prices.
RSS 2010.
Abstract:
Objectives
The British people are obsessed with house prices. There is considerable interest in the difference in price between different areas and in different years. This talk will attempt to show a smooth national trend in house prices, in both space and time.
Method/Models
We will look at noisy data, provided by Halifax, on UK house prices and discuss it as a particular example of regression on a graph. There are considerable challenges in the data, most notably the lack of covariate values and missing observations, which make existing regression methods fail.
Results and Conclusions
Regression on a graph is a new technique that estimates a denoised version of observations made at the vertices of a graph. It is a type of penalised regression, in which distance from data is penalised at all the vertices, and roughness at all the edges of the graph. These penalty terms present computational challenges, so we will see the result of a new, fast algorithm for regression on a graph.
Regression on a Graph.
Statistics seminar, Cardiff University.
Abstract:
There are a number of statistical models that contain some sort of graphical structure.
For example: scatterplot smoothing, image analysis, disease risk mapping, and spatial and longitudinal models.
We will discuss penalised regression on observations made on a graph.
Regression on a graph requires the fitting of a function that somehow explains the observations made at the vertices.
The edges of the graph describe which observations are 'close together' or neighbours.
The fitted function, and shape of the graph, may be completely arbitrary, so it is appropriate to use nonparametric regression.
Our approach may be seen as a generalisation of total variation denoising, in which the smoothness or simplicity of the function is controlled. We can penalise total variation of higher derivatives to obtain smooth fits, and extend it to two dimensions to tackle, for example, image analysis.
There are computational challenges in implementing this penalised regression; we discuss quadratic programming and see the results of a new algorithm for regression on a graph. There will be some examples including image analysis, mapping UK house prices, and automatic smoothing parameter selection.
Nonparametric Regression on a Graph.
RSS 2009.
Abstract:
There are a number of statistical models that contain some sort of graphical structure.
For example: scatterplot smoothing, image analysis, disease risk mapping, and spatial and longitudinal models.
We will discuss penalised regression in the context of removing noise from observations made on a graph.
Regression on a graph requires the fitting of a function that somehow explains the observations made at the vertices.
The fitted function, and shape of the graph, may be completely arbitrary.
Therefore it is appropriate to use nonparametric regression, which makes less restrictive assumptions about the function.
Our approach may be seen as a generalisation of total variation denoising, in which the smoothness or simplicity of the function is controlled.
The generalised method penalises departures from the data on the vertices, and roughness on the edges of the graph.
There are computational challenges in implementing this penalised regression.
We will see the results of a new active set algorithm for denoising on a graph, and discuss some applications including image analysis and mapping UK house prices.
Contributed talks
Describing childhood diet with cluster analysis.
YSM 2011, awarded first prize for best talk.
Abstract:
Diet is notoriously complicated to record and quantify. However, it is a vital component in the development of many chronic diseases, including cancer, cardiovascular disease and diabetes. Many studies of diet focus on the intake of individual nutrients, but it is becoming increasingly recognised that people eat foods rather than nutrients. Moreover, people eat foods in combination, and therefore large correlations exist between these individual foods and nutrients. Analysing patterns of diet enables us to examine diet as a whole, taking into account these correlations and similarities in foods eaten together.
This talk will give an introduction to dietary pattern analysis, using data collected on 7-year-old children in the ALSPAC cohort (www.bris.ac.uk/alspac) as an example. Dietary data has been collected via food frequency questionnaires, using over 90 questions, so data reduction techniques such as principal components and cluster analysis are appropriate methods for describing underlying dietary patterns in the data. We will focus on cluster analysis, in particular k-means clustering - the most popular method of cluster analysis in the dietary patterns literature - and present the dietary patterns that it describes in these children. There are a number of potential pitfalls in applying k-means cluster analysis: care must be taken when standardising the input variables, and the standard algorithm is not always reliable. The talk will explain these problems and offer appropriate solutions.
Nonparametric Regression on a Graph.
Statworks.
Abstract:
We will look at the problem of nonparametric regression in the context of removing noise from observations taken at the vertices of a graph. So rather than making inferences about distributions on the edges, we make estimates and inferences at all the vertices. There are many existing regression situations that contain a graphical structure, and we will consider examples of scatterplot smoothing, image analysis and denoising UK house price data.
Regression on a graph involves fitting an estimate that somehow explains the observations, some of which may be missing, taken at the vertices. Nonparametric regression is more appropriate since the underlying trend in the observations, and the graph itself, may be completely arbitrary. The edges of the graph provide information about the distance between observations, and in some applications this is the only such information available.
Borrowing ideas from penalised regression and total variation denoising, we penalise distance from data on the vertices, and roughness on the edges, measured in the L2 and L1 norms respectively. There are computational challenges associated with implementing these penalty terms, so we will see the results of a new, fast algorithm for denoising on a graph.
The presence of graphical structures in regression problems is often surprising, and may suggest some new insights into the connections between networks and other statistical models. The algorithm can also be adapted to identify clusters of neighbouring vertices, tackle classification problems, and identify important edges in penalised regression.
Penalised Regression on a Graph.
Sustain Sparsity Workshop.
Abstract:
Nonparametric regression means taking observations with complicated structure and fitting simpler estimates to them. In many types of regression 'simpler' means 'smoother', but in some types of penalised regression, in particular total variation denoising, 'simpler' means 'sparser'.
We will briefly consider how total variation denoising may be used to find a sparse structure in one-dimensional regression, before generalising it to perform regression on a graph. This is a new type of regression that fits an estimate to observations made at the vertices of a graph. It can be employed when there are no covariate values but there is a graphical structure that suggests which observations are near to each other. With appropriate penalty terms it may be thought of as a generalisation of the nonparametric lasso, and can be used to detect sparse structures in, among other examples, images and UK house price data.
Our generalised version of total variation denoising penalises distance from the data on the vertices, and roughness on the edges of the graph. There are computational challenges in the implementation, so we will see the results of a new, fast algorithm for regression on a graph, and discuss some examples.
Nonparametric Regression on a Graph.
NSSL 2010.
Abstract:
The 'Signal plus Noise' model for nonparametric regression can be extended to the case of observations taken at the vertices of a graph. This model includes many familiar regression problems. This talk discusses the use of the edges of a graph to measure roughness in penalized regression.
Distance between estimate and observation is measured at every vertex in the L2 norm, and roughness is penalized on every edge in the L1 norm. Thus the ideas of total-variation penalization can be extended to a graph.
This presents computational challenges, so we present a new, fast algorithm and demonstrate its use with examples, including denoising of noisy images, a graphical approach that gives an improved estimate of the baseline in spectroscopic analysis, and regression of spatial data (UK house prices).
Denoising UK House Prices.
33rd RSC, awarded first prize for best talk.
Abstract:
The British people are obsessed with house prices. There is considerable interest in the difference in price between different areas and in different years. This talk will attempt to show a smooth national trend in house prices, in both space and time.
We will look at noisy data, provided by Halifax, on UK house prices and discuss it as a particular example of regression on a graph. There are considerable challenges in the data, most notably the lack of covariate values and missing observations, which make existing regression methods fail.
Regression on a graph is a new technique that estimates a denoised version of observations made at the vertices of a graph. It is a type of penalised regression, in which distance from data is penalised at all the vertices, and roughness at all the edges of the graph. These penalty terms present computational challenges, so we will see the result of a new, fast algorithm for regression on a graph.
Denoising UK House Prices.
YSM 2010, awarded third prize for best talk.
Abstract:
The British people are obsessed with house prices. There is considerable interest in the difference in price between different areas and in different years. This talk will attempt to show a smooth national trend in house prices, in both space and time.
We will look at noisy data, provided by Halifax, on UK house prices and discuss it as a particular example of regression on a graph. There are considerable challenges in the data, most notably the lack of covariate values and missing observations, which make existing regression methods fail.
Regression on a graph is a new technique that estimates a denoised version of observations made at the vertices of a graph. It is a type of penalised regression, in which distance from data is penalised at all the vertices, and roughness at all the edges of the graph. These penalty terms present computational challenges, so we will see the result of a new, fast algorithm for regression on a graph.
Regression on a Graph.
MINGLE 2009.
Abstract:
A number of mathematical problems, including many statistical models, contain some sort of graphical structure.
Some examples are: scatterplot smoothing, spatial and longitudinal models, disease risk mapping and image analysis.
Regression on a graph means finding an estimate that somehow explains observations made at the vertices.
We will discuss penalised regression in this context.
This is computationally challenging so we will see the output of a new active set algorithm, and discuss some examples including image analysis and mapping UK house prices.
Nonparametric Regression on a Graph.
32nd RSC, awarded second prize for best talk.
Abstract:
A number of problems in penalised regression may be thought of as having a graphical structure.
These range from straightforward scatterplot smoothing, to more complicated penalties in image analysis.
Some spatial and longitudinal models also contain a graphical structure.
We will discuss nonparametric regression in the context of removing noise from observations made on a graph.
Regression on a graph requires the fitting of a function that somehow explains the observations made at the vertices.
The fitted function, and shape of the graph, may be completely arbitrary.
Therefore it is appropriate to use nonparametric regression, which makes less restrictive assumptions about the function.
We borrow ideas from total variation denoising, in which the smoothness or simplicity of the function is controlled.
The generalised method penalises departures from the data on the vertices, and roughness on the edges of the graph.
There are computational challenges in implementing this penalised regression.
We will see the results of a new active set algorithm for denoising on a graph, and discuss some applications including image analysis.
Image Analysis and Total Variation Denoising.
31st RSC.
Abstract:
Nonparametric regression refers to the fitting of functions to data, without making restrictive assumptions about their shape.
Often smoothness or simplicity conditions are imposed instead.
One such measure of simplicity is the number of local extreme values exhibited by the fitted function.
This number can be controlled by penalising the total variation of the function.
As we will see, quadratic programming is a useful tool for solving regression problems with this type of penalty term.
As well as being easy to implement, quadratic programs provide a framework to which other constraints, such as multiresolution, may be added.
The dual quadratic program provides some interesting insights and leads to the fast taut string algorithm for solving this kind of problem.
It is also easy to include total variation penalties of higher derivatives in order to obtain smooth fits.
Image analysis can be thought of as a special case of nonparametric regression in two dimensions.
The methods and theory of quadratic programming can also be extended to this problem.
However, there is a high computational cost, which is the main challenge of this type of image analysis.
Image Analysis and Total Variation Denoising.
YES-I.
Abstract:
We look at the idea of total variation penalties in one and two dimensions, and see how quadratic programming is a useful tool for regression according to these penalties.
We will discuss how multiresolution can be integrated into the quadratic programs, and see the advantages of using the dual quadratic program to improve efficiency.
Conferences attended
- Oct 2012: MINGLE 2012, Bristol.
- Sep 2012: RSS Conference 2012, Telford.
- Jul 2012: Workshop on Recent Developments in Statistical Multiscale Methods, Georg-August-Universität Göttingen.
- Oct 2011: Nutrition resources in longitudinal studies; what can we learn from each other? EUCCONET International Workshop, in Bristol.
- Sep 2011: 17th European Young Statisticians Meeting, FCT, UNL, Lisbon.
- Apr 2011: Young Statisticians' Meeting 2011, Southampton.
- Sep 2010: RSS Conference 2010, Brighton.
- Jun 2010: Statistical modelling and inference for networks, organised by SuSTaIn, in Bristol.
- Jun 2010: Sparse structures: statistical theory and practice, organised by SuSTaIn, in Bristol.
- May 2010: Conference on Nonparametric Statistics and Statistical Learning (NSSL 2010), Ohio State University.
- Apr 2010: 33rd Research Students' Conference in Probability and Statistics, University of Warwick.
- Mar 2010: Young Statisticians' Meeting 2010, University of Liverpool.
- Oct 2009: MINGLE 2009, Bristol.
- Sep 2009: RSS Conference 2009, Edinburgh.
- Mar 2009: 32nd Research Students' Conference in Probability and Statistics, Lancaster University.
- Jun 2008: Workshop on Nonparametric Inference (WNI2008), Coimbra, Portugal.
- Apr 2008: 31st Research Students' Conference in Probability and Statistics, University of Nottingham.
- Nov 2007: Modern challenges of curve modelling: inverse problems and qualitative constraints, organised by SuSTaIn, in Bristol.
- Oct 2007: Young European Statisticians Workshop (YES-I) on 'Shape Restricted Inference', EURANDOM, Eindhoven, Netherlands.