UCL Discovery

Bayesian regression and discrimination with many variables.

Chang, Kai-Ming; (2001) Bayesian regression and discrimination with many variables. Doctoral thesis (Ph.D), UCL (University College London). Green open access

Full text available for download (PDF, 10MB).
This thesis attempts to provide general procedures for Bayesian regression and discriminant analysis with many variables, and to explore potential problems in the analysis. For regression analysis, a normal random regression model is assumed, i.e. the joint distribution of the response variables and the regressors is multivariate normal given their means and covariance matrix. For discriminant analysis, we consider the case where each observation comes from one of several multivariate normal populations. In classical statistics, the problem with fitting a multivariate model with more variables than observations is that the estimate of the covariance matrix of the multivariate normal distribution is singular and the distribution is degenerate. In Bayesian statistics, this problem can be avoided by using a proper prior for the covariance matrix.

We assign an inverse-Wishart distribution (which is a conjugate prior in the case of a non-hierarchical analysis) to the covariance matrix, and suppose that the prior expected covariance matrix has a simple structure, so that the number of hyperparameters required in the model is small. Hierarchical modelling of these hyperparameters is employed. Although our strong assumptions keep the model relatively simple, the posterior model is still complicated. We found adaptive rejection Metropolis sampling (ARMS) within Gibbs sampling, run with multiple chains, to be an appropriate MCMC strategy for fitting our models; convergence checking for multiple-chain MCMC is simple. Because of the ill-conditioning of the sample covariance matrix and the large number of variables, the computational problems are significant, and appropriate matrix-manipulation and rescaling techniques are required.

Two practical cases are considered as examples, one for regression and the other for discrimination. Both involve NIR spectral data with many variables; the high correlation between the variables makes the examples more challenging.
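The conjugacy argument above can be sketched as follows. With a known mean and an inverse-Wishart prior IW(ν₀, Ψ₀) on the covariance matrix Σ, the posterior is IW(ν₀ + n, Ψ₀ + S), where S is the scatter matrix; this remains proper even when n < p and S is singular. The dimensions, prior settings, and use of SciPy here are illustrative assumptions, not taken from the thesis:

```python
import numpy as np
from scipy.stats import invwishart

rng = np.random.default_rng(0)

p, n = 5, 3                    # hypothetical: more variables (p) than observations (n)
mu = np.zeros(p)               # assume the mean is known for this sketch
X = rng.multivariate_normal(mu, np.eye(p), size=n)

# Scatter matrix S has rank at most n < p, so the MLE of the covariance is singular.
S = (X - mu).T @ (X - mu)
assert np.linalg.matrix_rank(S) <= n

# Inverse-Wishart prior IW(nu0, Psi0); conjugacy gives
#   Sigma | X ~ IW(nu0 + n, Psi0 + S),
# which is a proper distribution despite the singular scatter matrix.
nu0, Psi0 = p + 2, np.eye(p)
posterior = invwishart(df=nu0 + n, scale=Psi0 + S)
Sigma_draw = posterior.rvs(random_state=rng)

# Posterior draws are full-rank (positive definite) covariance matrices.
assert np.all(np.linalg.eigvalsh(Sigma_draw) > 0)
```

This is only the non-hierarchical conjugate case; the thesis places hierarchical priors on the hyperparameters, which is what makes the posterior non-standard and motivates ARMS within Gibbs sampling.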
We consider three correlation structures for the prior expected covariance matrix: an over-simplified identity structure and two autoregressive correlation functions, which are believed to be much closer to the real situation. However, we found that the autoregressive correlation functions do not guarantee better predictions in our examples.
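An autoregressive prior correlation structure of the kind described above can be sketched with a single parameter. In this hypothetical AR(1) form (the thesis's exact correlation functions are not reproduced here), correlation decays geometrically with the distance between variables, which suits adjacent wavelengths in NIR spectra:

```python
import numpy as np

def ar1_corr(p, rho):
    """AR(1) correlation matrix with entries C[i, j] = rho ** |i - j|."""
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])

# Neighbouring variables are highly correlated, distant ones much less so.
C = ar1_corr(4, 0.9)

assert np.allclose(np.diag(C), 1.0)            # unit diagonal
assert np.all(np.linalg.eigvalsh(C) > 0)       # positive definite, so a valid correlation
```

The identity structure corresponds to `rho = 0`; a single hyperparameter `rho` then indexes the whole family, keeping the number of hyperparameters small as the abstract requires.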

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Bayesian regression and discrimination with many variables.
Open access status: An open access version is available from UCL Discovery
Language: English
Additional information: Thesis digitised by ProQuest.
URI: https://discovery.ucl.ac.uk/id/eprint/10105107
