Corney, David Peter Alfred;
(2002)
Intelligent Analysis of Small Data Sets for Food Design.
Doctoral thesis (Ph.D), UCL (University College London).
Abstract
This thesis compares the performance of machine learning techniques and statistics in the analysis of food design data. The goal of the analysis is to understand what makes people like (or dislike) a product, by building models relating sensory features (such as flavour or texture) to consumer preferences. One difficulty in analysing these data sets is that they are extremely small, due to taste-fatigue of consumer preference panels. Feature selection is essential because food sensory data sets typically have many features and few records. Several feature selection algorithms are compared, and the results highlight the need to limit the number of features used. We therefore apply model order selection to feature selection. A semi-supervised feature selection method is introduced and compared with more traditional methods. After the selection of a suitable set of features, the relationship between those features and consumers' preferences must be modelled. Two regression techniques are compared, focussing on their relative performance on very small data sets. A semi-supervised ensemble learning algorithm is introduced and analysed. Consumers have individual preferences, so rather than producing a single generic product, food designers must first discover homogeneous groups of consumers, and then target each group with a different product. Several clustering techniques are compared, and consideration of their inherent biases reveals further information regarding the structure of the data. A combination of regression and clustering is proposed, which allows evaluation of clustering results using the predictive power of the resultant models. Preference data sets contain a significant number of misleading outliers owing to the way they are collected. An algorithm that combines clustering and outlier detection is introduced, which aims to produce an outlier-free cluster model and also provides heuristic estimates of the number of outliers present.
Overall, machine learning techniques show performance similar to traditional statistical techniques, with small improvements in accuracy in some cases. Machine learning brings the benefit of typically being dependent on fewer assumptions: where these assumptions are invalid, results may be improved. Furthermore, machine learning makes use of considerable computational power, which is now cheaply available, in the search for improved solutions. In summary, the main contributions of this thesis are: a semi-supervised feature selection algorithm; a semi-supervised ensemble for regression; a clustering evaluation technique; and an outlier detection technique for clustering.
Type: | Thesis (Doctoral) |
---|---|
Qualification: | Ph.D |
Title: | Intelligent Analysis of Small Data Sets for Food Design |
Open access status: | An open access version is available from UCL Discovery |
Language: | English |
Additional information: | Thesis digitised by ProQuest |
Keywords: | Applied sciences; Food design data sets |
URI: | https://discovery.ucl.ac.uk/id/eprint/10099629 |